license: apache-2.0
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
4-step Text-to-video Generation
With the style of low-poly game art, A majestic, white horse gallops gracefully across a moonlit beach. | medium shot of Christine, a beautiful 25-year-old brunette resembling Selena Gomez, anxiously looking up as she walks down a New York street, cinematic style | a cartoon pig playing his guitar, Andrew Warhol style |
a dog wearing vr goggles on a boat | Pikachu snowboarding | a girl floating underwater |
Model description
This repository contains unet_lora.pt, which turns VideoCrafter2 into our T2V-Turbo (VC2). T2V-Turbo (VC2) achieves both fast and high-quality T2V generation. On VBench, its 4-step generation even outperforms proprietary systems, including Gen-2 and Pika. Please refer to our GitHub repo for detailed instructions.
Usage
This checkpoint is obtained by merging the UNet LoRA weights into the UNet of VideoCrafter2. Therefore, the checkpoint here is also under the apache-2.0 license.
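As a rough illustration of what merging LoRA weights into a base layer means, the sketch below applies the standard LoRA update W' = W + scale * (up @ down) to a toy matrix. It uses plain Python lists so it runs without PyTorch. The names merge_lora, lora_down, lora_up, and scale are hypothetical and are not taken from the T2V-Turbo codebase, whose actual merging procedure may differ.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) without modifying the inputs.

    weight:    (out_dim x in_dim) base matrix
    lora_down: (rank x in_dim) low-rank factor
    lora_up:   (out_dim x rank) low-rank factor
    All names here are illustrative assumptions, not the repo's API.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]   # rank x in_dim  = 1 x 2
up = [[0.5], [1.0]]   # out_dim x rank = 2 x 1
merged = merge_lora(base, down, up, scale=2.0)
print(merged)
```

After such a merge, the layer needs no separate LoRA branch at inference time, which is why a merged checkpoint can be loaded like an ordinary set of UNet weights.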
You first need to clone our GitHub repo. The following code loads the checkpoint.
from omegaconf import OmegaConf

from utils.common_utils import load_model_checkpoint
from utils.utils import instantiate_from_config

# Build the VideoCrafter2 model from the inference config.
config = OmegaConf.load("configs/inference_t2v_512_v2.0.yaml")
model_config = config.pop("model", OmegaConf.create())
pretrained_t2v = instantiate_from_config(model_config)

# Rebuild the UNet with time_cond_proj_dim set to 256, then swap it into the model.
unet_config = model_config["params"]["unet_config"]
unet_config["params"]["time_cond_proj_dim"] = 256
unet = instantiate_from_config(unet_config)
pretrained_t2v.model.diffusion_model = unet

# Load the merged T2V-Turbo (VC2) weights.
pretrained_t2v = load_model_checkpoint(pretrained_t2v, "checkpoints/t2v_turbo_vc2.pt")
Misuse, Malicious Use and Excessive Use
Our model is meant for research purposes.
- It is prohibited to generate content that is demeaning or harmful to people or their environment, culture, religion, etc.
- It is prohibited to generate pornographic, violent, or bloody content.
- It is prohibited to generate erroneous or false information.