metadata

license: apache-2.0
language:
  - en
pipeline_tag: image-to-video
datasets:
  - BestWishYsh/ConsisID-preview-Data
base_model:
  - THUDM/CogVideoX-5b
  - THUDM/CogVideoX1.5-5B-I2V
base_model_relation: finetune
library_name: diffusers
tags:
  - IPT2V

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

😍 Gallery

Identity-Preserving Text-to-Video Generation. You can also click here to watch the video.

Description

Repository: Code, Page, Data
Paper: arxiv.org/abs/2411.17440
Point of Contact: Shenghai Yuan

✏️ Citation

If you find our paper and code useful in your research, please consider giving us a star and citing the paper.

@article{yuan2024identity,
  title={Identity-Preserving Text-to-Video Generation by Frequency Decomposition},
  author={Yuan, Shenghai and Huang, Jinfa and He, Xianyi and Ge, Yunyuan and Shi, Yujun and Chen, Liuhan and Luo, Jiebo and Yuan, Li},
  journal={arXiv preprint arXiv:2411.17440},
  year={2024}
}