---
license: mit
---
|
|
|
# Model Card for LGVI
|
|
|
## Model Description
|
- **Paper:** https://arxiv.org/abs/2401.10226

- **Project Page:** https://jianzongwu.github.io/projects/rovi

- **GitHub Repository:** https://github.com/jianzongwu/Language-Driven-Video-Inpainting
|
|
|
### Model Summary
|
|
|
The LGVI model is trained on the [ROVI](https://huggingface.co/datasets/jianzongwu/rovi) and [Inst-Inpaint](https://github.com/abyildirim/inst-inpaint) datasets for the referring video inpainting task. Please see our [project page](https://jianzongwu.github.io/projects/rovi) for more details.
|
|
|
### Citation

```bibtex
@article{wu2024lgvi,
  title={Towards language-driven video inpainting via multimodal large language models},
  author={Wu, Jianzong and Li, Xiangtai and Si, Chenyang and Zhou, Shangchen and Yang, Jingkang and Zhang, Jiangning and Li, Yining and Chen, Kai and Tong, Yunhai and Liu, Ziwei and others},
  journal={arXiv preprint arXiv:2401.10226},
  year={2024}
}
```
|
|
|
|