---
license: mit
---

# Model Card for LGVI

## Model Description
- **Paper:** https://arxiv.org/abs/2401.10226
- **Project Page:** https://jianzongwu.github.io/projects/rovi
- **GitHub Repository:** https://github.com/jianzongwu/Language-Driven-Video-Inpainting

### Model Summary

LGVI is trained on [ROVI](https://huggingface.co/datasets/jianzongwu/rovi) and [Inst-Inpaint](https://github.com/abyildirim/inst-inpaint) for the referring inpainting task. See the [project page](https://jianzongwu.github.io/projects/rovi) for more details.
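
To inspect the ROVI training data locally, one option is to pull a snapshot of the dataset repository linked above. The sketch below is not taken from the LGVI codebase. It assumes the `huggingface_hub` package is installed, and both the helper name `fetch_rovi` and the target directory `data/rovi` are hypothetical. Only the dataset id `jianzongwu/rovi` comes from the link above.

```python
from pathlib import Path

# Dataset id taken from the ROVI link above.
ROVI_REPO_ID = "jianzongwu/rovi"


def fetch_rovi(local_dir: str = "data/rovi") -> Path:
    """Download a snapshot of the ROVI dataset repo and return its local path.

    `local_dir` is a hypothetical default; point it wherever you keep datasets.
    """
    # Imported lazily so this module still loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=ROVI_REPO_ID,
        repo_type="dataset",  # ROVI is hosted as a dataset repo, not a model repo
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `fetch_rovi()` needs network access and may transfer a large amount of data. The layout of the downloaded files is not described here, so check the dataset page before relying on specific paths.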

If you use LGVI, please cite the paper:

```
@article{wu2024lgvi,
  title={Towards language-driven video inpainting via multimodal large language models},
  author={Wu, Jianzong and Li, Xiangtai and Si, Chenyang and Zhou, Shangchen and Yang, Jingkang and Zhang, Jiangning and Li, Yining and Chen, Kai and Tong, Yunhai and Liu, Ziwei and others},
  journal={arXiv preprint arXiv:2401.10226},
  year={2024}
}