Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models Paper • 2406.09416 • Published Jun 13, 2024 • 27
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published Jun 11, 2024 • 57
COCONut Dataset Collection This is a collection of COCONut datasets accepted at CVPR2024 • 3 items • Updated Apr 29, 2024 • 4
ViTamin Family Collection Designing Scalable Vision Models in the Vision-language Era. The best performing model is 'jienengchen/ViTamin-XL-384px'. • 16 items • Updated Apr 11, 2024 • 8
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP Paper • 2308.02487 • Published Aug 4, 2023 • 12