FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model • Paper • 2410.13925 • Published Oct 17, 2024
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields • Paper • 2408.06697 • Published Aug 13, 2024
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding • Paper • 2401.09340 • Published Jan 17, 2024