From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9, 2024 • 36
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion Paper • 2403.18818 • Published Mar 27, 2024 • 24
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27, 2024 • 190
Deconstructing Denoising Diffusion Models for Self-Supervised Learning Paper • 2401.14404 • Published Jan 25, 2024 • 17