MMFactory: A Universal Solution Search Engine for Vision-Language Tasks • Paper • 2412.18072 • Published 13 days ago • 14
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models • Paper • 2412.18605 • Published 12 days ago • 17
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization • Paper • 2412.21037 • Published 7 days ago • 21