Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 13 days ago • 16
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 19 days ago • 79
Putting the Object Back into Video Object Segmentation Paper • 2310.12982 • Published Oct 19, 2023