LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 23 days ago • 59
V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians Paper • 2409.13648 • Published Sep 20, 2024 • 10
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation Paper • 2408.13252 • Published Aug 23, 2024 • 25
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published May 2, 2024 • 20