LLaVA-o1: Let Vision Language Models Reason Step-by-Step • arXiv:2411.10440 • Published Nov 15, 2024
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? • arXiv:2411.06469 • Published Nov 10, 2024
Sharingan: Extract User Action Sequence from Desktop Recordings • arXiv:2411.08768 • Published Nov 13, 2024
AnimateAnything: Consistent and Controllable Animation for Video Generation • arXiv:2411.10836 • Published Nov 16, 2024
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory • arXiv:2411.11922 • Published Nov 18, 2024