Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 30 days ago • 29
Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement Paper • 2410.15633 • Published Oct 21, 2024 • 7
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Paper • 2410.01912 • Published Oct 2, 2024 • 14
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published Jun 27, 2024 • 12
Anything in Any Scene: Photorealistic Video Object Insertion Paper • 2401.17509 • Published Jan 30, 2024 • 17