Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published about 1 month ago • 123
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models Paper • 2312.02896 • Published Dec 5, 2023 • 1
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error Paper • 2403.04746 • Published Mar 7, 2024 • 22
SDXL-Lightning: Progressive Adversarial Diffusion Distillation Paper • 2402.13929 • Published Feb 21, 2024 • 27
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21, 2024 • 114
World Model on Million-Length Video And Language With RingAttention Paper • 2402.08268 • Published Feb 13, 2024 • 37
PALP: Prompt Aligned Personalization of Text-to-Image Models Paper • 2401.06105 • Published Jan 11, 2024 • 47