Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5, 2024 • 93
Beyond Language Models: Byte Models are Digital World Simulators Paper • 2402.19155 • Published Feb 29, 2024 • 49
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 61
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 183
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Paper • 2403.00522 • Published Mar 1, 2024 • 44
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27, 2024 • 190
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 605
In deep reinforcement learning, a pruned network is a good network Paper • 2402.12479 • Published Feb 19, 2024 • 18
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21, 2024 • 114
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16, 2024 • 41
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Paper • 2402.13616 • Published Feb 21, 2024 • 46
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data Paper • 2402.08093 • Published Feb 12, 2024 • 57
Mixtures of Experts Unlock Parameter Scaling for Deep RL Paper • 2402.08609 • Published Feb 13, 2024 • 34
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement Paper • 2402.07456 • Published Feb 12, 2024 • 42
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12, 2024 • 45