Collections
Collections including paper arxiv:2402.13598
- System 2 Attention (is something you might need too)
  Paper • 2311.11829 • Published • 39
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
  Paper • 2311.11315 • Published • 6
- Alignment for Honesty
  Paper • 2312.07000 • Published • 11
- Steering Llama 2 via Contrastive Activation Addition
  Paper • 2312.06681 • Published • 11

- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 17
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 36
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 44
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 21