Collections
Discover the best community collections!
Collections including paper arxiv:2403.17297
-
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper • 2402.13753 • Published • 115 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 76 -
Larimar: Large Language Models with Episodic Memory Control
Paper • 2403.11901 • Published • 33 -
Evolutionary Optimization of Model Merging Recipes
Paper • 2403.13187 • Published • 51
-
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 41 -
Qwen Technical Report
Paper • 2309.16609 • Published • 35 -
GPT-4 Technical Report
Paper • 2303.08774 • Published • 5 -
Gemini: A Family of Highly Capable Multimodal Models
Paper • 2312.11805 • Published • 45
-
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
Paper • 2210.14986 • Published • 5 -
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Paper • 2311.10702 • Published • 19 -
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 75 -
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Paper • 2309.04269 • Published • 32
-
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Paper • 2309.09958 • Published • 18 -
TextBind: Multi-turn Interleaved Multimodal Instruction-following
Paper • 2309.08637 • Published • 7 -
Improved Baselines with Visual Instruction Tuning
Paper • 2310.03744 • Published • 37 -
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 17