Mechanistic Permutability: Match Features Across Layers Paper • 2410.07656 • Published Oct 10, 2024 • 17
Eliminating Position Bias of Language Models: A Mechanistic Approach Paper • 2407.01100 • Published Jul 1, 2024 • 6
TabReD: A Benchmark of Tabular Machine Learning in-the-Wild Paper • 2406.19380 • Published Jun 27, 2024 • 47
On scalable oversight with weak LLMs judging strong LLMs Paper • 2407.04622 • Published Jul 5, 2024 • 11
Learning to (Learn at Test Time): RNNs with Expressive Hidden States Paper • 2407.04620 • Published Jul 5, 2024 • 27
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published May 23, 2024 • 27
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published May 23, 2024 • 37
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning Paper • 2405.17258 • Published May 27, 2024 • 14
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Paper • 2405.17428 • Published May 27, 2024 • 17
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published May 27, 2024 • 52
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23, 2024 • 37
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 126
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18, 2024 • 38
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Paper • 2404.07544 • Published Apr 11, 2024 • 19