Dinozorus's Collections
The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 78
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 104
ReFT: Representation Finetuning for Language Models
Paper • 2404.03592 • Published • 91
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published • 60
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 73
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 126
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Paper • 2405.19327 • Published • 46
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Paper • 2406.08464 • Published • 65
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper • 2407.01370 • Published • 86
Searching for Best Practices in Retrieval-Augmented Generation
Paper • 2407.01219 • Published • 11
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper • 2309.03883 • Published • 34
Lynx: An Open Source Hallucination Evaluation Model
Paper • 2407.08488 • Published
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 53
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 138
Human Feedback is not Gold Standard
Paper • 2309.16349 • Published • 5
Differential Transformer
Paper • 2410.05258 • Published • 168
Were RNNs All We Needed?
Paper • 2410.01201 • Published • 48