-
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
Paper • 2204.04581 • Published • 1 -
Retrieval-Augmented Multimodal Language Modeling
Paper • 2211.12561 • Published • 1 -
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Paper • 2212.10511 • Published • 1 -
Memorizing Transformers
Paper • 2203.08913 • Published • 2
Collections
Discover the best community collections!
Collections including paper arxiv:2305.00833
-
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
Paper • 2310.04484 • Published • 5 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Paper • 2310.13332 • Published • 14
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 7 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 18 -
Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection
Paper • 2310.05035 • Published • 1 -
Chain-of-Thought Reasoning is a Policy Improvement Operator
Paper • 2309.08589 • Published • 1
-
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
Paper • 2310.13961 • Published • 4 -
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Paper • 2202.07922 • Published • 1 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 18 -
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Paper • 2309.09582 • Published • 4
-
Pretraining-Based Natural Language Generation for Text Summarization
Paper • 1902.09243 • Published • 2 -
Learning to Reason and Memorize with Self-Notes
Paper • 2305.00833 • Published • 4 -
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise
Paper • 2212.11685 • Published • 2 -
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Paper • 2408.16293 • Published • 25
-
Measuring the Effects of Data Parallelism on Neural Network Training
Paper • 1811.03600 • Published • 2 -
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Paper • 1804.04235 • Published • 2 -
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Paper • 1905.11946 • Published • 3 -
Yi: Open Foundation Models by 01.AI
Paper • 2403.04652 • Published • 62
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 18 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 19 -
Learning to Reason and Memorize with Self-Notes
Paper • 2305.00833 • Published • 4