-
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
Paper • 2204.04581 • Published • 1 -
Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code
Paper • 2303.08033 • Published • 1 -
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering
Paper • 2305.14869 • Published • 1 -
Multi-hop Commonsense Knowledge Injection Framework for Zero-Shot Commonsense Question Answering
Paper • 2305.05936 • Published • 1
Collections
Discover the best community collections!
Collections including paper arxiv:2401.10225
-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Paper • 2204.07705 • Published • 1 -
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering
Paper • 2308.13259 • Published • 2 -
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Paper • 2309.05653 • Published • 10 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 19
-
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
Paper • 2310.13961 • Published • 4 -
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Paper • 2309.09582 • Published • 4 -
Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models
Paper • 2310.13127 • Published • 11 -
Evaluating the Robustness to Instructions of Large Language Models
Paper • 2308.14306 • Published • 1
-
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?
Paper • 2412.18495 • Published • 8 -
Ultra-Sparse Memory Network
Paper • 2411.12364 • Published • 19 -
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries
Paper • 2402.13043 • Published • 2 -
Agent Workflow Memory
Paper • 2409.07429 • Published • 28
-
Video Creation by Demonstration
Paper • 2412.09551 • Published • 8 -
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper • 2412.07589 • Published • 46 -
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 71 -
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published • 38
-
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 17 -
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 69 -
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 35
-
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 53 -
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Paper • 2401.10774 • Published • 54 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 145 -
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper • 2401.12954 • Published • 29