flow2023's Collections
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 17
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Paper • 2401.15688 • Published • 11
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 69
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 35
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI
Paper • 2401.14019 • Published • 21
ChatQA: Building GPT-4 Level Conversational QA Models
Paper • 2401.10225 • Published • 34
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 24
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Paper • 2402.15000 • Published • 22
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Paper • 2402.10644 • Published • 79
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Paper • 2402.10379 • Published • 30
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 104
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 53
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 114
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
Paper • 2402.04248 • Published • 30
Rethinking Optimization and Architecture for Tiny Language Models
Paper • 2402.02791 • Published • 12
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 82
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 22
Can large language models explore in-context?
Paper • 2403.15371 • Published • 32
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Paper • 2406.04271 • Published • 28
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
Paper • 2406.06563 • Published • 17
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Paper • 2406.14491 • Published • 86
Unlocking Continual Learning Abilities in Language Models
Paper • 2406.17245 • Published • 28
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
Paper • 2406.15334 • Published • 8
A Closer Look into Mixture-of-Experts in Large Language Models
Paper • 2406.18219 • Published • 15
Characterizing Prompt Compression Methods for Long Context Inference
Paper • 2407.08892 • Published • 9
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Paper • 2407.18248 • Published • 32
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
Paper • 2409.10173 • Published • 28