🤖 Agents
Paper • 2310.03714 • Published • 32
Note: I'm not a fan of the implementation, but I think the ideas behind DSPy are interesting.
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 37
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
Paper • 2308.08155 • Published • 3
Note: The paper that introduced the concept of multi-agents!
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 186
Note: The GAIA benchmark is the most challenging benchmark for generalist agents, requiring a good web browser, multimodal capabilities, and complex multi-step task solving.
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 114
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Paper • 2402.07456 • Published • 41
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper • 2303.11366 • Published • 4
Gorilla: Large Language Model Connected with Massive APIs
Paper • 2305.15334 • Published • 5
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action
Paper • 2303.11381 • Published • 2
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Paper • 2303.17580 • Published • 10
Communicative Agents for Software Development
Paper • 2307.07924 • Published • 4
More Agents Is All You Need
Paper • 2402.05120 • Published • 51
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 15
Note: This paper is the basis for the Thought -> Action -> Observation cycle used in most agent frameworks nowadays.
Executable Code Actions Elicit Better LLM Agents
Paper • 2402.01030 • Published • 30
Note: Gives nice explanations of why writing agent actions in code is better.
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
Paper • 2310.06770 • Published • 4
Running • 105 • 🤔📊
Agent Data Analyst
Need to analyze data? Let a Llama-3.1 agent do it for you!
DynaSaur: Large Language Agents Beyond Predefined Actions
Paper • 2411.01747 • Published • 20
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper • 2411.17465 • Published • 76
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Paper • 2412.04454 • Published • 54
Note: This paper reports much more impressive scores than ShowUI, but the VLMs used are also much larger (7B and 72B vs 2B) and based on the better Qwen2.5 instead of Qwen2.
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents
Paper • 2401.00812 • Published • 4
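
The Thought -> Action -> Observation cycle mentioned in the ReAct note above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation or any framework's API: `scripted_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stand-in for an LLM call so that the loop runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry (assumption): one toy tool that sums "a+b" strings.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call (assumption): returns (thought, action, input)."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        # No observation yet: decide to call a tool.
        return ("I need to compute the sum.", "add", "2+3")
    # An observation is available: finish with its value.
    result = observations[-1].split(": ", 1)[1]
    return ("I have the result.", "finish", result)


def run_agent(task: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Loop over Thought -> Action -> Observation until the model finishes."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, action_input = scripted_model(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}[{action_input}]")
        if action == "finish":
            return action_input, history
        # Execute the chosen tool and feed its output back as an observation.
        observation = TOOLS[action](action_input)
        history.append(f"Observation: {observation}")
    raise RuntimeError("step budget exhausted")


answer, trace = run_agent("What is 2 + 3?")
for line in trace:
    print(line)
```

In a real agent the scripted function would be replaced by a model call that reads the accumulated history, which is what lets each observation shape the next thought.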