zzfive's Collections
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 14

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 24

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 30

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 6

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Paper • 2406.01014 • Published • 31

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Paper • 2406.04151 • Published • 17

τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Paper • 2406.12045 • Published • 6

Agentless: Demystifying LLM-based Software Engineering Agents
Paper • 2407.01489 • Published • 42

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
Paper • 2407.07061 • Published • 27

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Paper • 2407.10956 • Published • 6

Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning
Paper • 2407.10718 • Published • 17

POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation
Paper • 2407.14931 • Published • 21

AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Paper • 2407.15711 • Published • 9

CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis
Paper • 2407.13301 • Published • 56

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Paper • 2407.16741 • Published • 68

LAMBDA: A Large Model Based Data Agent
Paper • 2407.17535 • Published • 35

AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Paper • 2407.18901 • Published • 33

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Paper • 2407.20183 • Published • 41

GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Paper • 2408.01584 • Published • 7

Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
Paper • 2408.03615 • Published • 30

CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases
Paper • 2408.03910 • Published • 15

Automated Design of Agentic Systems
Paper • 2408.08435 • Published • 38

Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Paper • 2409.04593 • Published • 23

Paper • 2409.07429 • Published • 28

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
Paper • 2409.07440 • Published • 6

HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
Paper • 2409.16299 • Published • 10

MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making
Paper • 2409.16686 • Published • 10

Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise
Paper • 2410.03017 • Published • 26

Agent S: An Open Agentic Framework that Uses Computers Like a Human
Paper • 2410.08164 • Published • 24

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
Paper • 2410.11710 • Published • 19

Agent-as-a-Judge: Evaluate Agents with Agents
Paper • 2410.10934 • Published • 18

Revealing the Barriers of Language Agents in Planning
Paper • 2410.12409 • Published • 25

MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Paper • 2410.13757 • Published • 32

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Paper • 2410.13232 • Published • 41

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
Paper • 2410.18603 • Published • 32

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
Paper • 2410.20424 • Published • 39

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
Paper • 2410.19609 • Published • 17

Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use
Paper • 2410.24218 • Published • 5

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 46

Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Paper • 2411.00412 • Published • 9

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents
Paper • 2410.24024 • Published • 48

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 35

Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model
Paper • 2411.04496 • Published • 22

GazeGen: Gaze-Driven User Interaction for Visual Content Generation
Paper • 2411.04335 • Published • 14

The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
Paper • 2411.10323 • Published • 31

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
Paper • 2411.06559 • Published • 11

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Paper • 2411.13543 • Published • 18

SketchAgent: Language-Driven Sequential Sketch Generation
Paper • 2411.17673 • Published • 18

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
Paper • 2411.17188 • Published • 21

Large Language Model-Brained GUI Agents: A Survey
Paper • 2411.18279 • Published • 27

MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 39

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Paper • 2412.04454 • Published • 54

Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 71

The BrowserGym Ecosystem for Web Agent Research
Paper • 2412.05467 • Published • 19

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
Paper • 2412.09605 • Published • 26

Large Action Models: From Inception to Implementation
Paper • 2412.10047 • Published • 31

Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models
Paper • 2412.09645 • Published • 35

Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
Paper • 2412.13194 • Published • 12

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 47

Paper • 2412.13501 • Published • 23

PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World
Paper • 2412.17589 • Published • 12

Agent-SafetyBench: Evaluating the Safety of LLM Agents
Paper • 2412.14470 • Published • 11

Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 16

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 63