MobA: A Two-Level Agent System for Efficient Mobile Task Automation Paper • 2410.13757 • Published Oct 17, 2024 • 32 • 3
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16, 2024 • 31
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback Paper • 2403.18349 • Published Mar 27, 2024
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI Paper • 2205.11029 • Published May 23, 2022
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding Paper • 2402.18262 • Published Feb 28, 2024
MULTI: Multimodal Understanding Leaderboard with Text and Images Paper • 2402.03173 • Published Feb 5, 2024 • 3