Agent-SafetyBench: Evaluating the Safety of LLM Agents • Paper • 2412.14470 • Published 18 days ago • 11
ProcessBench: Identifying Process Errors in Mathematical Reasoning • Paper • 2412.06559 • Published 28 days ago • 72
DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling • Paper • 2412.04905 • Published about 1 month ago • 7
Enhancing Long-form Text Generation in Mental Health with Task-adaptive Tokenization • Paper • 2310.05317 • Published Oct 9, 2023
CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models • Paper • 2311.16832 • Published Nov 28, 2023 • 1
EmoBench: Evaluating the Emotional Intelligence of Large Language Models • Paper • 2402.12071 • Published Feb 19, 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models • Paper • 2410.17215 • Published Oct 22, 2024 • 14
Data Selection via Optimal Control for Language Models • Paper • 2410.07064 • Published Oct 9, 2024 • 8
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding • Paper • 2409.03420 • Published Sep 5, 2024 • 26
Platypus: A Generalized Specialist Model for Reading Text in Various Forms • Paper • 2408.14805 • Published Aug 27, 2024 • 13