MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design • Paper • 2412.14590 • Published 18 days ago • 13
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation • Paper • 2412.13649 • Published 19 days ago • 20
Offline Reinforcement Learning for LLM Multi-Step Reasoning • Paper • 2412.16145 • Published 16 days ago • 36
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning • Paper • 2412.16849 • Published 15 days ago • 7
NILE: Internal Consistency Alignment in Large Language Models • Paper • 2412.16686 • Published 16 days ago • 8
Agent-SafetyBench: Evaluating the Safety of LLM Agents • Paper • 2412.14470 • Published 18 days ago • 11
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought • Paper • 2412.17498 • Published 14 days ago • 21
Revisiting In-Context Learning with Long Context Language Models • Paper • 2412.16926 • Published 15 days ago • 27
Large Motion Video Autoencoding with Cross-modal Video VAE • Paper • 2412.17805 • Published 13 days ago • 23
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning • Paper • 2412.15797 • Published 17 days ago • 16
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search • Paper • 2412.18319 • Published 13 days ago • 34
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs • Paper • 2412.18925 • Published 12 days ago • 86
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System • Paper • 2412.20005 • Published 9 days ago • 13