xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning Paper • 2401.07037 • Published Jan 13, 2024 • 2
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! Paper • 2402.12343 • Published Feb 19, 2024
m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt Paper • 2403.17556 • Published Mar 26, 2024 • 1
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis Paper • 2404.01204 • Published Apr 1, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Paper • 2404.04167 • Published Apr 5, 2024 • 12
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 14
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models Paper • 2406.01359 • Published Jun 3, 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models Paper • 2406.01375 • Published Jun 3, 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models Paper • 2406.05862 • Published Jun 9, 2024 • 4
UniCoder: Scaling Code Large Language Model via Universal Code Paper • 2406.16441 • Published Jun 24, 2024 • 2
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models Paper • 2406.14550 • Published Jun 20, 2024 • 4
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association Paper • 2407.17379 • Published Jul 24, 2024 • 2
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15, 2024 • 32
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models Paper • 2410.11710 • Published Oct 15, 2024 • 19
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks Paper • 2410.06526 • Published Oct 9, 2024
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions Paper • 2410.20424 • Published Oct 27, 2024 • 39
Aligning CodeLLMs with Direct Preference Optimization Paper • 2410.18585 • Published Oct 24, 2024
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models Paper • 2411.07140 • Published Nov 11, 2024 • 33
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models Paper • 2412.15265 • Published 20 days ago