Fan Zhou

koalazf99

https://koalazf99.github.io/

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

authored a paper 6 days ago

Diving into Self-Evolving Training for Multimodal Reasoning

liked a model 11 days ago

hkust-nlp/mstar-prm-8b-v1.0

upvoted a collection 11 days ago

koalazf99's activity

upvoted a collection 11 days ago

M-STAR

Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated 12 days ago • 2

upvoted 2 papers 13 days ago

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published 14 days ago • 41

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published 14 days ago • 44

upvoted a paper 15 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 18 days ago • 335

upvoted a paper 22 days ago

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Paper • 2412.09605 • Published 24 days ago • 26

upvoted a paper about 1 month ago

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 57

upvoted 5 collections about 1 month ago

Sailor2 Models

6 items • Updated Dec 4, 2024 • 4

Sailor2 Benchmarks

1 item • Updated Dec 3, 2024 • 2

Sailor2 Pre-training Datasets

8 items • Updated Dec 4, 2024 • 4

Sailor2 Post-training Datasets

3 items • Updated Dec 3, 2024 • 5

🔱 Sailor2 Language Models

Sailing in South-East Asia with Inclusive Multilingual LLMs • 9 items • Updated Dec 3, 2024 • 22

upvoted a paper about 1 month ago

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

Paper • 2411.13476 • Published Nov 20, 2024 • 15

upvoted a paper about 2 months ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 113

upvoted 2 collections 2 months ago

💡 DICE

Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 9

🫐 ProX Projects

Collection for: "Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale" • 18 items • Updated 25 days ago • 2

upvoted a paper 3 months ago

Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining

Paper • 2409.02326 • Published Sep 3, 2024 • 18

upvoted a collection 3 months ago

Llama-3.1-Nemotron-70B

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 10 hours ago • 149

upvoted a paper 3 months ago

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Paper • 2410.07137 • Published Oct 9, 2024 • 7

upvoted 2 collections 3 months ago

📑Trending Papers - September 9⃣️

10 items • Updated 13 days ago • 9

ProX Refining Models

Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 2