118 308

忍者

byteprobe

AI & ML interests

RL | NLP | LLM | LMM | agent

Recent Activity

liked a model 1 day ago

jxm/cde-small-v2

liked a model 2 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

upvoted an article 2 days ago

🐺🐦‍⬛ LLM Comparison/Test: Phi-4, Qwen2 VL 72B Instruct, Aya Expanse 32B in my updated MMLU-Pro CS benchmark

byteprobe's activity

upvoted an article 2 days ago

Article

🐺🐦‍⬛ LLM Comparison/Test: Phi-4, Qwen2 VL 72B Instruct, Aya Expanse 32B in my updated MMLU-Pro CS benchmark

12 days ago • 3

upvoted 19 papers 2 days ago

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published 6 days ago • 35

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 6 days ago • 89

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Paper • 2501.09755 • Published 6 days ago • 32

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published 6 days ago • 43

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 6 days ago • 63

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published 23 days ago • 36

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 26 days ago • 81

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 20 days ago • 48

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 21 days ago • 97

Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published 17 days ago • 41

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published 18 days ago • 86

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 15 days ago • 66

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Paper • 2501.04686 • Published 14 days ago • 50

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 13 days ago • 77

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 14 days ago • 80

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 14 days ago • 87

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 14 days ago • 243

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published 13 days ago • 49

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 12 days ago • 58