Xuxi Chen

Xuxi

AI & ML interests

None yet

Organizations

None yet

Xuxi's activity

upvoted a paper 18 days ago

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Paper • 2412.13795 • Published 19 days ago • 18

upvoted a paper 27 days ago

APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published about 1 month ago • 38

upvoted a paper 6 months ago

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Paper • 2407.08296 • Published Jul 11, 2024 • 31

upvoted a paper 8 months ago

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1, 2024 • 30

authored a paper 8 months ago

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Paper • 2310.06982 • Published Oct 10, 2023

upvoted a paper 10 months ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 183

upvoted a paper about 1 year ago

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 71

authored a paper about 1 year ago

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 71