Xi

xi0v

AI & ML interests

Diffusion Model Merging, LLM Merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

liked a model about 4 hours ago

Yntec/noobaiiter-xl-v10-sdxl

liked a model about 12 hours ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

liked a model about 12 hours ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

xi0v's activity

upvoted an article about 14 hours ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

6 days ago • 113

upvoted a paper 5 days ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published 7 days ago • 44

upvoted a paper 9 days ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 12 days ago • 77

upvoted a paper 10 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 13 days ago • 237

upvoted a paper 12 days ago

DeMo: Decoupled Momentum Optimization

Paper • 2411.19870 • Published Nov 29, 2024 • 5

upvoted a paper 25 days ago

Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching

Paper • 2412.17153 • Published 29 days ago • 34

upvoted an article 27 days ago

Article

Deriving DPO's Loss

28 days ago • 26

upvoted a paper about 1 month ago

Autoregressive Video Generation without Vector Quantization

Paper • 2412.14169 • Published Dec 18, 2024 • 14

upvoted a collection about 1 month ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 125

upvoted a paper about 1 month ago

ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer

Paper • 2412.07720 • Published Dec 10, 2024 • 30

upvoted a collection about 1 month ago

EXAONE-3.5

EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B. • 10 items • Updated Dec 10, 2024 • 88

upvoted a paper about 1 month ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 46

upvoted 2 collections about 2 months ago

Toxic Commons

Tools for de-toxifying public domain data, especially multilingual and historical text data and data with OCR errors. • 3 items • Updated Oct 31, 2024 • 5

Common Models

The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 28

upvoted 2 articles about 2 months ago

Article

They Said It Couldn’t Be Done

Dec 5, 2024 • 77

Article

EuroLLM-9B

Dec 2, 2024 • 105

upvoted 2 papers about 2 months ago

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens

Paper • 2411.17691 • Published Nov 26, 2024 • 11

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 58

upvoted a collection about 2 months ago

ESFT

Models for the paper on expert-specialized fine-tuning • 15 items • Updated Aug 16, 2024 • 5

upvoted a paper about 2 months ago

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 28