Chenyang Song

Raincleared

Recent Activity

new activity 11 days ago

SparseLLM/prosparse-llama-2-7b: Model not running on CPU, due to flash_attn package requirement.

new activity 18 days ago

SparseLLM/ReluLLaMA-7B: Adding `safetensors` variant of this model

new activity 25 days ago

SparseLLM/sparsing-law-0.1b-relu: Adding `safetensors` variant of this model

Raincleared's activity

upvoted a paper about 1 month ago

Densing Law of LLMs

Paper • 2412.04315 • Published Dec 5, 2024 • 17

upvoted a paper 2 months ago

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Paper • 2411.02335 • Published Nov 4, 2024 • 11

upvoted a paper 4 months ago

Configurable Foundation Models: Building LLMs from a Modular Perspective

Paper • 2409.02877 • Published Sep 4, 2024 • 27

upvoted a paper 6 months ago

Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models

Paper • 2406.15718 • Published Jun 22, 2024 • 14

upvoted 2 papers 7 months ago

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Paper • 2406.05955 • Published Jun 10, 2024 • 22

PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Paper • 2406.06282 • Published Jun 10, 2024 • 36

upvoted a collection 7 months ago

MiniCPM

Collection

The MiniCPM family of LLMs and VLLMs. • 31 items • Updated Oct 22, 2024 • 56

upvoted a collection 9 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated about 1 month ago • 698

upvoted 3 papers 11 months ago

upvoted 2 papers 12 months ago

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11, 2024 • 44

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 158

upvoted 2 papers about 1 year ago

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Paper • 2401.00448 • Published Dec 31, 2023 • 28

Beyond Surface: Probing LLaMA Across Scales and Layers

Paper • 2312.04333 • Published Dec 7, 2023 • 18