- YAYI 2: Multilingual Open-Source Large Language Models
  Paper • 2312.14862 • Published • 13
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 56
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 66
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 44
Collections including paper arxiv:2310.11453
- QuIP: 2-Bit Quantization of Large Language Models With Guarantees
  Paper • 2307.13304 • Published • 2
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
  Paper • 2306.03078 • Published • 3
- OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
  Paper • 2308.13137 • Published • 17
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  Paper • 2306.00978 • Published • 9
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 145
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 96
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 91
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 257
- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  Paper • 2311.00430 • Published • 57
- SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
  Paper • 2307.01952 • Published • 84
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 2
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 96
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 75
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 42
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 41