Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

rho-math-7b-interpreter-v0.1 - GGUF
- Model creator: https://huggingface.co/microsoft/
- Original model: https://huggingface.co/microsoft/rho-math-7b-interpreter-v0.1/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [rho-math-7b-interpreter-v0.1.Q2_K.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q2_K.gguf) | Q2_K | 2.53GB |
| [rho-math-7b-interpreter-v0.1.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.IQ3_XS.gguf) | IQ3_XS | 2.81GB |
| [rho-math-7b-interpreter-v0.1.IQ3_S.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.IQ3_S.gguf) | IQ3_S | 2.96GB |
| [rho-math-7b-interpreter-v0.1.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q3_K_S.gguf) | Q3_K_S | 2.95GB |
| [rho-math-7b-interpreter-v0.1.IQ3_M.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.IQ3_M.gguf) | IQ3_M | 3.06GB |
| [rho-math-7b-interpreter-v0.1.Q3_K.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q3_K.gguf) | Q3_K | 3.28GB |
| [rho-math-7b-interpreter-v0.1.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q3_K_M.gguf) | Q3_K_M | 3.28GB |
| [rho-math-7b-interpreter-v0.1.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q3_K_L.gguf) | Q3_K_L | 3.56GB |
| [rho-math-7b-interpreter-v0.1.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.IQ4_XS.gguf) | IQ4_XS | 3.67GB |
| [rho-math-7b-interpreter-v0.1.Q4_0.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q4_0.gguf) | Q4_0 | 3.83GB |
| [rho-math-7b-interpreter-v0.1.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.IQ4_NL.gguf) | IQ4_NL | 3.87GB |
| [rho-math-7b-interpreter-v0.1.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q4_K_S.gguf) | Q4_K_S | 3.86GB |
| [rho-math-7b-interpreter-v0.1.Q4_K.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q4_K.gguf) | Q4_K | 4.07GB |
| [rho-math-7b-interpreter-v0.1.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q4_K_M.gguf) | Q4_K_M | 4.07GB |
| [rho-math-7b-interpreter-v0.1.Q4_1.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q4_1.gguf) | Q4_1 | 4.24GB |
| [rho-math-7b-interpreter-v0.1.Q5_0.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q5_0.gguf) | Q5_0 | 4.65GB |
| [rho-math-7b-interpreter-v0.1.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q5_K_S.gguf) | Q5_K_S | 4.65GB |
| [rho-math-7b-interpreter-v0.1.Q5_K.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q5_K.gguf) | Q5_K | 4.78GB |
| [rho-math-7b-interpreter-v0.1.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q5_K_M.gguf) | Q5_K_M | 4.78GB |
| [rho-math-7b-interpreter-v0.1.Q5_1.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q5_1.gguf) | Q5_1 | 5.07GB |
| [rho-math-7b-interpreter-v0.1.Q6_K.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q6_K.gguf) | Q6_K | 5.53GB |
| [rho-math-7b-interpreter-v0.1.Q8_0.gguf](https://huggingface.co/RichardErkhov/microsoft_-_rho-math-7b-interpreter-v0.1-gguf/blob/main/rho-math-7b-interpreter-v0.1.Q8_0.gguf) | Q8_0 | 7.17GB |

Original model description:

---
license: mit
tags:
- nlp
- math
language:
- en
pipeline_tag: text-generation
---
[📜 Arxiv] • [💬 HF Paper] • [🤗 Models] • [🐱 GitHub]
Figure 1: Rho-1 is pre-trained with Selective Language Modeling (SLM). SLM improves average few-shot accuracy on GSM8k and MATH by over 16%, achieving the baseline performance 5-10x faster.
Figure 2:
Upper: Even an extensively filtered pretraining corpus contains token-level noise.
Left: Conventional Causal Language Modeling (CLM) trains on all tokens.
Right: Our proposed Selective Language Modeling (SLM) applies the loss only to useful, clean tokens.
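
The contrast in Figure 2 can be written as two training objectives. The notation here is an assumption made for illustration and does not come from this card: $N$ is the number of tokens in a sequence, $x_i$ is the $i$-th token, $\theta$ are the model parameters, and $m_i \in \{0, 1\}$ is a hypothetical per-token mask that is 1 for tokens judged useful and clean.

```latex
% CLM: every token contributes to the loss.
\mathcal{L}_{\mathrm{CLM}}(\theta)
  = -\frac{1}{N} \sum_{i=1}^{N} \log P(x_i \mid x_{<i}; \theta)

% SLM: only the selected tokens (m_i = 1) contribute.
\mathcal{L}_{\mathrm{SLM}}(\theta)
  = -\frac{1}{\sum_{i=1}^{N} m_i} \sum_{i=1}^{N} m_i \, \log P(x_i \mid x_{<i}; \theta)
```

With $m_i = 1$ for every token, the second expression reduces to the first, so under this notation CLM is the special case of SLM in which nothing is filtered out.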
Figure 3: The pipeline of Selective Language Modeling.
SLM optimizes language model performance by concentrating on valuable, clean tokens during pre-training.
It involves three steps:
(Step 1) Initially, train a reference model on high-quality data.
(Step 2) Then, score each token's loss in a corpus using the reference model.
(Step 3) Finally, train the language model selectively on tokens with a high excess loss, that is, tokens whose loss under the model being trained is well above their loss under the reference model.
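
Steps 2 and 3 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-token losses are supplied as ready-made lists rather than computed by real models, and the function names, the `keep_ratio` parameter, and the "keep the top fraction of tokens ranked by excess loss" rule are hypothetical choices made for this example.

```python
from typing import List


def select_tokens_by_excess_loss(
    train_losses: List[float],
    ref_losses: List[float],
    keep_ratio: float,
) -> List[bool]:
    """Return a mask keeping the tokens with the highest excess loss.

    keep_ratio is a hypothetical knob: the fraction of tokens to train on.
    """
    if not train_losses or len(train_losses) != len(ref_losses):
        raise ValueError("loss lists must be non-empty and of equal length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Excess loss: how much worse the model being trained does on a token
    # than the reference model does on the same token.
    excess = [t - r for t, r in zip(train_losses, ref_losses)]

    # Keep at least one token so the averaged loss is always defined.
    n_keep = max(1, int(len(excess) * keep_ratio))

    # Rank token indices by descending excess loss (the sort is stable,
    # so tied tokens keep their original order).
    ranked = sorted(range(len(excess)), key=lambda i: excess[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(excess))]


def selective_loss(train_losses: List[float], mask: List[bool]) -> float:
    """Average the training loss over the selected tokens only."""
    selected = [loss for loss, keep in zip(train_losses, mask) if keep]
    return sum(selected) / len(selected)


# Toy per-token losses (made up for illustration).
train_losses = [2.0, 0.5, 3.0, 1.0]  # model being trained
ref_losses = [1.9, 0.4, 1.0, 0.2]    # reference model (Step 2)

mask = select_tokens_by_excess_loss(train_losses, ref_losses, keep_ratio=0.5)
print(mask)                                # [False, False, True, True]
print(selective_loss(train_losses, mask))  # 2.0
```

In this toy example the first two tokens are dropped because the reference model finds them almost as hard as the model being trained does, so their excess loss is small. A plain CLM average over all four tokens would be 1.625, while the selective average over the two kept tokens is 2.0.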