Checkpoint of a Llama 2 7B model with 50% of its parameters pruned in one-shot using SparseGPT, then retrained on 40B tokens from SlimPajama while maintaining sparsity.
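
The checkpoint is stored as ordinary dense tensors, so the sparsity can be checked directly after loading it with `transformers`. A minimal sketch using the standard `transformers`/`torch` APIs; excluding the embedding and output projection from the count is an assumption about which layers were pruned:

```python
# Minimal sketch: load the checkpoint and measure the weight sparsity claimed above.
# Excluding the embedding and lm_head weights is an assumption about which layers
# SparseGPT pruned; everything else is the standard transformers/torch API.
import torch
from transformers import AutoModelForCausalLM

model_id = "nm-testing/Llama-2-7b-pruned40-retrained"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

zeros, total = 0, 0
for name, param in model.named_parameters():
    # Count only the 2-D decoder projection weights (attention and MLP linears).
    if param.dim() == 2 and "embed_tokens" not in name and "lm_head" not in name:
        zeros += (param == 0).sum().item()
        total += param.numel()

print(f"Zeroed fraction of pruned weights: {zeros / total:.2%}")
```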

  • Model: Llama 2
  • Size: 7B
  • LR: 3.00E-4
  • Dataset: SlimPajama
  • Retrained tokens: 40B
  • Notes: no warmup + decay to 0.0
  • Eval Harness (recovery vs. the dense baseline in parentheses; a reproduction sketch follows this list):
    • Commonsense Reasoning: 62.2 (97.65%)
    • Reading Comprehension: 57.7 (98.30%)
    • World Knowledge: 42.4 (97.65%)
    • Math: 6.1 (74.39%)
    • Code: 16.2 (98.78%)
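
The category scores above aggregate Eval Harness tasks; the exact task list and few-shot settings are not given here, so the snippet below is only a reproduction sketch. It assumes the lm-evaluation-harness v0.4+ Python API and an illustrative commonsense task set:

```python
# Reproduction sketch with EleutherAI's lm-evaluation-harness (assumes the v0.4+ Python API).
# The task list and settings are illustrative; the card does not state which tasks make up
# each category, so these numbers will not match the scores above exactly.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nm-testing/Llama-2-7b-pruned40-retrained,dtype=float16",
    tasks=["arc_challenge", "hellaswag", "winogrande"],  # assumed commonsense tasks
    batch_size=8,
)
print(results["results"])
```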