---
base_model: meta-llama/Meta-Llama-3-8B
inference: true
model_type: llama
pipeline_tag: text-generation
tags:
- sparse
---

# Meta-Llama-3-8B-pruned_50.2of4

This repo contains model files for a 2:4 (N:M) sparse [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model pruned in one-shot with [SparseGPT](https://arxiv.org/abs/2301.00774) and then retrained with [SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation while maintaining the 2:4 sparsity mask.
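
In a 2:4 pattern, every group of four consecutive weights along the input dimension of a pruned linear layer contains at least two zeros, a structure that runtimes and hardware with 2:4 support can exploit. As a rough illustration (not part of the original card; the helper name and the choice to skip `lm_head` are assumptions), the pattern can be checked directly on the checkpoint's weights:

```python
# Rough sketch (assumption): count non-zeros in every group of 4 consecutive
# weights along each row of the pruned linear layers; a 2:4 layer should have
# at most 2 non-zeros per group. Embeddings and lm_head are typically left dense.
import torch
from transformers import AutoModelForCausalLM

def check_2of4(weight: torch.Tensor) -> bool:
    groups = weight.reshape(-1, 4)  # groups of 4 consecutive weights per row
    return bool(((groups != 0).sum(dim=1) <= 2).all())

model = AutoModelForCausalLM.from_pretrained(
    "nm-testing/Meta-Llama-3-8B-pruned_50.2of4", torch_dtype="auto"
)
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear) and "lm_head" not in name:
        print(f"{name}: 2:4 pattern = {check_2of4(module.weight.data)}")
```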

### Running the model

```python
# pip install transformers accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nm-testing/Meta-Llama-3-8B-pruned_50.2of4")
model = AutoModelForCausalLM.from_pretrained("nm-testing/Meta-Llama-3-8B-pruned_50.2of4", device_map="auto")

# Tokenize the prompt and move it to the device the model was placed on.
input_text = "A poem about Machine Learning goes as follows:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

## Evaluation Benchmark Results

Model evaluation results were obtained via [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) following the configuration of the [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).
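
For reference, the harness can also be driven from Python; the snippet below is a hedged sketch using lm-eval's `simple_evaluate` API (v0.4+) to score a single task (HellaSwag, 10-shot, as in the table below). The batch size, dtype, and the decision to run just one task are illustrative assumptions; the authoritative task list and few-shot settings are those of the leaderboard configuration linked above.

```python
# Hedged sketch: evaluate one leaderboard-style task with lm-evaluation-harness (v0.4+ Python API).
# Task selection, few-shot count, dtype, and batch size are illustrative assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nm-testing/Meta-Llama-3-8B-pruned_50.2of4,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,
    batch_size=8,
)
print(results["results"])
```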

| Benchmark | Meta-Llama-3-8B | Meta-Llama-3-8B-pruned_50.2of4<br>(this model) |
|:---------------------------------------------------------:|:---------------:|:----------------------------------------------:|
| [ARC-c](https://arxiv.org/abs/1911.01547)<br>25-shot | 59.47% | 57.76% |
| [MMLU](https://arxiv.org/abs/2009.03300)<br>5-shot | 65.29% | 60.44% |
| [HellaSwag](https://arxiv.org/abs/1905.07830)<br>10-shot | 82.14% | 79.97% |
| [WinoGrande](https://arxiv.org/abs/1907.10641)<br>5-shot | 77.27% | 77.19% |
| [GSM8K](https://arxiv.org/abs/2110.14168)<br>5-shot | 44.81% | 47.92% |
| [TruthfulQA](https://arxiv.org/abs/2109.07958)<br>0-shot | 43.96% | 41.02% |
| **Average<br>Accuracy** | **62.16%** | **60.72%** |
| **Recovery** | **100%** | **97.68%** |
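
The "Recovery" row above is consistent with taking the ratio of the pruned model's average accuracy to the dense baseline's average. A quick arithmetic check (illustrative, not part of the original card):

```python
# Reproduce the "Average Accuracy" and "Recovery" rows from the table above.
dense  = [59.47, 65.29, 82.14, 77.27, 44.81, 43.96]  # Meta-Llama-3-8B
pruned = [57.76, 60.44, 79.97, 77.19, 47.92, 41.02]  # this model

dense_avg  = sum(dense) / len(dense)       # ~62.16
pruned_avg = sum(pruned) / len(pruned)     # ~60.72
recovery   = 100 * pruned_avg / dense_avg  # ~97.68

print(f"{dense_avg:.2f}% {pruned_avg:.2f}% {recovery:.2f}%")
```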

Model evaluation results were obtained via the [Mosaic Eval Gauntlet](https://github.com/mosaicml/llm-foundry/blob/main/scripts/eval/local_data/EVAL_GAUNTLET.md) following the configuration of [Eval Gauntlet v0.3](https://github.com/mosaicml/llm-foundry/blob/main/scripts/eval/yamls/eval_gauntlet_v0.3.yaml).

| Benchmark | Meta-Llama-3-8B | Meta-Llama-3-8B-pruned_50.2of4<br>(this model) |
|:------------------------:|:---------------:|:----------------------------------------------:|
| World Knowledge | 58.08% | 54.61% |
| Commonsense Reasoning | 47.66% | 47.62% |
| Language Understanding | 71.13% | 67.58% |
| Symbolic Problem Solving | 38.44% | 32.15% |
| Reading Comprehension | 57.48% | 55.76% |
| **Average Accuracy** | **54.70%** | **51.54%** |
| **Recovery** | **100%** | **94.22%** |

## Help

For further support and discussions on these models and AI in general, join [Neural Magic's Slack Community](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ).