Commit 4b35527 (verified) by mgoin · 1 parent: 0229442

Update README.md

Files changed (1): README.md (+18 -0)
README.md CHANGED
@@ -15,6 +15,7 @@ This repo contains model files for a 2:4 (N:M) sparse [Meta-Llama-3-8B](meta-lla
 
 ## Running the model
 
+It can be run naively in transformers for testing purposes:
 ```python
 # pip install transformers accelerate
 from transformers import AutoTokenizer, AutoModelForCausalLM
@@ -29,6 +30,23 @@ outputs = model.generate(**input_ids)
 print(tokenizer.decode(outputs[0]))
 ```
 
+To take advantage of the 2:4 sparsity present, install [nm-vllm](https://github.com/neuralmagic/nm-vllm) for fast inference and low memory-usage:
+```bash
+pip install nm-vllm[sparse] --extra-index-url https://pypi.neuralmagic.com/simple
+```
+
+```python
+from vllm import LLM, SamplingParams
+
+model = LLM("nm-testing/SparseLlama-3-8B-pruned_50.2of4", sparsity="semi_structured_sparse_w16a16")
+
+prompt = "A poem about Machine Learning goes as follows:"
+sampling_params = SamplingParams(max_tokens=100, temperature=0)
+
+outputs = model.generate(prompt, sampling_params=sampling_params)
+print(outputs[0].outputs[0].text)
+```
+
 ## Evaluation Benchmark Results
 
  Model evaluation results obtained via [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) following the configuration of [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).
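
Note on the diff: the added README text refers to "2:4 sparsity" without spelling out the pattern. As a rough illustration only (not part of this commit, and not how nm-vllm stores or checks weights), a 2:4 pattern means that every group of 4 consecutive weights holds at most 2 nonzero values. The helper name `is_n_of_m_sparse` and the example rows below are hypothetical and chosen purely for this sketch:

```python
from typing import Sequence


def is_n_of_m_sparse(weights: Sequence[float], n: int = 2, m: int = 4) -> bool:
    """Return True if every group of m consecutive weights has at most n nonzeros.

    Hypothetical helper for illustration; it is not an nm-vllm or transformers API.
    """
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        nonzeros = sum(1 for w in group if w != 0)
        if nonzeros > n:
            return False
    return True


# Made-up example rows, each containing two groups of four weights.
dense_row = [0.5, -1.2, 0.3, 0.8, 0.1, 0.0, -0.4, 0.9]
sparse_row = [0.5, 0.0, 0.0, 0.8, 0.0, 0.0, -0.4, 0.9]

print(is_n_of_m_sparse(dense_row))   # False: the first group has 4 nonzeros
print(is_n_of_m_sparse(sparse_row))  # True: each group has exactly 2 nonzeros
```

Because half of each group is zero, a runtime with sparse kernels can skip those entries, which is the source of the speed and memory savings the README mentions.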