Update README.md

README.md +86 -3
@@ -1,3 +1,86 @@
----
-license: apache-2.0
----

+---
+language:
+- en
+tags:
+- pytorch
+- causal-lm
+- pythia
+- autoround
+- intel
+- intel-autoround
+- awq
+- autoawq
+- woq
+license: apache-2.0
+model_name: Pythia 410m deduped
+base_model: EleutherAI/pythia-410m-deduped
+inference: false
+model_creator: EleutherAI
+datasets:
+- EleutherAI/pile
+pipeline_tag: text-generation
+prompt_template: '{prompt}
+  '
+quantized_by: fbaldassarri
+---
+## Model Information
+Quantized version of [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped), using torch.float32 for quantization tuning.
+- 4 bits (INT4)
+- group size = 128
+- Asymmetric quantization
+- Method: AutoAWQ
+
+Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round)
+
+Note: this INT4 version of pythia-410m-deduped has been quantized for CPU inference.
+## Replication Recipe
+### Step 1 Install Requirements
+I suggest installing the requirements into a dedicated Python virtualenv or a conda environment.
+```
+python -m pip install <package> --upgrade
+```
+- accelerate==1.0.1
+- auto_gptq==0.7.1
+- neural_compressor==3.1
+- torch==2.3.0+cpu
+- torchaudio==2.5.0+cpu
+- torchvision==0.18.0+cpu
+- transformers==4.45.2
+### Step 2 Build and install Intel AutoRound from source
+```
+python -m pip install git+https://github.com/intel/auto-round.git
+```
+### Step 3 Script for Quantization
+```
+from auto_round import AutoRound
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load the base model and its tokenizer
+model_name = "EleutherAI/pythia-410m-deduped"
+model = AutoModelForCausalLM.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+# 4-bit weights, group size 128, asymmetric quantization (sym=False)
+bits, group_size, sym = 4, 128, False
+autoround = AutoRound(model, tokenizer, nsamples=128, iters=200, seqlen=512, batch_size=4, bits=bits, group_size=group_size, sym=sym)
+autoround.quantize()
+
+# Export the quantized weights in the AutoAWQ format
+output_dir = "./AutoRound/EleutherAI_pythia-410m-deduped-autoawq-int4-gs128-asym"
+autoround.save_quantized(output_dir, format='auto_awq', inplace=True)
+```
+## License
+[Apache 2.0 License](https://choosealicense.com/licenses/apache-2.0/)
+## Disclaimer
+This quantized model comes with no warranty. It has been developed for research purposes only.
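
The settings named in the README above (4 bits, group size 128, asymmetric) can be illustrated with a minimal pure-Python sketch. It is not the AutoRound algorithm: it uses plain round-to-nearest, whereas AutoRound tunes the rounding over calibration samples. The function names and the random test row are hypothetical and exist only for this illustration.

```python
import random

def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats."""
    qmax = 2 ** bits - 1                      # 15 for INT4
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax
    if scale == 0.0:                          # constant group: avoid dividing by zero
        scale = 1.0
    zero_point = round(-w_min / scale)        # integer code that represents 0.0
    q = [min(qmax, max(0, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_group(q, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(v - zero_point) * scale for v in q]

def quantize_row(row, group_size=128, bits=4):
    """Split a weight row into groups, each with its own scale and zero point."""
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]

# Hypothetical weight row: 512 small Gaussian values, i.e. 4 groups of 128.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]

groups = quantize_row(row)
restored = [x for q, s, z in groups for x in dequantize_group(q, s, z)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups={len(groups)} max_abs_error={max_err:.6f}")
```

Because each group of 128 weights gets its own scale and zero point, the reconstruction error for any weight is bounded by half of that group's scale. Asymmetric here means the integer range 0..15 is fitted to the group's actual minimum and maximum instead of a range centered on zero.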