---
license: other
license_name: llama3
tags:
- llama-3
- conversational
---

# OxxoCodes/Meta-Llama-3-70B-Instruct-GPTQ

*Built with Meta Llama 3*

Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

# Model Description

This is a 4-bit GPTQ quantized version of [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).

This model was quantized using the following quantization config:

```python
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,            # 4-bit weight quantization
    group_size=128,    # group size for the quantization grid
    desc_act=False,    # activation-order (act-order) quantization disabled
    damp_percent=0.1,  # dampening factor for the GPTQ Hessian update
)
```
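
For reference, an end-to-end quantization run with AutoGPTQ typically follows the sketch below. This is a minimal sketch, not the exact setup used for this model: the calibration text and output path are illustrative, and real runs calibrate on a few hundred samples.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_id = "meta-llama/Meta-Llama-3-70B-Instruct"
quantized_model_dir = "Meta-Llama-3-70B-Instruct-GPTQ"  # illustrative output path

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_id)

# GPTQ calibrates on example text; a single placeholder sentence here.
examples = [tokenizer("AutoGPTQ calibrates the quantized weights on sample text like this.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False, damp_percent=0.1)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_id, quantize_config)
model.quantize(examples)                   # run GPTQ against the calibration examples
model.save_quantized(quantized_model_dir)  # write the quantized weights to disk
```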

To use this model, you need to install AutoGPTQ. For detailed installation instructions, please refer to the [AutoGPTQ GitHub repository](https://github.com/AutoGPTQ/AutoGPTQ).
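
In most environments a plain pip install is enough (check the repository for prebuilt wheels matching your CUDA version):

```bash
pip install auto-gptq
```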

# Example Usage

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

# Tokenizer comes from the original model; quantized weights from this repo.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-70B-Instruct-GPTQ")

# Tokenize a prompt, move it to the model's device, and decode the generation.
output = model.generate(**tokenizer("The capital of France is", return_tensors="pt").to(model.device))[0]
print(tokenizer.decode(output))
```
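
Since this is an instruction-tuned, conversational model, prompts generally work better when formatted with Llama 3's chat template. A minimal sketch, assuming the base model's tokenizer ships the template (the question and `max_new_tokens` value are illustrative):

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-70B-Instruct-GPTQ")

messages = [{"role": "user", "content": "What is the capital of France?"}]

# apply_chat_template wraps the conversation in Llama 3's special tokens
# and appends the assistant header so generation continues as the assistant.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=128)[0]
print(tokenizer.decode(output, skip_special_tokens=True))
```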