---
library_name: transformers
license: gemma
---
|
|
|
# Model Card for Gemma 2 27B (4-bit Quantized)
|
|
|
### Model Description
|
|
|
|
|
|
This is [google/gemma-2-27b](https://huggingface.co/google/gemma-2-27b) quantized to 4-bit with bitsandbytes (NF4 quantization, double quantization, bfloat16 compute), produced using the following code.
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_name = "google/gemma-2-27b"

# 4-bit NF4 quantization with nested (double) quantization;
# matrix computations run in bfloat16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quantization_config,
    device_map="auto",  # place layers across available devices automatically
)
```
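Once loaded, the quantized model can be used for inference like any other `transformers` causal LM. Below is a minimal sketch; the prompt and generation settings are illustrative assumptions, not part of the original card.

```python
# Minimal inference sketch with the 4-bit model loaded above.
# The prompt and max_new_tokens value are illustrative choices.
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

`model.get_memory_footprint()` can be used to check the actual memory usage; with NF4, the weights occupy roughly a quarter of their bfloat16 footprint.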