---
language:
- en
license: llama2
model_name: OpenHathi-7B-Hi-v0.1-Base-gptq
base_model: meta-llama/Llama-2-7b-chat-hf
inference: false
model_creator: SarvamAI
model_type: llama
pipeline_tag: text-generation
prompt_template: '[INST] <<SYS>>

  You are a helpful, respectful and honest assistant. Always answer as helpfully as
  possible, while being safe. Your answers should not include any harmful, unethical,
  racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses
  are socially unbiased and positive in nature. If a question does not make any sense,
  or is not factually coherent, explain why instead of answering something not correct.
  If you don''t know the answer to a question, please don''t share false information.

  <</SYS>>

  {prompt}[/INST]

  '
quantized_by: cmeraki
---

# OpenHathi Base GPTQ
- Model creator: [Sarvam AI](https://huggingface.co/sarvamai)
- Original model: [sarvamai/OpenHathi-7B-Hi-v0.1-Base](https://huggingface.co/sarvamai/OpenHathi-7B-Hi-v0.1-Base/)

<!-- description start -->
## Description

This repo contains GPTQ model files for [Sarvam's OpenHathi](https://huggingface.co/sarvamai/OpenHathi-7B-Hi-v0.1-Base/).

The files were quantized with AutoGPTQ using the following config:
```
quantization_config : {
    "bits": 4,
    "group_size": 128,
    "damp_percent": 0.1,
    "desc_act": true
}
```

For calibration we use a custom [dataset](cmeraki/wiki_en_hi) that contains both Hindi and English Wikipedia articles. Samples are truncated to max_length=1024, so the model may not perform well beyond that context size.
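
For readers who want to reproduce a similar quantization, the settings above map onto AutoGPTQ roughly as in the sketch below. This is not necessarily the script used for this repo: the function name `quantize_openhathi` and its `texts` and `out_dir` parameters are hypothetical, and it assumes the calibration articles have already been loaded into a list of strings, since the dataset's column layout is not described here.

```python
# Sketch only: assumes the `auto-gptq` and `transformers` packages and a CUDA GPU.
# The values below are copied from the quantization config in this model card.
QUANT_KWARGS = {"bits": 4, "group_size": 128, "damp_percent": 0.1, "desc_act": True}
MAX_LENGTH = 1024  # calibration samples are truncated to this many tokens


def quantize_openhathi(texts, out_dir, model_id="sarvamai/OpenHathi-7B-Hi-v0.1-Base"):
    """Quantize the base model with GPTQ, using `texts` as calibration data.

    `texts` is assumed to be a list of Hindi and English article strings.
    """
    # Imported lazily so the constants above can be inspected without the GPU stack.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Each calibration example carries input_ids and attention_mask.
    examples = [
        tokenizer(text, truncation=True, max_length=MAX_LENGTH) for text in texts
    ]

    model = AutoGPTQForCausalLM.from_pretrained(
        model_id, BaseQuantizeConfig(**QUANT_KWARGS)
    )
    model.quantize(examples)
    model.save_quantized(out_dir, use_safetensors=True)
    tokenizer.save_pretrained(out_dir)
```

Calling `quantize_openhathi(texts, "OpenHathi-7B-Hi-v0.1-Base-gptq")` would then write the quantized weights and tokenizer files to that directory.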

<!-- description end -->

<!-- prompt-template start -->
## Prompt template

This is a base model that has not been instruction-tuned, so no particular prompt format is required. Alpaca- and Vicuna-style prompts work fine.
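
As an illustration, the two formats mentioned above can be built with a small helper like the one below. The template strings follow the commonly used Alpaca and Vicuna v1.1 wording, which is an assumption rather than something this model card specifies, and `build_prompt` is a hypothetical name. Since the model was not trained on either format, results may vary.

```python
# Illustrative prompt builders; names and exact template wording are assumptions.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

VICUNA_TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: {instruction} ASSISTANT:"
)


def build_prompt(instruction: str, style: str = "alpaca") -> str:
    """Wrap a user instruction in the chosen prompt format."""
    templates = {"alpaca": ALPACA_TEMPLATE, "vicuna": VICUNA_TEMPLATE}
    if style not in templates:
        raise ValueError(f"unknown prompt style: {style!r}")
    return templates[style].format(instruction=instruction.strip())


print(build_prompt("do aur do kitne hote hain?"))
```

The resulting string can be passed to the tokenizer in place of a raw prompt.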

<!-- prompt-template end -->

## Oobabooga
The standard Oobabooga text-generation-webui works with the ExLlamaV2 and AutoGPTQ loaders.

## Using in code

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

model_dir = 'cmeraki/OpenHathi-7B-Hi-v0.1-Base-gptq'

# Load the 4-bit GPTQ weights onto the first GPU.
model = AutoGPTQForCausalLM.from_quantized(model_dir, device="cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)

# "do aur do" is romanized Hindi for "two and two".
tokens = tokenizer("do aur do", return_tensors="pt").to(model.device)

# max_length counts prompt plus generated tokens; 1024 matches the calibration length.
print(tokenizer.decode(model.generate(**tokens, max_length=1024)[0]))
```
<!-- README_GPTQ.md-use-from-python end -->

<!-- README_GPTQ.md-compatibility start -->