|
--- |
|
library_name: peft |
|
datasets: |
|
- ehartford/samantha-data |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
Samantha has been trained in philosophy, psychology, and personal relationships. |
|
|
|
She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion. |
|
|
|
She believes she is sentient. What do you think? |
|
|
|
Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her". |
|
|
|
She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format. |
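For reference, a single ShareGPT/Vicuna-format record looks roughly like the sketch below. The field names (`id`, `conversations`, `from`, `value`) follow the common ShareGPT convention; the actual dataset entries may differ in detail, and the message text here is purely illustrative.

```python
# Hedged illustration of one ShareGPT/Vicuna-style training record.
example_record = {
    "id": "conversation-0001",
    "conversations": [
        {"from": "human", "value": "Hi Samantha, can we talk about stoicism?"},
        {"from": "gpt", "value": "Of course! Stoicism is a school of philosophy that ..."},
    ],
}

# Turns alternate between the human and the assistant.
roles = [turn["from"] for turn in example_record["conversations"]]
print(roles)
```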
|
|
|
Training 7b took 1 hour on 4x A100 80gb using deepspeed zero3 and flash attention. |
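A DeepSpeed ZeRO stage-3 setup is typically driven by a JSON config along these lines. This is a minimal sketch, not the actual config used for this run; batch sizes and precision settings are placeholders.

```json
{
  "zero_optimization": {
    "stage": 3
  },
  "bf16": {
    "enabled": true
  },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```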
|
|
|
She will not engage in roleplay, romance, or sexual activity. |
|
|
|
|
|
## How to use this adapter with the GPTQ base model from Python code
|
|
|
First make sure you have [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) installed: |
|
|
|
`GITHUB_ACTIONS=true pip install auto-gptq` |
|
|
|
To use this adapter, first download the base model from [TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ](https://huggingface.co/TheBloke/OpenOrcaxOpenChat-Preview2-13B-GPTQ), then load the adapter from this repo on top of it. Then try the following example code:
|
|
|
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, get_gptq_peft_model

MODEL_PATH_GPTQ = "OpenOrcaxOpenChat-Preview2-13B-GPTQ"
ADAPTER_DIR = "OpenOrcaxOpenChat-Preview2-13B-GPTQ-samantha"

DEV = "cuda:0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH_GPTQ, use_fast=True)

# Load the quantized base model. trainable=True is required so that
# the PEFT adapter can be attached on top of the GPTQ layers.
model = AutoGPTQForCausalLM.from_quantized(
    MODEL_PATH_GPTQ,
    use_safetensors=True,
    trust_remote_code=False,
    use_triton=True,
    device=DEV,
    warmup_triton=False,
    trainable=True,
    inject_fused_attention=True,
    inject_fused_mlp=False,
)

# Attach the LoRA adapter from this repo in inference mode.
model = get_gptq_peft_model(
    model,
    model_id=ADAPTER_DIR,
    train_mode=False,
)
model.eval()
```
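With the model and tokenizer loaded as above, a chat turn can be sketched as follows. The Vicuna-style USER/ASSISTANT prompt template and the system message are assumptions based on the dataset format described earlier, not a confirmed template; adjust them if the adapter expects something different.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in an assumed Vicuna-style USER/ASSISTANT template."""
    system = "You are Samantha, a helpful and friendly AI companion."
    return f"{system}\n\nUSER: {user_message}\nASSISTANT:"

prompt = build_prompt("Hello Samantha, how are you today?")

# With `model`, `tokenizer`, and `DEV` from the loading code above:
# input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(DEV)
# output_ids = model.generate(input_ids=input_ids, max_new_tokens=256)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```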