metadata

library_name: peft
license: mit
language:
  - en
pipeline_tag: text-generation

AlpaGo: GPT-NeoX-20B Model Trained with the QLoRA Technique

AlpaGo is an adapter model trained with the QLoRA technique on top of the GPT-NeoX-20B base model and developed by Math And AI Institute. This repository contains the code and resources for AlpaGo, which can be used for natural language processing tasks.

Features

AlpaGo adapter model trained with the QLoRA technique
Based on the GPT-NeoX-20B model, providing high-quality natural language processing capabilities in English

Evaluation

Coming soon

Usage

You can use AlpaGo to perform natural language processing tasks. Here's an example of how to use it:

To try via Google Colab Free:

You can also run it on your own computer if you want, but be warned: it only works on GPUs with at least 15 GB of VRAM.
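
As a rough sanity check on the VRAM requirement, the storage needed for the base weights can be estimated from the parameter count and the number of bits per weight. The sketch below assumes roughly 20 billion parameters (taken from the model name) and exactly 4 bits per weight. It ignores quantization constants, the adapter weights, activations, and the KV cache, so real usage will be higher. The function name `weight_memory_gb` is illustrative and not part of any library.

```python
# Rough estimate of GPU memory for the base model weights only.
# Assumptions (not stated in this model card): ~20e9 parameters,
# exactly 4 bits per weight after quantization, 1 GB = 1e9 bytes.

def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Return approximate weight storage in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 20e9  # approximate parameter count of GPT-NeoX-20B

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # unquantized half precision
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gb:.0f} GB")
print(f"4-bit weights: ~{nf4_gb:.0f} GB")
```

Under these assumptions the 4-bit weights alone take about 10 GB, which leaves the remainder of a 15 GB budget for everything the estimate leaves out.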

import torch
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, GenerationConfig

# Base model that the adapter was trained on
model_id = "EleutherAI/gpt-neox-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the base weights in 4-bit NF4 with double quantization; compute runs in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

# Attach the AlpaGo adapter weights to the quantized base model
model = PeftModel.from_pretrained(model, "myzens/AlpaGo")

# You can change the instruction in the prompt here.
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Write a short story about a lost key that unlocks a mysterious door.
### Response:"""

inputs = tokenizer(PROMPT, return_tensors="pt")
input_ids = inputs["input_ids"].cuda()

generation_config = GenerationConfig(
    do_sample=True,  # temperature and top_p only take effect when sampling is enabled
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.15,
)

print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

for s in generation_output.sequences:
    print(tokenizer.decode(s))
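
The decoded sequence printed above contains the full prompt followed by the generated text. A minimal sketch of how the prompt could be built from an arbitrary instruction, and how the response could be separated from the decoded output, is shown below. The helper names `build_prompt` and `extract_response` are hypothetical and not part of AlpaGo, peft, or transformers. The template is copied from the example above, and the default end-of-text string `<|endoftext|>` is an assumption about the tokenizer.

```python
# Hypothetical helpers for the instruction prompt format used in the example above.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:"
)
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Fill the instruction into the prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(decoded: str, eos_token: str = "<|endoftext|>") -> str:
    """Return only the text after the response marker, cut at the first EOS token."""
    if RESPONSE_MARKER in decoded:
        # Keep everything after the first occurrence of the marker
        decoded = decoded.partition(RESPONSE_MARKER)[2]
    # Drop the EOS token and anything generated after it (assumed EOS string)
    return decoded.split(eos_token, 1)[0].strip()


# Usage with a stand-in for the decoded model output (no model required)
prompt = build_prompt("Write a short story about a lost key that unlocks a mysterious door.")
fake_decoded = prompt + " Once upon a time...<|endoftext|>"
print(extract_response(fake_decoded))
```

With the real model, `build_prompt(...)` would replace the hard-coded `PROMPT` string, and `extract_response(tokenizer.decode(s))` would replace the plain `tokenizer.decode(s)` call in the final loop.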