GPT-Grug-355m

GPT2-Medium fine-tuned on the 'grug' dataset. A demo is available here.

If you're interested, there's a smaller model available here: GPT-Grug-125m. Note, however, that it is very limited by comparison.

Training Procedure

This model was trained on the 'grug' dataset using the Happy Transformer library on Google Colab, for 4 epochs with a learning rate of 1e-2. The notebook used for training is included in this repo.
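For readers who want a rough idea of what such a run looks like without opening the notebook, here is a minimal sketch. It is not the notebook from this repo. The epoch count and learning rate are the ones quoted above; the training file name, the output directory, and the `finetune` helper are hypothetical placeholders, and the base checkpoint is assumed to be the `gpt2-medium` model from the Hugging Face Hub.

```python
# Settings quoted from this model card; everything else here is an assumption.
BASE_MODEL = "gpt2-medium"       # assumed Hub id for GPT2-Medium
TRAIN_FILE = "grug_train.txt"    # hypothetical plain-text export of the dataset
TRAIN_SETTINGS = {"learning_rate": 1e-2, "num_train_epochs": 4}


def finetune(train_file=TRAIN_FILE, output_dir="gpt-grug-355m"):
    """Fine-tune the base model on a text file and save the result."""
    # Imported lazily so the settings above can be read without the library installed.
    from happytransformer import HappyGeneration, GENTrainArgs

    happy_gen = HappyGeneration("GPT2", BASE_MODEL)
    args = GENTrainArgs(**TRAIN_SETTINGS)
    happy_gen.train(train_file, args=args)
    happy_gen.save(output_dir)   # hypothetical output directory
    return happy_gen


# Calling finetune() downloads the base model and starts training:
# finetune()
```

On Colab, a GPU runtime is likely needed for a model of this size.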

Biases & Limitations

This model likely has the same biases and limitations as the original GPT2 it is based on, plus heavy biases from the grug dataset.

Intended Use

This model is meant for fun; please do not take anything this caveman says seriously.

Sample Use

# Import model:
from happytransformer import HappyGeneration
happy_gen = HappyGeneration("GPT2", "DarwinAnim8or/gpt-grug-355m")

# Set generation settings:
from happytransformer import GENSettings
args_top_k = GENSettings(no_repeat_ngram_size=2, do_sample=True, top_k=50, temperature=0.7, max_length=50, early_stopping=False)

# Generate a response:
result = happy_gen.generate_text("""Person: "Hello grug"
Grug: "hello person"
###
Person: "how are you grug"
Grug: "grug doing ok. grug find many berry. good for tribe."
###
Person: "what does grug think of new spear weapon?"
Grug: "grug no like new spear weapon. grug stick bigger. spear too small, break easy"
###
Person: "what does grug think of football?"
Grug: \"""", args=args_top_k)

print(result)
print(result.text)
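The prompt above follows a fixed pattern: Person/Grug exchanges separated by `###`, ending with an open `Grug: "` line for the model to complete. The sketch below shows one way to build such a prompt and to trim the generated text to a single reply. `build_prompt` and `extract_reply` are hypothetical helpers, not part of happytransformer or this repo, and the trimming assumes that `result.text` holds only the continuation, not the prompt.

```python
# Hypothetical helpers for the prompt format used in the sample above.
SEPARATOR = "###"


def build_prompt(examples, question):
    """Format (person, grug) example pairs plus a new question as a few-shot prompt."""
    blocks = [f'Person: "{person}"\nGrug: "{grug}"' for person, grug in examples]
    # Leave the final Grug line open so the model completes the reply.
    blocks.append(f'Person: "{question}"\nGrug: "')
    return f"\n{SEPARATOR}\n".join(blocks)


def extract_reply(generated):
    """Keep only the first reply: cut at the next separator, then at the closing quote."""
    reply = generated.split(SEPARATOR)[0]
    reply = reply.split('"')[0]
    return reply.strip()


examples = [
    ("Hello grug", "hello person"),
    ("how are you grug", "grug doing ok. grug find many berry. good for tribe."),
]
prompt = build_prompt(examples, "what does grug think of football?")
print(prompt)
```

With the model loaded as in the sample, the prompt could be passed as `happy_gen.generate_text(prompt, args=args_top_k)` and the output cleaned up with `extract_reply(result.text)`.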