Czech Poetry TinyLlama

TinyLlama fine-tuned on Czech poetry from a GitHub project by the
Institute of Czech Literature, Czech Academy of Sciences:

https://github.com/versotym/corpusCzechVerse

Usage

Use it like any other causal language model in transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("jinymusim/TinyLlama-Czech-Poet")
model = AutoModelForCausalLM.from_pretrained("jinymusim/TinyLlama-Czech-Poet")

# Prompt: the beginning of a poem, here just the author tag
poet_start = '<|AUTHOR|> Adámek, Bohumil'
poet_start = poet_start.strip()
tokenized_poet_start = tokenizer.encode(poet_start, return_tensors='pt')

# Generate a continuation of the prompt
out = model.generate(tokenized_poet_start,
                     max_length=256,
                     do_sample=True,
                     top_k=50,
                     early_stopping=True,
                     pad_token_id=tokenizer.pad_token_id,
                     eos_token_id=tokenizer.eos_token_id)

# Decode the generated poem
decoded_cont = tokenizer.decode(out[0], skip_special_tokens=True)

print(decoded_cont)
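
The prompt above supplies only the author. Because outputs follow a tagged layout that starts with author, title and year, a longer prefix can presumably be assembled in the same order. The sketch below is illustrative only: `build_prompt` is a hypothetical helper, not part of the model repository, and both the newline separator and the idea that the model responds well to a given title or year are assumptions this card does not confirm.

```python
def build_prompt(author, title=None, year=None):
    """Assemble a tagged prompt prefix (hypothetical helper, not from the model repo)."""
    lines = [f"<|AUTHOR|> {author}"]
    if title is not None:
        lines.append(f"<|TITLE|> {title}")
        # The year comes after the title in the output layout,
        # so it is only appended when a title is present.
        if year is not None:
            lines.append(f"<|YEAR|> {year}")
    # Assumption: fields are separated by newlines, as in the layout shown in this card.
    return "\n".join(lines)


prompt = build_prompt("Adámek, Bohumil")
print(prompt)
```

The resulting string can be passed to `tokenizer.encode` in place of `poet_start`.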

Structure of outputs

Outputs are structured in the following way:

<|AUTHOR|> AUTHOR
<|TITLE|> TITLE
<|YEAR|> YEAR
<|STROPHE_START|>
<|METER|> METER
<|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_END|>
<|STROPHE_START|>
<|METER|> METER
<|RHYME|> RHYME SCHEMA
STROPHE
<|STROPHE_END|>
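
Since the layout is line-oriented, generated text can be split back into its fields with a small parser. The following is a minimal sketch, not code from the model repository: `parse_poem` and the dictionary keys are hypothetical, and it assumes the tags are still present in the decoded string (if they are registered as special tokens, decoding with `skip_special_tokens=False` may be needed). The sample text uses the placeholder values from the layout above rather than real model output.

```python
import re

TAG_LINE = re.compile(r"<\|([A-Z_]+)\|>\s*(.*)")


def parse_poem(text):
    """Parse tagged poem text into a dict (hypothetical helper)."""
    poem = {"author": None, "title": None, "year": None, "strophes": []}
    current = None  # strophe currently being collected, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG_LINE.match(line)
        if match is None:
            # Untagged lines inside a strophe are treated as verses.
            if current is not None:
                current["verses"].append(line)
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag in ("AUTHOR", "TITLE", "YEAR"):
            poem[tag.lower()] = value
        elif tag == "STROPHE_START":
            current = {"meter": None, "rhyme": None, "verses": []}
        elif tag in ("METER", "RHYME") and current is not None:
            current[tag.lower()] = value
        elif tag == "STROPHE_END" and current is not None:
            poem["strophes"].append(current)
            current = None
    # Keep a strophe that was cut off before its end tag, e.g. by max_length.
    if current is not None:
        poem["strophes"].append(current)
    return poem


sample = """<|AUTHOR|> AUTHOR
<|TITLE|> TITLE
<|YEAR|> YEAR
<|STROPHE_START|>
<|METER|> METER
<|RHYME|> RHYME SCHEMA
first verse
second verse
<|STROPHE_END|>"""

parsed = parse_poem(sample)
print(parsed["author"], len(parsed["strophes"]))
```

Keeping an unterminated strophe is a design choice: generation stops at `max_length`, so the last strophe will often lack its closing tag.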