jinymusim committed · Commit b2ce436 · verified · 1 Parent(s): 489b967

Update README.md

Files changed (1)
  1. README.md +50 -1
README.md CHANGED
@@ -1,3 +1,52 @@
  ---
- license: mit
+ license: cc-by-sa-4.0
+ language:
+ - cs
+ pipeline_tag: text-generation
+ widget:
+ - text: '# ABBA # 1900'
+   example_title: ABBA Rhyme Schema
+ - text: '# ABAB # 1920'
+   example_title: ABAB Rhyme Schema
+ - text: '# AABB # 1900'
+   example_title: AABB Rhyme Schema
+ - text: '# AABCCB # 1880'
+   example_title: AABCCB Rhyme Schema
  ---
+
+ ### Czech Poetry GPT
+ GPT-2 fine-tuned on Czech poetry from the GitHub project by the
+ Institute of Czech Literature, Czech Academy of Sciences.
+
+ https://github.com/versotym/corpusCzechVerse
+
+ ## Usage
+
+ Use it like any other GPT-2-style model:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("jinymusim/gpt-czech-poet")
+ model = AutoModelForCausalLM.from_pretrained("jinymusim/gpt-czech-poet")
+
+ # Start of the poem: header line, then the first letter of the first verse
+ poet_start = "# AABB # 1900\nD"
+ poet_start = poet_start.strip()
+ tokenized_poet_start = tokenizer.encode(poet_start, return_tensors='pt')
+
+ # Generate a continuation with beam search
+ out = model.generate(tokenized_poet_start,
+                      max_length=256,
+                      num_beams=8,
+                      no_repeat_ngram_size=2,
+                      early_stopping=True,
+                      pad_token_id=tokenizer.pad_token_id,
+                      eos_token_id=tokenizer.eos_token_id)
+
+ # Decode the generated poem
+ decoded_cont = tokenizer.decode(out[0], skip_special_tokens=True)
+
+ print(decoded_cont)
+ ```
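
The widget examples and the usage snippet all start from a header of the form `# <rhyme scheme> # <number>`, optionally followed by a newline and the first character(s) of the poem. The sketch below builds such a prompt programmatically. It is a minimal illustration, not part of the model's API: `build_prompt` is a hypothetical helper, the header format is inferred only from the examples in this README, and reading the number as a year is an assumption.

```python
import re


def build_prompt(rhyme_scheme: str, year: int, first_chars: str = "") -> str:
    """Build a prompt such as '# AABB # 1900' (hypothetical helper).

    The header layout is inferred from the README examples; treating the
    trailing number as a year is an assumption, not documented behavior.
    """
    scheme = rhyme_scheme.strip().upper()
    # Rhyme schemes in the examples are plain letter sequences (ABBA, AABCCB).
    if not re.fullmatch(r"[A-Z]+", scheme):
        raise ValueError(f"rhyme scheme must contain letters only, got {rhyme_scheme!r}")
    header = f"# {scheme} # {year}"
    # Optionally seed the first verse, as the usage snippet does with "D".
    return f"{header}\n{first_chars}" if first_chars else header


prompt = build_prompt("AABB", 1900, "D")
print(repr(prompt))
```

The resulting string can be passed to `tokenizer.encode(...)` in place of the hard-coded `poet_start` value in the usage snippet.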