---
language: tr
tags:
- turkish
- tr
- gpt2-tr
- gpt2-turkish
license: mit
metrics:
- accuracy
---
# Turkish GPT-2 Model (Experimental)
This is a GPT-2 model for Turkish that I trained on a variety of texts.
It is intended as a starting point for fine-tuning on other texts.
## Training Source
The training data is a Turkish corpus drawn from a variety of written and spoken sources.
I built a custom tokenizer with a 50k vocabulary from the training corpus using the Tokenizers library.
After building the vocabulary, I trained GPT-2 for Turkish on the entire training corpus for ten epochs.
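For readers unfamiliar with how such a vocabulary is learned, here is a toy sketch of byte-pair encoding (BPE), the scheme GPT-2 tokenizers normally use. It assumes this model's vocabulary was also built with BPE, which is not stated above. It works on characters rather than bytes and is not the Tokenizers library's implementation; the function name `learn_bpe_merges` and the sample words are made up for illustration.

``` python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn up to num_merges BPE merge rules from an iterable of words (toy version)."""
    # Each distinct word becomes a tuple of symbols, weighted by how often it occurs.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite each word, fusing every occurrence of the chosen pair into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

# Hypothetical mini-corpus; a real run would use the full corpus and ~50k merges.
sample = ["ev", "evler", "evlerde", "evde", "eller", "ellerde"]
print(learn_bpe_merges(sample, num_merges=5))
```

Each learned merge adds one entry to the vocabulary, so the target vocabulary size effectively sets how many merge steps are run.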
## Using the model
The model itself can be used in this way:
``` python
from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead for GPT-2 style models.
tokenizer = AutoTokenizer.from_pretrained("ahmet1338/gpt-2-experimental")
model = AutoModelForCausalLM.from_pretrained("ahmet1338/gpt-2-experimental")
```
To generate text, you can use the Transformers pipeline API:
``` python
from transformers import pipeline

pipe = pipeline("text-generation", model="ahmet1338/gpt-2-experimental",
                tokenizer="ahmet1338/gpt-2-experimental")
# max_length caps the total length (prompt plus generated tokens).
text = pipe("Akşamüstü yolda ilerlerken, ", max_length=800)[0]["generated_text"]
print(text)
```
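Because generation stops once the `max_length` limit is reached, the returned text often ends mid-sentence. A minimal post-processing sketch follows; the helper name `trim_to_last_sentence` is hypothetical and not part of Transformers, and it simply cuts at the last `.`, `!` or `?`.

``` python
def trim_to_last_sentence(text: str) -> str:
    """Cut text after its last '.', '!' or '?'; return it unchanged if none is found."""
    cut = max(text.rfind(ch) for ch in ".!?")
    return text if cut == -1 else text[: cut + 1]

# Made-up sample standing in for pipeline output.
sample = "Akşamüstü yolda ilerlerken, yağmur başladı. Sonra birden"
print(trim_to_last_sentence(sample))
```

This is deliberately naive: it will also cut at abbreviations or decimal points, so treat it as a starting point only.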
### How to clone the model repo?
```
git lfs install
git clone https://huggingface.co/ahmet1338/gpt-2-experimental
```
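If Git LFS is not set up correctly, the clone can succeed but leave small text pointer files in place of the large files. The sketch below is one way to check for that, relying on the fact that LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`. The function names are hypothetical, and the path you pass in is wherever you cloned the repo.

``` python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than real content."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX

def find_lfs_pointers(repo_dir):
    """List files under repo_dir (skipping .git) that are still LFS pointer stubs."""
    root = Path(repo_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )
```

Calling `find_lfs_pointers("path/to/cloned/repo")` should return an empty list after a healthy clone. If it lists any files, running `git lfs pull` inside the repo normally fetches the real content.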