---
language:
- uz
tags:
- Text Generation
- PyTorch
- TensorFlow
- Transformers
- mit
- uz
- gpt2
license: apache-2.0
widget:
- text: "Covid-19 га қарши эмлаш бошланди,"
  example_title: "Sample 1"
- text: "Суъний интеллект энг ривожланган"
  example_title: "Sample 2"
---

<p><b>GPTuz model.</b>

GPTuz is a state-of-the-art language model for the Uzbek language, based on the GPT-2 small model.

The model was trained for more than one day on an NVIDIA V100 32GB GPU with 0.53 GB of data from kun.uz, using transfer learning and fine-tuning.

<p><b>How to use</b>

<pre><code class="language-python">
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("rifkat/GPTuz")
model = AutoModelForCausalLM.from_pretrained("rifkat/GPTuz")

# GPT-2 accepts at most 1024 tokens of context
tokenizer.model_max_length = 1024
</code></pre>

<p><b>Generate one word</b>

<pre><code class="language-python">
text = "Covid-19 га қарши эмлаш бошланди,"
inputs = tokenizer(text, return_tensors="pt")

# Forward pass without gradient tracking; passing labels also returns the loss
with torch.no_grad():
    outputs = model(**inputs, labels=inputs["input_ids"])
loss, logits = outputs[:2]

# Take the most likely token at the last position and decode it
predicted_index = torch.argmax(logits[0, -1, :]).item()
predicted_text = tokenizer.decode([predicted_index])

print('input text:', text)
print('predicted text:', predicted_text)
</code></pre>
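
The argmax step can be illustrated without downloading the model. The following is a minimal sketch, assuming a toy five-word vocabulary and made-up logits (`TOY_VOCAB` and `TOY_LOGITS` are hypothetical and not taken from GPTuz). It turns the scores of the last position into probabilities with a softmax and picks the highest-scoring token, which mirrors what `torch.argmax(logits[0, -1, :])` does on the real model output.

```python
import math

# Hypothetical toy vocabulary and last-position logits (not real GPTuz values).
TOY_VOCAB = ["va", "bilan", "uchun", "ham", "."]
TOY_LOGITS = [1.2, 0.3, 2.5, -0.7, 0.1]


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits, vocab):
    """Return the highest-scoring token and its probability."""
    probs = softmax(logits)
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best_index], probs[best_index]


token, prob = greedy_next_token(TOY_LOGITS, TOY_VOCAB)
print(f"predicted token: {token!r} (p={prob:.2f})")
```

Because the softmax preserves the ordering of the logits, taking the argmax of the raw logits, as the snippet for the real model does, selects the same token as taking the argmax of the probabilities.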

<p><b>Generate one full sequence</b>

<pre><code class="language-python">
text = "Covid-19 га қарши эмлаш бошланди, "
inputs = tokenizer(text, return_tensors="pt")

sample_outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    pad_token_id=50256,  # GPT-2 end-of-text token id, reused for padding
    do_sample=True,
    max_length=50,  # maximum total length in tokens, prompt included; adjust as needed
    top_k=40,
    num_return_sequences=1,
)

for i, sample_output in enumerate(sample_outputs):
    print(">> Generated text {}\n\n{}".format(i + 1, tokenizer.decode(sample_output.tolist())))
</code></pre>
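
With `do_sample=True` and `top_k=40`, each generation step samples the next token from the 40 highest-scoring candidates only. The sketch below illustrates that idea in plain Python on made-up logits; the helper `top_k_sample` and the values in `toy_logits` are hypothetical and are not part of the `transformers` API or of GPTuz.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one index from the k highest-scoring entries of `logits`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, len(logits))  # cannot keep more candidates than exist
    # Indices of the k largest logits; everything else is discarded.
    top_indices = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Unnormalized softmax weights over the survivors (shifted by the max for stability).
    peak = logits[top_indices[0]]
    weights = [math.exp(logits[i] - peak) for i in top_indices]
    # random.Random.choices normalizes the weights internally.
    return rng.choices(top_indices, weights=weights, k=1)[0]


# Hypothetical logits for a six-token vocabulary (not real GPTuz values).
toy_logits = [2.0, 0.5, 1.5, -1.0, 0.0, 3.0]
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [top_k_sample(toy_logits, k=3, rng=rng) for _ in range(200)]
print(sorted(set(samples)))  # only indices among the three largest logits can appear
```

A smaller `k` makes the output more predictable, and `k=1` reduces to greedy decoding; a larger `k` allows more varied continuations.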

<pre><code class="language-python">
@misc {rifkat_davronov_2022,
    author       = { {Adilova Fatima, Rifkat Davronov, Samariddin Kushmuratov, Ruzmat Safarov} },
    title        = { GPTuz (Revision 2a7e6c0) },
    year         = 2022,
    url          = { https://huggingface.co/rifkat/GPTuz },
    doi          = { 10.57967/hf/0143 },
    publisher    = { Hugging Face }
}
</code></pre>