---
license: mit
datasets:
- wikimedia/wikipedia
language:
- ka
- en
pipeline_tag: text-generation
---
# GPT-2Geo: Georgian Language Model 🇬🇪
> ⚠️ This GPT-2Geo model is not fully trained due to hardware constraints. It was trained on a subset of 1,000 samples for 20 epochs, so its capabilities and performance should be judged within these limits. Future iterations may benefit from longer training on larger datasets. Please keep these constraints in mind when using the model.
## Overview
GPT-2Geo is a language model for the Georgian language, built on OpenAI's GPT-2 architecture. It is designed for natural language processing tasks such as text generation and understanding. [Github (training script)](https://github.com/Kuduxaaa/gpt2-geo)
## Features
- **Georgian Language Model:** Specifically trained to understand and generate text in the Georgian language.
- **GPT-2 Architecture:** Built upon OpenAI's GPT-2, providing a versatile and efficient language model.
- **Easy Integration:** Seamless integration with the Hugging Face Transformers library.
## Training Information
### Environment:
- **GPU:** Nvidia T4 (15GB)
- **Model Memory Requirement:** Minimum 13.5GB
### Training Configuration:
- **Number of Epochs:** 20
- **Time Consumed:** 49 minutes
![#loss](https://i.ibb.co/HhFFX6Z/loss-stat.png)
### Training Progress:
GPT-2Geo was trained on an **Nvidia T4 GPU** with **15GB** of memory, which covers the model's minimum memory requirement of **13.5GB**.
The training configuration used **20 epochs**, and the whole run took approximately **49 minutes**.
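As a rough sanity check on these figures, the arithmetic below derives the average time per epoch and the implied throughput. It assumes each epoch is one full pass over the 1,000-sample subset mentioned in the notice at the top of this card; the variable names are illustrative and not taken from the training script.

```python
# Back-of-the-envelope figures derived from the numbers reported in this card.
# Assumption: one epoch is one full pass over the 1,000-sample subset.
num_samples = 1000      # size of the training subset
num_epochs = 20         # reported number of epochs
total_minutes = 49      # reported wall-clock training time

total_seconds = total_minutes * 60
seconds_per_epoch = total_seconds / num_epochs
samples_per_second = (num_samples * num_epochs) / total_seconds

print(f"Average time per epoch: {seconds_per_epoch:.0f} s")
print(f"Approximate throughput: {samples_per_second:.1f} samples/s")
```

These are averages over the whole run, so they include any evaluation or logging overhead that happened during training.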
For details on the model's performance, refer to the training logs, which record key metrics such as validation loss over epochs.
Make sure your GPU environment is configured correctly before training so the available hardware is fully used. **The text data must be preprocessed before training starts; this preprocessing step will be added in the future.**
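One way to check the GPU environment before training is sketched below. This is not part of the training script: `has_enough_memory` and `REQUIRED_GB` are hypothetical names, and the sketch assumes the 13.5GB requirement is meant in decimal gigabytes. It relies only on `torch.cuda.is_available()` and `torch.cuda.get_device_properties()`.

```python
# Minimal sketch: check that a visible CUDA GPU covers the memory requirement.
# Assumption: the 13.5GB figure is read as decimal gigabytes (1 GB = 1e9 bytes).
REQUIRED_GB = 13.5


def has_enough_memory(total_bytes, required_gb=REQUIRED_GB):
    """Return True if total_bytes is at least required_gb decimal gigabytes."""
    return total_bytes / 1e9 >= required_gb


try:
    import torch
except ImportError:  # PyTorch is optional for this check
    torch = None

if torch is not None and torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)  # first visible GPU
    status = "sufficient" if has_enough_memory(props.total_memory) else "insufficient"
    print(f"{props.name}: {props.total_memory / 1e9:.1f} GB ({status})")
else:
    print("No CUDA GPU detected.")
```

Total device memory is only an upper bound; other processes on the same GPU reduce what is actually free.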
## Example Usage
```python
import torch
from transformers import GPT2LMHeadModel, ElectraTokenizerFast

# Load the model and its tokenizer from the Hugging Face Hub
model_name = 'Kuduxaaa/gpt2-geo'
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = ElectraTokenizerFast.from_pretrained(model_name)

# Run on the GPU when one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

# Georgian prompt ("In Georgian mythology")
prompt = 'ქართულ მითოლოგიაში '
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

# Beam search; top_k, top_p and temperature only take effect
# when do_sample=True is also passed
output = model.generate(
    input_ids,
    max_length=100,
    num_beams=5,
    no_repeat_ngram_size=2,
    top_k=50,
    top_p=0.95,
    temperature=0.7
)

result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
# Example output starts with the prompt: ქართულ მითოლოგიაში, ...
```
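The `no_repeat_ngram_size` argument in the example above (set to 2) asks `generate` to let each token bigram occur only once in the output. The helper below is a small illustration of that constraint, not part of the Transformers API: `has_repeated_ngram` is a hypothetical name, and it works on plain Python lists of token IDs so it can be tried without loading the model.

```python
def has_repeated_ngram(token_ids, n=2):
    """Return True if any n-gram of token IDs occurs more than once."""
    seen = set()
    for i in range(len(token_ids) - n + 1):
        ngram = tuple(token_ids[i:i + n])
        if ngram in seen:
            return True
        seen.add(ngram)
    return False


# The token IDs below are made up for illustration.
print(has_repeated_ngram([5, 8, 3, 5, 8]))  # bigram (5, 8) appears twice
print(has_repeated_ngram([5, 8, 3, 8, 5]))  # every bigram is distinct
```

A sequence generated with `no_repeat_ngram_size=2` should make this check return `False` when it is applied to the output token IDs.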
## Acknowledgments
This project is made possible by the contributions of Nika Kudukashvili and the open-source community. Special thanks to OpenAI for the GPT-2 architecture and to `jnz/electra-ka` for the Georgian tokenizer.