# Llama-2-7b-Ukrainian

## Model Details

### Model Description
Llama-2-7b-Ukrainian is a bilingual pre-trained model supporting Ukrainian and English. It was created by continued pre-training of Llama-2-7b on 5B tokens, consisting of 75% Ukrainian and 25% English documents from CulturaX.
**Paper:** [To Err Is Human, but Llamas Can Learn It Too](https://arxiv.org/abs/2403.05493)
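A minimal loading-and-generation sketch with Hugging Face `transformers` is shown below. The repo id `tartuNLP/Llama-2-7b-Ukrainian` is an assumption for illustration; substitute the actual repository name of this model.

```python
# Minimal usage sketch. The repo id below is an assumption; replace it with
# the actual Hugging Face repository id of this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tartuNLP/Llama-2-7b-Ukrainian"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# This is a plain pre-trained (base) model, so prompt it with raw text
# rather than a chat template.
prompt = "Столицею України є місто"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```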
### Training Hyperparameters
| Hyperparameter | Value |
|---|---|
| Training steps | 19,080 |
| Batch size | 256 |
| Weight decay | 0.1 |
| Context length | 1024 tokens |
| Learning rate | 2e-5, linear decay to 2e-6 |
| Precision | bf16 |
| Optimizer | AdamW |
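For concreteness, here is a sketch of the optimizer and learning-rate schedule implied by the table. This is an illustration, not the authors' training code; `model` is a stand-in module, and the decay-to-2e-6 schedule is expressed with a `LambdaLR` since stock linear schedules decay to zero.

```python
# Sketch of the optimizer/schedule from the table above; illustrative only,
# not the authors' actual training script.
import torch

total_steps = 19_080          # training steps from the table
peak_lr, final_lr = 2e-5, 2e-6

model = torch.nn.Linear(8, 8)  # stand-in for the 7B model

# AdamW with weight decay 0.1, as in the table.
optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr, weight_decay=0.1)

# Linear decay from 2e-5 down to 2e-6 over the full run
# (not to zero, which is what the stock linear schedule would do).
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lambda step: 1.0 - (1.0 - final_lr / peak_lr) * min(step, total_steps) / total_steps,
)
```

Each scheduler step here corresponds to one optimizer update, i.e. one batch of 256 sequences of up to 1024 tokens.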
## Citation

**BibTeX:**
@article{luhtaru2024err,
  title={To Err Is Human, but Llamas Can Learn It Too},
  author={Luhtaru, Agnes and Purason, Taido and Vainikko, Martin and Del, Maksym and Fishel, Mark},
  journal={arXiv preprint arXiv:2403.05493},
  year={2024}
}