---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
license: llama3.1
language:
- gl
metrics:
- bleu
- rouge
model-index:
- name: Llama-3.1-8B-Instruct-Galician
  results:
  - task:
      type: text-generation
    dataset:
      name: alpaca_data_galician
      type: alpaca_data_galician
    metrics:
    - name: bleu
      type: bleu-4
      value: 23.13
    - name: rouge
      type: rouge-l
      value: 21.84
pipeline_tag: text-generation
library_name: transformers
---

# Llama-3.1-8B-Instruct-Galician

This model is the result of continued pretraining of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the [CorpusNós](https://zenodo.org/records/11655219) dataset.

## Model Details

### Model Description

- **Developed by:** [UDC Information Retrieval Lab (IRLab)](https://huggingface.co/irlab-udc)
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** Multilingual, adapted to Galician
- **License:** llama3.1
- **Finetuned from model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)

### Model Sources

- **Repository:** [Adapting Large Language Models for Underrepresented Languages](https://gitlab.irlab.org/eliseo.bao/xovetic-llms-underrepresented-languages)
- **Paper:** _Coming soon_

## How to Get Started with the Model

A minimal usage sketch is provided under "Usage Sketch" at the end of this card.

## Training Details

[More Information Needed]

### Training Data

[More Information Needed]

### Training Procedure

[More Information Needed]

#### Training Hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0

#### Training Results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.0606        | 0.1682 | 900  | 2.0613          |
| 1.9898        | 0.3363 | 1800 | 1.9929          |
| 1.9847        | 0.5045 | 2700 | 1.9613          |
| 1.9577        | 0.6726 | 3600 | 1.9445          |
| 1.9287        | 0.8408 | 4500 | 1.9368          |

## Environmental Impact

Carbon emissions were estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** 4x NVIDIA A100 SXM4 80 GB (TDP of 400 W)
- **Hours used:** 60
- **Cloud Provider:** Private infrastructure
- **Carbon Emitted:** 10.37 kg CO₂eq

#### Software

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1

## Citation

**BibTeX:** _Coming soon_
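
## Usage Sketch

The original card leaves the getting-started section as a placeholder, so the snippet below is a minimal inference sketch with 🤗 Transformers, the library declared in this card's metadata. The repository id `irlab-udc/Llama-3.1-8B-Instruct-Galician` is an assumption inferred from the developer organization and model name, and the prompt and generation settings are illustrative, not recommendations from the authors.

```python
# Minimal sketch, assuming the checkpoint is published under the IRLab org.
import torch
import transformers

model_id = "irlab-udc/Llama-3.1-8B-Instruct-Galician"  # assumed repository id

# Llama 3.1 Instruct checkpoints ship a chat template with the tokenizer,
# so the text-generation pipeline can consume chat messages directly.
pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [{"role": "user", "content": "Cal é a capital de Galicia?"}]
outputs = pipe(messages, max_new_tokens=128)

# The pipeline returns the whole conversation; the last message is the reply.
print(outputs[0]["generated_text"][-1]["content"])
```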
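
## Training Configuration Sketch

As a sketch of how the values listed under "Training Hyperparameters" map onto 🤗 `TrainingArguments`: only the listed values are taken from the card; the output directory is hypothetical, and the dataset/model wiring and precision settings are not specified by the authors.

```python
# Sketch: the reported hyperparameters expressed as 🤗 TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-galician",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=32,   # train_batch_size
    per_device_eval_batch_size=1,     # eval_batch_size
    gradient_accumulation_steps=2,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    # The default AdamW optimizer already uses betas=(0.9, 0.999) and
    # epsilon=1e-08, matching the card. With the 4 reported GPUs, the
    # effective batch size is 32 * 4 * 2 = 256 (total_train_batch_size).
)
```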
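
## Evaluation Sketch

The BLEU-4 and ROUGE-L scores in the metadata can be computed with the 🤗 `evaluate` library: `evaluate.load("bleu")` uses `max_order=4` by default (i.e., BLEU-4), and the `rouge` metric reports `rougeL`. The prediction and reference lists below are placeholders, since the card does not describe the exact evaluation pipeline for `alpaca_data_galician`.

```python
# Sketch: scoring model outputs with BLEU-4 and ROUGE-L via 🤗 evaluate.
import evaluate

bleu = evaluate.load("bleu")    # BLEU with max_order=4 by default
rouge = evaluate.load("rouge")  # reports rouge1, rouge2, rougeL, rougeLsum

predictions = ["A capital de Galicia é Santiago de Compostela."]  # placeholder
references = ["Santiago de Compostela é a capital de Galicia."]   # placeholder

print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
```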
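
## Checking the Emissions Estimate

A back-of-the-envelope check of the reported figure under the TDP-based accounting used by the ML Impact calculator. The grid carbon intensity is derived here from the card's numbers, not stated by the authors.

```python
# Back-of-the-envelope check of the reported 10.37 kgCO2eq figure.
num_gpus, tdp_kw, hours = 4, 0.400, 60    # values from the card
energy_kwh = num_gpus * tdp_kw * hours    # 96 kWh at full TDP
implied_intensity = 10.37 / energy_kwh    # ~0.108 kgCO2eq/kWh (derived)
print(f"{energy_kwh:.0f} kWh -> ~{implied_intensity:.3f} kgCO2eq/kWh implied")
```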