This model is a continued pretraining version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the [CorpusNós](https://zenodo.org/records/11655219) dataset.

## Model Description

- **Developed by:** [UDC Information Retrieval Lab (IRLab)](https://huggingface.co/irlab-udc)
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** Multilingual, adapted to Galician
- **License:** llama3.1
- **Finetuned from model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- **Repository:** [Adapting Large Language Models for Underrepresented Languages](https://gitlab.irlab.org/eliseo.bao/xovetic-llms-underrepresented-languages)
- **Paper:** _Coming soon_

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
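
Until this section is filled in, the snippet below is a minimal sketch of the standard `transformers` loading path. The repository id is a placeholder (substitute the actual Hub id of this model), and the prompt and generation settings are illustrative only:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "irlab-udc/<model-id>"  # placeholder: replace with the actual Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Galician prompt: "What is the capital of Galicia?"
prompt = "Cal é a capital de Galicia?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
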
#### Training Hyperparameters

| Parameter                   | Value                                       |
|-----------------------------|---------------------------------------------|
| learning_rate               | 0.0001                                      |
| train_batch_size            | 32                                          |
| eval_batch_size             | 1                                           |
| seed                        | 42                                          |
| distributed_type            | multi-GPU                                   |
| num_devices                 | 4                                           |
| gradient_accumulation_steps | 2                                           |
| total_train_batch_size      | 256                                         |
| total_eval_batch_size       | 4                                           |
| optimizer                   | Adam with betas=(0.9, 0.999), epsilon=1e-08 |
| lr_scheduler_type           | cosine                                      |
| lr_scheduler_warmup_ratio   | 0.1                                         |
| num_epochs                  | 1.0                                         |

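
These values map directly onto `transformers.TrainingArguments`. The sketch below shows that mapping, assuming the run used the standard `Trainer` stack (the output directory is a placeholder); with 4 GPUs, a per-device batch of 32 and 2 accumulation steps yield the effective batch of 32 × 4 × 2 = 256 reported above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=32,  # train_batch_size (per device)
    per_device_eval_batch_size=1,    # eval_batch_size (per device)
    seed=42,
    gradient_accumulation_steps=2,   # 32 per device x 4 GPUs x 2 steps = 256 total
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # lr_scheduler_warmup_ratio
    num_train_epochs=1.0,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer setting.
)
```
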
#### Training Results

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Cloud Provider:** Private infrastructure
- **Carbon Emitted:** 10.37 kg CO₂ eq
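
The figure above is the card's reported estimate. For new runs, emissions can also be measured directly during training with the `codecarbon` package; this is a minimal sketch of that approach, not necessarily how the number above was produced:

```python
from codecarbon import EmissionsTracker

def train():
    """Placeholder for the actual training loop."""

tracker = EmissionsTracker()       # samples power draw while tracking is active
tracker.start()
try:
    train()
finally:
    emissions_kg = tracker.stop()  # estimated emissions in kg CO2-eq

print(f"Estimated emissions: {emissions_kg:.2f} kg CO2 eq")
```
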
## Citation

_Coming soon_