---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
license: llama3.1
language:
- gl
metrics:
- bleu
- rouge
model-index:
- name: Llama-3.1-8B-Instruct-Galician
results:
- task:
type: text-generation
dataset:
name: alpaca_data_galician
type: alpaca_data_galician
metrics:
- name: bleu
type: bleu-4
value: 23.13
- name: rouge
type: rouge-l
value: 21.84
pipeline_tag: text-generation
---
Llama-3.1-8B-Instruct-Galician
This model is a continued-pretraining version of meta-llama/Llama-3.1-8B-Instruct, further trained on the CorpusNós dataset to adapt it to Galician.
Model Details
Model Description
- Developed by: UDC Information Retrieval Lab (IRLab)
- Model type: Causal decoder-only transformer language model (Llama 3.1 architecture)
- Language(s) (NLP): Multilingual, adapted to Galician
- License: llama3.1
- Finetuned from model: meta-llama/Llama-3.1-8B-Instruct
Model Sources
- Repository: Adapting Large Language Models for Underrepresented Languages
- Paper: Coming soon
Uses
Direct Use
[More Information Needed]
Downstream Use
[More Information Needed]
Out-of-Scope Use
Use in any manner that violates applicable laws or regulations (including trade compliance laws), or in any other way that is prohibited by the Acceptable Use Policy and the Llama 3.1 Community License.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
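A minimal inference sketch with Hugging Face Transformers follows. The hub repo id below is an assumption (adjust it to wherever this checkpoint is actually published), and the prompt is only an illustration:

```python
# Minimal inference sketch (assumption: the repo id below is hypothetical;
# replace it with the actual hub path of this checkpoint).
# pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "irlab-udc/Llama-3.1-8B-Instruct-Galician"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 8B model on one GPU
    device_map="auto",
)

# Llama 3.1 Instruct models use a chat template; apply_chat_template
# builds the correctly formatted prompt from a message list.
messages = [{"role": "user", "content": "Cal é a capital de Galicia?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```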
Training Details
Training Data
The model was continually pretrained on the CorpusNós dataset, a large Galician corpus.
Training Procedure
Preprocessing
[More Information Needed]
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
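For reference, here is a sketch of how the values above map onto a Hugging Face `TrainingArguments` configuration. The argument names are standard Transformers options; the output path is hypothetical and this is not the authors' training script:

```python
# Sketch: the reported hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-galician",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,  # 32 per device x 4 GPUs x 2 = 256 total
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,  # assumption: mixed precision on the A100s listed below
)
```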
Training results
| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 2.0606        | 0.1682 | 900  | 2.0613          |
| 1.9898        | 0.3363 | 1800 | 1.9929          |
| 1.9847        | 0.5045 | 2700 | 1.9613          |
| 1.9577        | 0.6726 | 3600 | 1.9445          |
| 1.9287        | 0.8408 | 4500 | 1.9368          |
Speeds, Sizes, Times
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was evaluated on the alpaca_data_galician dataset (see the model-index metadata above).
Factors
[More Information Needed]
Metrics
BLEU-4 and ROUGE-L (see the model-index metadata above).
Results
- BLEU-4: 23.13
- ROUGE-L: 21.84
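A sketch of computing these metric types with the Hugging Face `evaluate` library follows. It assumes `predictions` and `references` are lists of generated and gold Galician responses; it reproduces the reported metric types, not the authors' exact evaluation harness:

```python
# Sketch: BLEU-4 and ROUGE-L via the `evaluate` library (not the
# authors' harness). The example strings below are placeholders.
import evaluate

bleu = evaluate.load("bleu")    # BLEU-4 by default (max_order=4)
rouge = evaluate.load("rouge")

predictions = ["A capital de Galicia é Santiago de Compostela."]
references = ["A capital de Galicia é Santiago de Compostela."]

print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions,
                    references=references)["rougeL"])
```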
Summary
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: 4x NVIDIA A100 SXM4 80 GB (TDP of 400W)
- Hours used: 60
- Cloud Provider: Private infrastructure
- Carbon Emitted: 10.37 kg CO₂eq
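The reported figure is consistent with the usual power × time × carbon-intensity estimate used by the calculator. A sketch follows; the grid carbon intensity is an assumption backed out of the reported numbers, not a value stated in this card, and "Hours used" is assumed to be wall-clock time across all four GPUs:

```python
# Standard estimate behind figures like the one above:
# emissions ≈ num_gpus * TDP * hours * grid_carbon_intensity.
num_gpus = 4
tdp_kw = 0.400   # A100 SXM4 TDP, in kW
hours = 60       # assumed wall-clock hours across the 4-GPU node

energy_kwh = num_gpus * tdp_kw * hours        # 96 kWh
carbon_intensity = 0.108                      # kg CO2eq/kWh (assumed; implied
                                              # by the reported 10.37 kg)
emissions_kg = energy_kwh * carbon_intensity  # ≈ 10.37 kg CO2eq
print(f"{energy_kwh} kWh -> {emissions_kg:.2f} kg CO2eq")
```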
Technical Specifications
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
Citation
BibTeX:
[More Information Needed]