---
base_model:
  - meta-llama/Llama-3.1-8B-Instruct
license: llama3.1
language:
  - gl
metrics:
  - bleu
  - rouge
model-index:
  - name: Llama-3.1-8B-Instruct-Galician
    results:
      - task:
          type: text-generation
        dataset:
          name: alpaca_data_galician
          type: alpaca_data_galician
        metrics:
          - name: bleu
            type: bleu-4
            value: 23.13
          - name: rouge
            type: rouge-l
            value: 21.84
pipeline_tag: text-generation
---

# Llama-3.1-8B-Instruct-Galician

This model was obtained by continued pretraining of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the CorpusNós dataset.

## Model Details

### Model Description

### Model Sources

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

Use in any manner that violates applicable laws or regulations (including trade compliance laws), or in any other way prohibited by the Llama 3.1 Acceptable Use Policy and Community License.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
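
An official snippet has not been added yet. Below is a minimal, untested sketch using the Transformers chat `pipeline`; the repository id is a placeholder, and `torch_dtype=torch.bfloat16` / `device_map="auto"` (which requires `accelerate`) are assumptions about the runtime environment.

```python
# Minimal sketch (untested): generate Galician text with the Transformers chat pipeline.
# The repository id below is a placeholder; substitute the actual model id.
import torch
from transformers import pipeline

model_id = "<org>/Llama-3.1-8B-Instruct-Galician"  # placeholder repository id

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # requires the accelerate package
)

messages = [
    {"role": "user", "content": "Cal é a capital de Galicia?"},
]
output = generator(messages, max_new_tokens=128)

# The chat pipeline returns the full conversation; the last turn is the reply.
print(output[0]["generated_text"][-1]["content"])
```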

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

The following hyperparameters were used during training (a Transformers `TrainingArguments` sketch mirroring these values follows the list):

- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
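
A minimal sketch of how these values map onto Transformers `TrainingArguments`, assuming the standard `Trainer` stack; the output directory is a placeholder, and precision or PEFT/LoRA settings are not documented in this card and are therefore omitted.

```python
# Sketch only: maps the reported hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-galician",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=32,   # train_batch_size
    per_device_eval_batch_size=1,     # eval_batch_size
    gradient_accumulation_steps=2,    # 32 x 2 x 4 GPUs = 256 effective
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```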

#### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.0606        | 0.1682 | 900  | 2.0613          |
| 1.9898        | 0.3363 | 1800 | 1.9929          |
| 1.9847        | 0.5045 | 2700 | 1.9613          |
| 1.9577        | 0.6726 | 3600 | 1.9445          |
| 1.9287        | 0.8408 | 4500 | 1.9368          |

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]
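
The model-index metadata reports BLEU-4 = 23.13 and ROUGE-L = 21.84 on alpaca_data_galician, but the exact evaluation protocol is not documented here. A rough sketch of how such scores could be computed with the `evaluate` library follows; the `predictions`/`references` lists are placeholders, and the `rouge` metric additionally requires the `rouge_score` package.

```python
# Sketch only: BLEU-4 and ROUGE-L via the `evaluate` library.
# `predictions` and `references` are placeholders; the card does not document
# the split, decoding settings, or tokenization used for the reported scores.
import evaluate

predictions = ["A capital de Galicia é Santiago de Compostela."]
references = ["A capital de Galicia é Santiago de Compostela."]

bleu = evaluate.load("bleu")    # max_order defaults to 4, i.e. BLEU-4
rouge = evaluate.load("rouge")  # reports rougeL among other variants

print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions,
                    references=references)["rougeL"])
```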

### Results

[More Information Needed]

#### Summary

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- Hardware Type: 4× NVIDIA A100 SXM4 80 GB (TDP of 400 W)
- Hours used: 60
- Cloud Provider: Private infrastructure
- Carbon Emitted: 10.37 kg CO₂eq
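
As a rough cross-check only (it assumes the 60 hours are wall-clock time for the whole 4-GPU job and that every GPU draws its full 400 W TDP, which overestimates real consumption):

```python
# Rough cross-check of the reported figures under the assumptions stated above.
gpus = 4
tdp_kw = 0.400           # 400 W TDP per GPU
hours = 60               # assumed wall-clock hours for the 4-GPU job
emitted_kg_co2eq = 10.37

energy_kwh = gpus * tdp_kw * hours         # 96 kWh upper bound
intensity = emitted_kg_co2eq / energy_kwh  # ~0.108 kg CO2eq per kWh implied
print(f"{energy_kwh:.0f} kWh, {intensity:.3f} kg CO2eq/kWh implied")
```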

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1

## Citation

BibTeX:

[More Information Needed]