---
library_name: transformers
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-360M
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: SmolLM2-360M-TemporalQuestions
    results: []
---

# SmolLM2-360M-TemporalQuestions

This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.0257
- F1: 0.9846
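
The card does not document the task or prompt format, but since the base model is a causal language model, a minimal inference sketch might look like the following. The repo id (`hugosousa/SmolLM2-360M-TemporalQuestions`) and the question format are assumptions, not details confirmed by this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; adjust if the checkpoint lives under a different namespace.
model_id = "hugosousa/SmolLM2-360M-TemporalQuestions"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder prompt: the exact question format used during fine-tuning
# is not documented in this card.
prompt = "Question: Did World War II end before the first Moon landing?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```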

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 32
- total_train_batch_size: 1024
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 30
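
For reference, here is a hedged reconstruction of these settings as `transformers` `TrainingArguments`. The actual training script is not included in this card, so the argument names are mapped by convention; the 4-device multi-GPU setup would come from the launcher (e.g. `torchrun` or `accelerate launch`), not from these arguments:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameters above onto TrainingArguments.
# Effective train batch size: 8 (per device) * 4 (GPUs) * 32 (accumulation) = 1024.
args = TrainingArguments(
    output_dir="SmolLM2-360M-TemporalQuestions",
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=32,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.05,
    num_train_epochs=30,
)
```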

### Training results

| Training Loss | Epoch   | Step | Validation Loss | F1     |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 0.086         | 1.0     | 223  | 0.0629          | 0.9514 |
| 0.1263        | 2.0     | 446  | 0.0466          | 0.9647 |
| 0.0172        | 3.0     | 669  | 0.0351          | 0.9745 |
| 0.0729        | 4.0     | 892  | 0.0319          | 0.9770 |
| 0.0254        | 5.0     | 1115 | 0.0320          | 0.9788 |
| 0.0258        | 6.0     | 1338 | 0.0288          | 0.9798 |
| 0.017         | 7.0     | 1561 | 0.0302          | 0.9812 |
| 0.0278        | 8.0     | 1784 | 0.0302          | 0.9807 |
| 0.0105        | 9.0     | 2007 | 0.0338          | 0.9797 |
| 0.0503        | 10.0    | 2230 | 0.0297          | 0.9808 |
| 0.0148        | 11.0    | 2453 | 0.0257          | 0.9846 |
| 0.0005        | 12.0    | 2676 | 0.0305          | 0.9822 |
| 0.0052        | 13.0    | 2899 | 0.0282          | 0.9853 |
| 0.0012        | 14.0    | 3122 | 0.0317          | 0.9837 |
| 0.0095        | 15.0    | 3345 | 0.0338          | 0.9859 |
| 0.0004        | 16.0    | 3568 | 0.0307          | 0.9865 |
| 0.0003        | 17.0    | 3791 | 0.0336          | 0.9856 |
| 0.0074        | 18.0    | 4014 | 0.0338          | 0.9855 |
| 0.0003        | 19.0    | 4237 | 0.0327          | 0.9864 |
| 0.0003        | 20.0    | 4460 | 0.0353          | 0.9858 |
| 0.0001        | 21.0    | 4683 | 0.0377          | 0.9858 |
| 0.0001        | 22.0    | 4906 | 0.0380          | 0.9870 |
| 0.0001        | 23.0    | 5129 | 0.0389          | 0.9866 |
| 0.0001        | 24.0    | 5352 | 0.0399          | 0.9866 |
| 0.0001        | 25.0    | 5575 | 0.0404          | 0.9866 |
| 0.0001        | 26.0    | 5798 | 0.0408          | 0.9867 |
| 0.0001        | 27.0    | 6021 | 0.0409          | 0.9867 |
| 0.0002        | 28.0    | 6244 | 0.0411          | 0.9867 |
| 0.0001        | 29.0    | 6467 | 0.0411          | 0.9867 |
| 0.0023        | 29.8691 | 6660 | 0.0412          | 0.9867 |

### Framework versions

- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.0.1
- Tokenizers 0.21.0