rationale_model_e15

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1070
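
Assuming this is the mean per-token cross-entropy (the standard causal-LM objective), the loss maps to a perplexity of exp(2.1070) ≈ 8.2:

```python
import math

eval_loss = 2.1070
print(math.exp(eval_loss))  # ~8.22, assuming loss is mean per-token cross-entropy
```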

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent Trainer configuration is sketched after the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3.0
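
The listed values map directly onto TrainingArguments in the Hugging Face Trainer API. The following is a minimal sketch, not the author's actual script: it assumes a standard Trainer run with evaluation every 500 steps (consistent with the results table below), and the dataset objects are placeholders since the training data is undocumented.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

args = TrainingArguments(
    output_dir="rationale_model_e15",
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",        # AdamW with the listed betas/epsilon (the defaults)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    eval_strategy="steps",      # assumption: eval every 500 steps, matching the results table
    eval_steps=500,
    logging_steps=500,
)

train_ds = eval_ds = None  # placeholders: the card does not name the dataset
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```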

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|---------------|--------|-------|-----------------|
| 2.1363        | 0.0954 | 500   | 2.1185          |
| 1.7868        | 0.1908 | 1000  | 2.1070          |
| 1.5132        | 0.2862 | 1500  | 2.1743          |
| 1.238         | 0.3815 | 2000  | 2.2694          |
| 0.9723        | 0.4769 | 2500  | 2.3214          |
| 0.7249        | 0.5723 | 3000  | 2.4423          |
| 0.5657        | 0.6677 | 3500  | 2.5636          |
| 0.4404        | 0.7631 | 4000  | 2.6851          |
| 0.3192        | 0.8585 | 4500  | 2.8630          |
| 0.2676        | 0.9538 | 5000  | 2.9741          |
| 0.2057        | 1.0492 | 5500  | 3.0958          |
| 0.1792        | 1.1446 | 6000  | 3.1219          |
| 0.1691        | 1.2400 | 6500  | 3.1735          |
| 0.1597        | 1.3354 | 7000  | 3.2299          |
| 0.1516        | 1.4308 | 7500  | 3.2997          |
| 0.1422        | 1.5261 | 8000  | 3.2759          |
| 0.1372        | 1.6215 | 8500  | 3.3557          |
| 0.1301        | 1.7169 | 9000  | 3.4023          |
| 0.1229        | 1.8123 | 9500  | 3.4617          |
| 0.1183        | 1.9077 | 10000 | 3.4668          |
| 0.1119        | 2.0031 | 10500 | 3.5609          |
| 0.0924        | 2.0984 | 11000 | 3.5975          |
| 0.0926        | 2.1938 | 11500 | 3.6429          |
| 0.089         | 2.2892 | 12000 | 3.6586          |
| 0.0881        | 2.3846 | 12500 | 3.6920          |
| 0.0861        | 2.4800 | 13000 | 3.7656          |
| 0.0835        | 2.5754 | 13500 | 3.7939          |
| 0.0803        | 2.6707 | 14000 | 3.8398          |
| 0.0797        | 2.7661 | 14500 | 3.8909          |
| 0.0774        | 2.8615 | 15000 | 3.9238          |
| 0.0759        | 2.9569 | 15500 | 3.9394          |

Validation loss reaches its minimum of 2.1070 at step 1000 and climbs steadily afterward while training loss keeps falling, a typical overfitting pattern; the evaluation loss reported at the top of this card corresponds to that step-1000 checkpoint.

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.3.0
  • Datasets 2.14.4
  • Tokenizers 0.20.3
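
The checkpoint can be loaded like any causal LM. A minimal usage sketch, assuming the published repo id Heejindo/rationale_model_e15 and an illustrative prompt (the expected input format for this rationale model is not documented):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Heejindo/rationale_model_e15"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Illustrative prompt only -- the card does not specify the input format.
inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```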