metadata
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-mlm-v02-ckp02
tags:
  - generated_from_trainer
model-index:
  - name: KoModernBERT-base-mlm-v02-ckp02
    results: []

KoModernBERT-base-mlm-v02-ckp02

This model is a fine-tuned version of x2bee/KoModernBERT-base-mlm-v02-ckp02. The training dataset is not specified. It achieves the following result on the evaluation set:

  • Loss: 1.9006

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 512
  • total_eval_batch_size: 64
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
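The batch-size and scheduler entries above can be cross-checked with a short sketch. It recomputes the total train and eval batch sizes from the per-device values, and models a linear schedule with 10% warmup in plain Python. The total number of optimizer steps is not reported in the card, so `EST_TOTAL_STEPS` is an assumption estimated from the results table (step 3000 at epoch 0.0867). The function names are hypothetical, and the schedule is a hand-written approximation of a linear warmup/decay rule, not the trainer's own code.

```python
import math

# Hyperparameters copied from the list above.
PER_DEVICE_TRAIN_BATCH = 8
PER_DEVICE_EVAL_BATCH = 8
NUM_DEVICES = 8
GRAD_ACCUM_STEPS = 8
PEAK_LR = 1e-6
WARMUP_RATIO = 0.1

# ASSUMPTION: the card does not report the total step count. This estimate
# comes from the results table, where step 3000 corresponds to epoch 0.0867.
EST_TOTAL_STEPS = round(3000 / 0.0867)


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    train = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
    evaluation = PER_DEVICE_EVAL_BATCH * NUM_DEVICES  # no accumulation at eval
    return train, evaluation


def linear_lr(step, total_steps=EST_TOTAL_STEPS,
              peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch sizes:", effective_batch_sizes())
    for s in (0, 1000, 3461, 17000, EST_TOTAL_STEPS):
        print(f"step {s:>6}: lr = {linear_lr(s):.3e}")
```

With these values the per-device batch of 8, across 8 devices and 8 accumulation steps, reproduces the reported total of 512, and the eval total of 64 follows from 8 × 8 with no accumulation.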

Training results

Training Loss   Epoch    Step    Validation Loss
17.1231         0.0867   3000    2.1503
16.9213         0.1734   6000    2.1170
16.6843         0.2601   9000    2.0872
16.501          0.3468   12000   2.0641
16.2914         0.4335   15000   2.0396
16.1829         0.5201   18000   2.0157
15.9756         0.6068   21000   1.9904
15.7217         0.6935   24000   1.9681
15.5407         0.7802   27000   1.9437
15.389          0.8669   30000   1.9219
15.1363         0.9536   33000   1.9006
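To make the table easier to interpret, the sketch below converts each validation loss to a perplexity and puts the training loss on a comparable scale. Two assumptions are involved, neither confirmed by the card. First, the validation loss is taken to be a mean cross-entropy in nats, so perplexity is `exp(loss)`. Second, the training loss is roughly eight times the validation loss, which happens to match `gradient_accumulation_steps: 8`; dividing by 8 is only a hypothesis for checking that ratio, not an explanation given by the source. The helper names are hypothetical.

```python
import math

GRAD_ACCUM_STEPS = 8  # from the hyperparameter list

# (training_loss, epoch, step, validation_loss), copied from the table above.
ROWS = [
    (17.1231, 0.0867, 3000, 2.1503),
    (16.9213, 0.1734, 6000, 2.1170),
    (16.6843, 0.2601, 9000, 2.0872),
    (16.501, 0.3468, 12000, 2.0641),
    (16.2914, 0.4335, 15000, 2.0396),
    (16.1829, 0.5201, 18000, 2.0157),
    (15.9756, 0.6068, 21000, 1.9904),
    (15.7217, 0.6935, 24000, 1.9681),
    (15.5407, 0.7802, 27000, 1.9437),
    (15.389, 0.8669, 30000, 1.9219),
    (15.1363, 0.9536, 33000, 1.9006),
]


def perplexity(loss):
    """ASSUMPTION: loss is a mean cross-entropy in nats."""
    return math.exp(loss)


def summarize(rows):
    """Return one dict per logged step with derived quantities."""
    summary = []
    for train_loss, epoch, step, val_loss in rows:
        summary.append({
            "step": step,
            "epoch": epoch,
            "val_loss": val_loss,
            "val_ppl": perplexity(val_loss),
            # Hypothesis only: rescale by the accumulation factor.
            "train_over_accum": train_loss / GRAD_ACCUM_STEPS,
        })
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(f"step {row['step']:>6}  val_ppl {row['val_ppl']:.3f}  "
              f"train/8 {row['train_over_accum']:.4f}")
```

Under these assumptions the validation perplexity falls steadily over the epoch, and the rescaled training loss stays within a few hundredths of the validation loss at every logged step.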

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
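When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. It assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers` (the card says "Pytorch", which is assumed here to mean the `torch` distribution), and `compare_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Versions listed above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.48.0",
    "torch": "2.5.1+cu124",  # ASSUMPTION: "Pytorch" maps to "torch"
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed locally
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, found {installed} ({status})")
```

An exact string match is deliberately strict; a different CUDA build suffix on the PyTorch version, for example, will be reported as a difference even if the release number is the same.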