---
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-mlm-v02-ckp02
tags:
- generated_from_trainer
model-index:
- name: KoModernBERT-base-mlm-v02-ckp02
  results: []
---

# KoModernBERT-base-mlm-v02-ckp02

This model is a fine-tuned version of [x2bee/KoModernBERT-base-mlm-v02-ckp02](https://huggingface.co/x2bee/KoModernBERT-base-mlm-v02-ckp02) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.6437
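Since the masked-LM evaluation loss is a mean cross-entropy in nats per token, its exponential gives a pseudo-perplexity. A quick sketch using the loss reported above:

```python
import math

# Evaluation cross-entropy loss (nats per masked token) reported above.
eval_loss = 1.6437

# Pseudo-perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"pseudo-perplexity ~= {perplexity:.2f}")  # ~= 5.17
```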

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 512
- total_eval_batch_size: 64
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
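The effective batch size follows directly from the settings above: 8 per device × 8 GPUs × 8 gradient-accumulation steps = 512. The linear schedule with a 0.1 warmup ratio can be sketched as follows (a simplified re-implementation for illustration, not the exact `transformers` scheduler):

```python
def linear_lr(step, total_steps, base_lr=1e-06, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Effective train batch size implied by the settings above:
# 8 per device x 8 GPUs x 8 gradient-accumulation steps.
effective_batch = 8 * 8 * 8  # = 512

total_steps = 30_000         # roughly one epoch at this batch size
peak = linear_lr(3_000, total_steps)    # end of warmup -> peak LR (1e-06)
final = linear_lr(30_000, total_steps)  # end of training -> 0.0
```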

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 14.3633       | 0.0986 | 3000  | 1.7944          |
| 14.0205       | 0.1973 | 6000  | 1.7638          |
| 14.0391       | 0.2959 | 9000  | 1.7430          |
| 13.8014       | 0.3946 | 12000 | 1.7255          |
| 13.6803       | 0.4932 | 15000 | 1.7118          |
| 13.5763       | 0.5919 | 18000 | 1.6961          |
| 13.4827       | 0.6905 | 21000 | 1.6824          |
| 13.3855       | 0.7892 | 24000 | 1.6700          |
| 13.2238       | 0.8878 | 27000 | 1.6558          |
| 13.0954       | 0.9865 | 30000 | 1.6437          |
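As a sanity check on the table above, 30,000 optimizer steps at an effective batch size of 512 covering 0.9865 epochs implies a training set of roughly 15.6M examples (a back-of-the-envelope estimate, not a documented figure):

```python
steps, batch, epoch_fraction = 30_000, 512, 0.9865

# Examples seen = steps x effective batch size; divide by the
# fraction of an epoch covered to estimate the full dataset size.
approx_dataset_size = steps * batch / epoch_fraction
print(f"~{approx_dataset_size / 1e6:.1f}M training examples")  # ~15.6M
```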

### Framework versions

- Transformers 4.48.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0