PEFT
Safetensors
qwen2
alignment-handbook
trl
dpo
Generated from Trainer
khongtrunght
Model save
5405756 verified
{
    "epoch": 1.0,
    "total_flos": 0.0,
    "train_loss": 0.39260031933687173,
    "train_runtime": 7916.79,
    "train_samples": 31353,
    "train_samples_per_second": 3.96,
    "train_steps_per_second": 0.124
}
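
The throughput fields above can be cross-checked against each other. The following is a minimal sketch, assuming that train_runtime is in seconds and that the two per-second fields are plain ratios over that runtime; the variable names are illustrative and not part of the file. The samples-per-step figure it derives is only an estimate of the effective batch size, since the reported rates are rounded.

```python
import json

# Contents of the results file, copied verbatim from above.
raw = """
{
    "epoch": 1.0,
    "total_flos": 0.0,
    "train_loss": 0.39260031933687173,
    "train_runtime": 7916.79,
    "train_samples": 31353,
    "train_samples_per_second": 3.96,
    "train_steps_per_second": 0.124
}
"""

results = json.loads(raw)

# Assumption: samples seen = train_samples * epoch, runtime is in seconds.
samples_per_second = (
    results["train_samples"] * results["epoch"] / results["train_runtime"]
)

# Approximate number of optimizer steps implied by the rounded step rate.
approx_steps = results["train_steps_per_second"] * results["train_runtime"]

# Approximate samples consumed per step (a rough effective batch size).
approx_samples_per_step = (
    results["train_samples_per_second"] / results["train_steps_per_second"]
)

print(f"recomputed samples/s: {samples_per_second:.2f}")
print(f"approx. total steps:  {approx_steps:.0f}")
print(f"approx. samples/step: {approx_samples_per_step:.1f}")
```

The recomputed samples-per-second value agrees with the reported 3.96 after rounding to two decimals, which supports the assumption that the field is a plain ratio.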