XueyingJia/qwen-0.5b-sft-HH-online-dpo-ground-truth-lead

Generated from Trainer

qwen-0.5b-sft-HH-online-dpo-ground-truth-lead/runs/Dec10_12-22-09_babel-0-23

1 contributor

History: 4 commits

Training in progress, step 400

f2619d6 verified about 2 months ago

events.out.tfevents.1733851584.babel-0-23.2524055.0

25.2 kB
LFS
