End of training

Files changed (7)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 # qwen2.5_1.5b_500k_16kcw_4ep
-This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) on an unknown dataset.
+This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) on the anghabench dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0007

all_results.json CHANGED

@@ -1,12 +1,12 @@
 {
     "epoch": 4.0,
-    "eval_loss": 0.0008604346076026559,
-    "eval_runtime": 5.4299,
-    "eval_samples_per_second": 36.833,
-    "eval_steps_per_second": 9.208,
+    "eval_loss": 0.0006838160916231573,
+    "eval_runtime": 5.2947,
+    "eval_samples_per_second": 37.774,
+    "eval_steps_per_second": 9.443,
     "total_flos": 2.849849363030396e+19,
-    "train_loss": 0.002720164652114683,
-    "train_runtime": 243138.9431,
-    "train_samples_per_second": 8.043,
-    "train_steps_per_second": 1.005
+    "train_loss": 0.00272644101527126,
+    "train_runtime": 243116.2392,
+    "train_samples_per_second": 8.044,
+    "train_steps_per_second": 1.006
 }
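
To read the all_results.json change at a glance, the old and new values can be compared as relative changes. The snippet below is only a minimal sketch: the numbers are copied from the diff, while the names `old`, `new`, `relative_change`, and `changes` are illustrative and not part of this repository. It suggests that eval_loss dropped by roughly a fifth while train_loss and train_runtime moved by well under one percent.

```python
# Values copied by hand from the all_results.json diff (removed vs. added lines).
# All names here are hypothetical helpers for illustration only.
old = {
    "eval_loss": 0.0008604346076026559,
    "train_loss": 0.002720164652114683,
    "train_runtime": 243138.9431,
}
new = {
    "eval_loss": 0.0006838160916231573,
    "train_loss": 0.00272644101527126,
    "train_runtime": 243116.2392,
}


def relative_change(before: float, after: float) -> float:
    """Return (after - before) / before, e.g. -0.2 for a 20% decrease."""
    return (after - before) / before


# Relative change for every metric present in both snapshots.
changes = {key: relative_change(old[key], new[key]) for key in old}

for key, value in changes.items():
    print(f"{key}: {value:+.2%}")
```

Since both runs report the same epoch and total_flos, a comparison like this mainly indicates how much the metrics vary between otherwise identical reported configurations.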

eval_results.json CHANGED

@@ -1,7 +1,7 @@
 {
     "epoch": 4.0,
-    "eval_loss": 0.0008604346076026559,
-    "eval_runtime": 5.4299,
-    "eval_samples_per_second": 36.833,
-    "eval_steps_per_second": 9.208
+    "eval_loss": 0.0006838160916231573,
+    "eval_runtime": 5.2947,
+    "eval_samples_per_second": 37.774,
+    "eval_steps_per_second": 9.443
 }

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 4.0,
     "total_flos": 2.849849363030396e+19,
-    "train_loss": 0.002720164652114683,
-    "train_runtime": 243138.9431,
-    "train_samples_per_second": 8.043,
-    "train_steps_per_second": 1.005
+    "train_loss": 0.00272644101527126,
+    "train_runtime": 243116.2392,
+    "train_samples_per_second": 8.044,
+    "train_steps_per_second": 1.006
 }

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_eval_loss.png CHANGED

training_loss.png CHANGED