End of training

Files changed (4)

README.md +21 -33
model-00001-of-00002.safetensors +1 -1
model-00002-of-00002.safetensors +1 -1
training_args.bin +2 -2

README.md CHANGED

@@ -6,19 +6,19 @@ tags:
 - sft
 - generated_from_trainer
 model-index:
-- name: collapse_gemma-2-2b_hs2_iter2_sftsd0
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# collapse_gemma-2-2b_hs2_iter2_sftsd0
 This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5115
-- Num Input Tokens Seen: 7923536
 ## Model description
@@ -52,35 +52,23 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
-| No log        | 0      | 0    | 1.3956          | 0                 |
-| 1.7735        | 0.0350 | 5    | 1.3066          | 280096            |
-| 1.5215        | 0.0700 | 10   | 1.1998          | 560120            |
-| 1.3668        | 0.1050 | 15   | 1.1725          | 838176            |
-| 1.0619        | 0.1400 | 20   | 1.1790          | 1120704           |
-| 0.917         | 0.1750 | 25   | 1.2479          | 1397888           |
-| 0.9118        | 0.2100 | 30   | 1.3144          | 1680224           |
-| 0.7159        | 0.2450 | 35   | 1.3931          | 1963544           |
-| 0.5111        | 0.2800 | 40   | 1.4439          | 2241792           |
-| 0.4749        | 0.3150 | 45   | 1.5136          | 2518608           |
-| 0.427         | 0.3500 | 50   | 1.5106          | 2799872           |
-| 0.3428        | 0.3850 | 55   | 1.5751          | 3084560           |
-| 0.3927        | 0.4199 | 60   | 1.4907          | 3368728           |
-| 0.2933        | 0.4549 | 65   | 1.5076          | 3648312           |
-| 0.249         | 0.4899 | 70   | 1.4746          | 3928200           |
-| 0.2253        | 0.5249 | 75   | 1.4913          | 4211080           |
-| 0.1422        | 0.5599 | 80   | 1.4445          | 4488088           |
-| 0.1286        | 0.5949 | 85   | 1.5182          | 4763072           |
-| 0.1044        | 0.6299 | 90   | 1.4204          | 5043448           |
-| 0.19          | 0.6649 | 95   | 1.4679          | 5318848           |
-| 0.1548        | 0.6999 | 100  | 1.4739          | 5601360           |
-| 0.1394        | 0.7349 | 105  | 1.4093          | 5877032           |
-| 0.1386        | 0.7699 | 110  | 1.4460          | 6162712           |
-| 0.1775        | 0.8049 | 115  | 1.4499          | 6435944           |
-| 0.2135        | 0.8399 | 120  | 1.4051          | 6717936           |
-| 0.1515        | 0.8749 | 125  | 1.5017          | 6994336           |
-| 0.1906        | 0.9099 | 130  | 1.4869          | 7270544           |
-| 0.1433        | 0.9449 | 135  | 1.4074          | 7542248           |
-| 0.1096        | 0.9799 | 140  | 1.4848          | 7811456           |
 ### Framework versions

 - sft
 - generated_from_trainer
 model-index:
+- name: collapse_gemma-2-2b_hs2_replace_iter2_sftsd0
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# collapse_gemma-2-2b_hs2_replace_iter2_sftsd0
 This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.4538
+- Num Input Tokens Seen: 4832464
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
+| No log        | 0      | 0    | 1.3909          | 0                 |
+| 1.6784        | 0.0591 | 5    | 1.2633          | 282096            |
+| 1.3537        | 0.1183 | 10   | 1.1871          | 571576            |
+| 1.0696        | 0.1774 | 15   | 1.2164          | 857160            |
+| 0.9162        | 0.2365 | 20   | 1.2391          | 1142344           |
+| 0.7598        | 0.2956 | 25   | 1.3479          | 1427536           |
+| 0.5372        | 0.3548 | 30   | 1.4227          | 1715736           |
+| 0.4796        | 0.4139 | 35   | 1.4737          | 2003760           |
+| 0.3889        | 0.4730 | 40   | 1.5021          | 2286384           |
+| 0.1994        | 0.5322 | 45   | 1.5032          | 2573248           |
+| 0.3391        | 0.5913 | 50   | 1.4714          | 2862104           |
+| 0.3297        | 0.6504 | 55   | 1.4358          | 3145472           |
+| 0.2038        | 0.7095 | 60   | 1.4488          | 3432144           |
+| 0.195         | 0.7687 | 65   | 1.4273          | 3724448           |
+| 0.1749        | 0.8278 | 70   | 1.4248          | 4016736           |
+| 0.1654        | 0.8869 | 75   | 1.4554          | 4305224           |
+| 0.1846        | 0.9460 | 80   | 1.4274          | 4595952           |
 ### Framework versions
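
As a rough aid for reading the updated README.md table above, the following sketch parses the added rows (copied verbatim from the `+` side of the diff) and reports the step with the lowest validation loss. The helper name `parse_rows` and the dictionary keys are illustrative choices, not part of the Trainer's output.

```python
# Rows copied from the "+" side of the README.md diff.
TABLE = """\
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.6784        | 0.0591 | 5    | 1.2633          | 282096            |
| 1.3537        | 0.1183 | 10   | 1.1871          | 571576            |
| 1.0696        | 0.1774 | 15   | 1.2164          | 857160            |
| 0.9162        | 0.2365 | 20   | 1.2391          | 1142344           |
| 0.7598        | 0.2956 | 25   | 1.3479          | 1427536           |
| 0.5372        | 0.3548 | 30   | 1.4227          | 1715736           |
| 0.4796        | 0.4139 | 35   | 1.4737          | 2003760           |
| 0.3889        | 0.4730 | 40   | 1.5021          | 2286384           |
| 0.1994        | 0.5322 | 45   | 1.5032          | 2573248           |
| 0.3391        | 0.5913 | 50   | 1.4714          | 2862104           |
| 0.3297        | 0.6504 | 55   | 1.4358          | 3145472           |
| 0.2038        | 0.7095 | 60   | 1.4488          | 3432144           |
| 0.195         | 0.7687 | 65   | 1.4273          | 3724448           |
| 0.1749        | 0.8278 | 70   | 1.4248          | 4016736           |
| 0.1654        | 0.8869 | 75   | 1.4554          | 4305224           |
| 0.1846        | 0.9460 | 80   | 1.4274          | 4595952           |
"""


def parse_rows(table):
    """Parse markdown table rows into dicts (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        rows.append({
            # The step-0 row has no training loss logged yet.
            "train_loss": None if cells[0] == "No log" else float(cells[0]),
            "epoch": float(cells[1]),
            "step": int(cells[2]),
            "val_loss": float(cells[3]),
            "tokens": int(cells[4]),
        })
    return rows


rows = parse_rows(TABLE)
best = min(rows, key=lambda r: r["val_loss"])
print(f"lowest validation loss {best['val_loss']} at step {best['step']}")
```

On these rows the minimum validation loss falls at step 10; every later evaluation is higher even though the training loss keeps falling, a pattern that is consistent with overfitting.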

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:efd07ab3cc40ce9bd97b1836c678f8d3c8efe2ae48cefb0cf5560f5508716fa3
 size 4988025760

 version https://git-lfs.github.com/spec/v1
+oid sha256:c8923d0ce24fc19ff925f57ba737262a3b651f7af10da5b3c3708d73d6a013fc
 size 4988025760

model-00002-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c4d872fc598d532fb81b05d66ad3e33f3527de8f56b89024f97762d4fa512976
 size 240691728

 version https://git-lfs.github.com/spec/v1
+oid sha256:8884483d7b8b59dc8c03fd2f12897e7e5088e654072e12c531ece63cc55a75ba
 size 240691728

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:364f58ab83df95b858312c13228fb9cf63c5f59048e3967577ee0dc99c331f87
-size 5560

 version https://git-lfs.github.com/spec/v1
+oid sha256:99b64a8f734610c1930219e035f3328b78f014bc58cffac3230063c0fa0f529c
+size 5624
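
The three binary files above are stored as Git LFS pointers, each carrying a `version`, an `oid sha256:...` digest, and a `size` in bytes. A minimal sketch, assuming a downloaded copy of the file is available locally, of how such a pointer could be parsed and checked against that copy. The function names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helpers, not part of Git LFS or any Hugging Face library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer):
    """Return True if the file's size and hash agree with the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in 1 MiB chunks so multi-gigabyte shards need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
NEW_TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:99b64a8f734610c1930219e035f3328b78f014bc58cffac3230063c0fa0f529c
size 5624
"""

pointer = parse_lfs_pointer(NEW_TRAINING_ARGS_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `file_matches_pointer("training_args.bin", pointer)` on a local copy (the path here is an assumption) would then indicate whether the download corresponds to this commit. The size check runs first because it is cheap and rules out most mismatches before any hashing.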