End of training
- README.md +6 -6
- adapter_model.bin +1 -1
README.md CHANGED
@@ -71,7 +71,7 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 sample_packing: true
 saves_per_epoch: 1
-seed:
+seed: 54308
 sequence_len: 4096
 special_tokens: null
 strict: false
@@ -95,12 +95,12 @@ xformers_attention: null
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/ozxli2sy)
 # taopanda-3_c71cc5c5-b4a1-43b5-93ac-4f9fe77ee451
 
 This model is a fine-tuned version of [unsloth/Qwen2-0.5B](https://huggingface.co/unsloth/Qwen2-0.5B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.8497
 
 ## Model description
 
@@ -122,7 +122,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 2
 - eval_batch_size: 2
-- seed:
+- seed: 54308
 - distributed_type: multi-GPU
 - num_devices: 4
 - gradient_accumulation_steps: 4
@@ -136,8 +136,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.
-| 0.
+| 1.9997 | 1.0 | 1 | 1.9384 |
+| 0.4516 | 1.3333 | 2 | 1.8497 |
 
 
 ### Framework versions
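For context, the hyperparameters in the hunks above imply an effective global batch size of 32 (train_batch_size 2 × num_devices 4 × gradient_accumulation_steps 4). Below is a minimal usage sketch, not part of the commit, for loading the trained PEFT adapter on top of the base model. The adapter repo id is a guess assembled from the card title and the wandb entity; substitute the repository that actually hosts adapter_model.bin.

```python
# Minimal sketch: load the PEFT adapter from this commit onto the base model.
# ADAPTER is a hypothetical repo id, not confirmed by this commit.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "unsloth/Qwen2-0.5B"  # base model named in the card
ADAPTER = "fatcat87-taopanda/taopanda-3_c71cc5c5-b4a1-43b5-93ac-4f9fe77ee451"  # assumed

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE)
model = PeftModel.from_pretrained(base_model, ADAPTER)  # reads adapter_model.bin

inputs = tokenizer("Hello, world.", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```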
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8e5900f919a77e8e1cc62c4a1bd434cc05d014e74ba115a9ece85125d921a87c
 size 70506570
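adapter_model.bin is stored via Git LFS, so the commit records only the pointer file: the object's sha256 oid and its byte size. A minimal sketch, assuming the file has been downloaded locally, for verifying a copy against the pointer values in this commit:

```python
# Recompute the Git LFS pointer fields for a local adapter_model.bin and
# compare them to the oid and size recorded in this commit.
import hashlib
from pathlib import Path

EXPECTED_OID = "8e5900f919a77e8e1cc62c4a1bd434cc05d014e74ba115a9ece85125d921a87c"
EXPECTED_SIZE = 70506570  # bytes, unchanged by this commit

path = Path("adapter_model.bin")
digest = hashlib.sha256(path.read_bytes()).hexdigest()

assert path.stat().st_size == EXPECTED_SIZE, "size mismatch"
assert digest == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer in this commit")
```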