amazingvince
/

Qwen2-VL-2B-Instruct-roastme-sample-filtered

Generated from Trainer


amazingvince committed on Nov 6, 2024

Commit b0d8f35 (verified) · 1 Parent(s): 09cb5d3

Model save

Files changed (1)

README.md +2 -2


@@ -42,8 +42,8 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: 32
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: constant
-- lr_scheduler_warmup_ratio: 0.03
+- lr_scheduler_type: inverse_sqrt
+- lr_scheduler_warmup_ratio: 0.05
 - num_epochs: 3
 ### Training results
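
To illustrate what the change from `constant` to `inverse_sqrt` with a warmup ratio of 0.05 means for the learning rate, here is a minimal sketch of an inverse-square-root schedule with linear warmup. It is a standalone approximation, not the Trainer's own code: the helper names `warmup_steps_from_ratio` and `inverse_sqrt_multiplier` and the total of 1000 optimizer steps are hypothetical, and the decay assumes the timescale equals the number of warmup steps.

```python
import math


def warmup_steps_from_ratio(total_steps: int, warmup_ratio: float) -> int:
    """Convert a warmup ratio into a step count (rounded up)."""
    return math.ceil(total_steps * warmup_ratio)


def inverse_sqrt_multiplier(step: int, warmup_steps: int) -> float:
    """Factor applied to the base learning rate at a given optimizer step.

    Linear ramp from 0 to 1 over the warmup steps, then decay
    proportional to 1 / sqrt(step). Assumes timescale == warmup_steps.
    """
    if warmup_steps <= 0:
        # No warmup: plain inverse-sqrt decay, guarded against step 0.
        return 1.0 / math.sqrt(max(step, 1))
    if step < warmup_steps:
        return step / warmup_steps
    return math.sqrt(warmup_steps / step)


# Hypothetical run length; the actual number of steps is not stated above.
total_steps = 1000
warmup = warmup_steps_from_ratio(total_steps, 0.05)
for step in (0, warmup // 2, warmup, 4 * warmup, total_steps):
    print(step, round(inverse_sqrt_multiplier(step, warmup), 4))
```

Under these assumptions the multiplier peaks at 1.0 when warmup ends and halves each time the step count quadruples, whereas the previous `constant` setting keeps the multiplier at 1.0 for the whole run.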