01GangaPutraBheeshma committed
Commit 156eb2a · 1 Parent(s): a305790

Update README.md

README.md CHANGED
@@ -169,21 +169,24 @@ trainer = SFTTrainer(
 )
 ```

-[old lines 172-186: 15 removed lines; their text is not preserved in this view]
+| Parameter | Description |
+|-------------------------------|------------------------------------------------------------------|
+| `output_dir` | Directory to save the trained model and logs. |
+| `per_device_train_batch_size` | Number of training samples per GPU. |
+| `gradient_accumulation_steps` | Number of steps to accumulate gradients before updating the model.|
+| `optim` | Optimizer for training (e.g., "paged_adamw_32bit"). |
+| `save_steps` | Save model checkpoints every N steps. |
+| `logging_steps` | Log training information every N steps. |
+| `learning_rate` | Initial learning rate for training. |
+| `max_grad_norm` | Maximum gradient norm for gradient clipping. |
+| `max_steps` | Maximum number of training steps. |
+| `warmup_ratio` | Ratio of warmup steps during learning rate warmup. |
+| `lr_scheduler_type` | Type of learning rate scheduler (e.g., "constant"). |
+| `fp16` | Enable mixed-precision training. |
+| `group_by_length` | Group training samples by length for efficiency. |
+| `ddp_find_unused_parameters` | Enable distributed training parameter setting. |
+| `push_to_hub` | Push the trained model to the Hugging Face Model Hub. |
+

 ### Training Data

@@ -191,17 +194,19 @@ push_to_hub: Push the trained model to the Hugging Face Model Hub.

 #### Metrics

-Step [old lines 194-204: 11 removed lines; only this fragment is preserved]
+| Step | Training Loss |
+|-------|---------------|
+| 100 | 2.189900 |
+| 200 | 2.014100 |
+| 300 | 1.957200 |
+| 400 | 1.990000 |
+| 500 | 1.985200 |
+| 600 | 1.986500 |
+| 700 | 1.964300 |
+| 800 | 1.951900 |
+| 900 | 1.936900 |
+| 1000 | 2.011200 |
+

 ### Results

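One way to read the Step / Training Loss table added in this commit is to look for the lowest logged loss and the net change between the first and last logged steps. The sketch below is only an illustration: the dictionary copies the ten values from the table, while the names `LOSS_BY_STEP` and `summarize` are hypothetical and do not appear in the README.

```python
# Training-loss values copied from the Metrics table added in this commit.
LOSS_BY_STEP = {
    100: 2.1899, 200: 2.0141, 300: 1.9572, 400: 1.9900, 500: 1.9852,
    600: 1.9865, 700: 1.9643, 800: 1.9519, 900: 1.9369, 1000: 2.0112,
}


def summarize(loss_by_step):
    """Return (best_step, best_loss, net_change) for a step -> loss mapping.

    net_change is the last logged loss minus the first logged loss,
    so a negative value means the loss went down overall.
    """
    steps = sorted(loss_by_step)
    best_step = min(steps, key=lambda s: loss_by_step[s])
    net_change = loss_by_step[steps[-1]] - loss_by_step[steps[0]]
    return best_step, loss_by_step[best_step], net_change


best_step, best_loss, net_change = summarize(LOSS_BY_STEP)
print(f"lowest logged loss: {best_loss:.4f} at step {best_step}")
print(f"net change, step 100 -> 1000: {net_change:+.4f}")
```

On these values, most of the decrease happens in the first 300 steps, and the last logged value (step 1000) is higher than the one at step 900. Each entry is a single logged training loss rather than an evaluation metric, so small step-to-step differences should be read with caution.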