Update README.md
README.md
CHANGED
@@ -71,4 +71,10 @@ print(response[0]["generated_text"].split("<s>Bot: ")[-1])
 ```
 
 ## Training & Data:
-The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+
+| Epoch | Training Loss | Evaluation Loss |
+|-------|---------------|-----------------|
+| 1 | 1.309 | 1.281 |
+| 2 | 1.161 | 1.242 |
+| 3 | 1.053 | 1.219 |
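The per-epoch losses added in this change are easier to compare as a train/eval gap and as perplexities. The following is a minimal sketch, not part of the README: it assumes the reported values are mean per-token cross-entropy in nats, which the README does not state, and the names `LOSSES`, `perplexity`, and `summarize` are illustrative only.

```python
import math

# (epoch, training loss, evaluation loss), copied from the table in the diff.
LOSSES = [
    (1, 1.309, 1.281),
    (2, 1.161, 1.242),
    (3, 1.053, 1.219),
]


def perplexity(loss: float) -> float:
    """Return exp(loss).

    Assumption: loss is a mean per-token cross-entropy in nats.
    If the README's losses use another definition, this number is not a perplexity.
    """
    return math.exp(loss)


def summarize(rows):
    """Return one dict per epoch with the eval-minus-train gap and eval perplexity."""
    return [
        {
            "epoch": epoch,
            "gap": round(eval_loss - train_loss, 3),
            "eval_ppl": round(perplexity(eval_loss), 2),
        }
        for epoch, train_loss, eval_loss in rows
    ]


if __name__ == "__main__":
    for row in summarize(LOSSES):
        print(row)
```

Evaluation loss falls in every epoch (1.281 to 1.219), but training loss falls faster, so the gap moves from -0.028 in epoch 1 to 0.166 in epoch 3.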