AI-Sweden-Models/gpt-sw3-6.7b-v2-translator

timpal0l committed on Apr 2, 2024

Commit 99f6c76 · verified · 1 Parent(s): 2517300

Update README.md

Files changed (1)

README.md +7 -1

@@ -71,4 +71,10 @@ print(response[0]["generated_text"].split("<s>Bot: ")[-1])
 ```
 ## Training & Data:
-The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+| Epoch | Training Loss | Evaluation Loss |
+|-------|---------------|-----------------|
+| 1     | 1.309         | 1.281           |
+| 2     | 1.161         | 1.242           |
+| 3     | 1.053         | 1.219           |
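
One way to read the loss table added in this commit is to look at the gap between evaluation and training loss per epoch. The sketch below is not part of the README; the names `losses` and `summarize` are hypothetical. The `eval_ppl` column assumes the reported loss is a mean per-token cross-entropy in nats, which the README does not state, so treat that column as illustrative only.

```python
import math

# Loss values copied from the table added in this commit:
# (epoch, training loss, evaluation loss)
losses = [
    (1, 1.309, 1.281),
    (2, 1.161, 1.242),
    (3, 1.053, 1.219),
]


def summarize(rows):
    """Return the eval-minus-train gap and exp(eval loss) for each epoch."""
    summary = []
    for epoch, train_loss, eval_loss in rows:
        summary.append({
            "epoch": epoch,
            # A positive gap means the model fits the training data better
            # than the held-out evaluation data.
            "gap": round(eval_loss - train_loss, 3),
            # ASSUMPTION: loss is mean per-token cross-entropy in nats,
            # so exp(loss) can be read as a perplexity.
            "eval_ppl": round(math.exp(eval_loss), 2),
        })
    return summary


summary = summarize(losses)
for row in summary:
    print(row)
```

The gap widens across the three epochs while the evaluation loss still decreases, so under these numbers the third epoch improves held-out loss despite the growing train/eval spread.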