cmarkea/bloomz-3b-sft-chat

Cyrile committed on Sep 20, 2023

Commit f74cc84 · 1 Parent(s): eefdd5c

Update README.md

Files changed (1)

README.md +2 -2

@@ -52,7 +52,7 @@ Here is the table summarizing the architecture used for training, along with the
 |     Hyperparameter    |    Value   |
 |:---------------------:|:----------:|
 |       label smoothing | 0.05       |
-|              optimize | AdamW      |
+|             optimizer | AdamW      |
 |                 betas | 0.9, 0.999 |
 |         learning rate | 1e-5       |
 |       anneal strategy | cos        |
@@ -108,6 +108,6 @@ Citation
   AUTHOR = {Cyrile Delestre},
   URL = {https://huggingface.co/cmarkea/bloomz-3b-sft-chat},
   YEAR = {2023},
-  KEYWORDS = {NLP ; Transformers ; Bloomz},
+  KEYWORDS = {NLP ; Transformers ; LLM ; Bloomz},
 }
 ```