Commit 1836d39 by Pclanglais (verified)
1 Parent(s): cce5da4

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -35,7 +35,7 @@ Text generation is currently able to support a range of creative writing tasks i
  Pleias-3b-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document processing tasks such as RAG, translation or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
 
  ## Training
- Pleias-3b-Preview was fully pretrained at Jean Zay on 64 H100s for 46 hours with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
+ Pleias-3b-Preview was fully pretrained at Jean Zay on 192 H100s for about 20 days (compute grant n°GC011015451). Training code relied on Nanotron, the open-source library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
 
  Training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
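
The figures in this change can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It derives the sequence length implied by the step count, batch size and token total, and compares the GPU-hours implied by the old and new hardware descriptions. It assumes that the batch size counts sequences per step and that "about 20 days" means 20 × 24 hours; the README states neither, and it does not give a sequence length.

```python
# Figures quoted in the README diff; everything derived from them is an inference.
STEPS = 518_000
BATCH_SIZE = 1_024                 # assumed to mean sequences per step
TOTAL_TOKENS = 1_086_324_736_000


def implied_sequence_length(total_tokens: int, steps: int, batch_size: int) -> int:
    """Tokens per sequence implied by the schedule, if it divides evenly."""
    sequences = steps * batch_size
    seq_len, remainder = divmod(total_tokens, sequences)
    if remainder:
        raise ValueError("token total is not a whole multiple of steps * batch size")
    return seq_len


def gpu_hours(num_gpus: int, hours: int) -> int:
    """Total GPU-hours for a run of `hours` wall-clock hours on `num_gpus` GPUs."""
    return num_gpus * hours


SEQ_LEN = implied_sequence_length(TOTAL_TOKENS, STEPS, BATCH_SIZE)
OLD_GPU_HOURS = gpu_hours(64, 46)        # removed line: 64 H100s for 46 hours
NEW_GPU_HOURS = gpu_hours(192, 20 * 24)  # added line: 192 H100s for about 20 days

print(f"implied sequence length: {SEQ_LEN}")
print(f"old description: {OLD_GPU_HOURS} GPU-hours")
print(f"new description: {NEW_GPU_HOURS} GPU-hours")
```

Under these assumptions the token total divides evenly into 2,048 tokens per sequence, and the corrected description corresponds to roughly thirty times the compute of the removed one.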