PleIAs/Pleias-3b-Preview

Pclanglais committed on Dec 2, 2024

Commit 1836d39 · verified · 1 Parent(s): cce5da4

Update README.md

Files changed (1)

README.md +1 -1

@@ -35,7 +35,7 @@ Text generation is currently able to support a range of creative writing tasks i
 Pleias-3b-Preview has been successfully adapted for continuous pretraining and full-fine-tuning on document processing tasks such as RAG, translation or OCR correction. Given the small size of the model we do not recommend fine-tuning methods based on LORA.
 ## Training
-Pleias-3b-Preview was fully pretrained at Jean Zay on 64 h100s for 46 hours with Nanotron, the pretraining library from HuggingFace. We provide the complete settings as a yaml file as part of our release.
+Pleias-3b-Preview was fully pretrained at Jean Zay on 192 h100s for about 20 days (compute grant n°GC011015451). Training code relied on Nanotron, the open source library from HuggingFace. We provide the complete settings as a yaml file as part of our release.
 Training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
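
The figures in this hunk can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes that the batch size counts sequences per step and that every sequence has the same length, neither of which the README states. Under those assumptions, the token count divides evenly into 2,048 tokens per sequence. The same snippet compares the compute implied by the old and new wording in GPU-hours, treating "about 20 days" as exactly 20 × 24 hours, which is only an approximation. The function names are illustrative.

```python
# Figures quoted in the README diff above.
STEPS = 518_000
BATCH_SIZE = 1_024                  # assumed to mean sequences per step
TOTAL_TOKENS = 1_086_324_736_000


def tokens_per_sequence(total_tokens: int, steps: int, batch_size: int) -> float:
    """Tokens per sequence implied by the schedule, assuming uniform sequence length."""
    return total_tokens / (steps * batch_size)


def gpu_hours(gpus: int, hours: float) -> float:
    """Total GPU-hours for a run that uses `gpus` devices for `hours` wall-clock hours."""
    return gpus * hours


seq_len = tokens_per_sequence(TOTAL_TOKENS, STEPS, BATCH_SIZE)
old_budget = gpu_hours(64, 46)         # removed line: 64 H100s for 46 hours
new_budget = gpu_hours(192, 20 * 24)   # added line: 192 H100s for about 20 days

print(f"implied sequence length: {seq_len:.0f} tokens")
print(f"old wording: {old_budget:,} GPU-hours")
print(f"new wording: {new_budget:,} GPU-hours (approximate)")
```

The two compute figures differ by roughly a factor of 31, which is the substance of the correction this commit makes.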