Pablogps committed
Commit 7df67a1 · 1 Parent(s): dc8597e

Update README.md

Files changed (1): README.md +3 -1
README.md CHANGED
@@ -13,7 +13,9 @@ This is a **RoBERTa-base** model trained from scratch in Spanish.
 
 The training dataset is [mc4](https://huggingface.co/datasets/bertin-project/mc4-es-sampled), subsampling documents to a total of about 50 million examples. Sampling is biased towards average perplexity values (defining perplexity boundaries based on quartiles), discarding more often documents with very large values (Q4, poor quality) or very small values (Q1, short, repetitive texts).
 
-This model has been trained for 250.000 steps.
+This model has been trained for 180.000 steps (early stopped from 250k intended steps).
+
+Please see our main [card](https://huggingface.co/bertin-project/bertin-roberta-base-spanish) for more information.
 
 This is part of the
 [Flax/Jax Community Week](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104), organised by [HuggingFace](https://huggingface.co/) and TPU usage sponsored by Google.
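
The perplexity-biased subsampling described in the README can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the function names, the quartile computation, and the per-quartile keep probabilities are all assumptions chosen only to show the idea of discarding Q1 and Q4 documents more often.

```python
import random

def quartile_boundaries(perplexities):
    """Return the (q1, q2, q3) perplexity boundaries of the corpus."""
    s = sorted(perplexities)
    n = len(s)
    return s[n // 4], s[n // 2], s[(3 * n) // 4]

def subsample(docs, perplexities, keep_probs=(0.2, 0.8, 0.8, 0.2), seed=0):
    """Keep each document with a quartile-dependent probability.

    keep_probs are illustrative values, not the ones used by the project:
    average-perplexity documents (Q2/Q3) are favoured, while extremes
    (Q1: short/repetitive, Q4: poor quality) are discarded more often.
    """
    q1, q2, q3 = quartile_boundaries(perplexities)
    rng = random.Random(seed)
    kept = []
    for doc, ppl in zip(docs, perplexities):
        if ppl <= q1:
            p = keep_probs[0]   # Q1: short, repetitive texts
        elif ppl <= q2:
            p = keep_probs[1]
        elif ppl <= q3:
            p = keep_probs[2]
        else:
            p = keep_probs[3]   # Q4: poor quality
        if rng.random() < p:
            kept.append(doc)
    return kept
```

With extreme keep probabilities of zero, only the middle-perplexity half of the corpus survives, which mirrors the bias the README describes.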