osiria committed
Commit 2ea1dbd · 1 Parent(s): ce3dc48

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -32,7 +32,7 @@ widget:
 
  <h3>Introduction</h3>
 
- This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>16M parameters</b> and <b>66MB</b> size make it
+ This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>17M parameters</b> and <b>67MB</b> size make it
  <b>85% lighter</b> than a typical mono-lingual BERT model. It is ideal when memory consumption and execution speed are critical while maintaining high-quality results.
 
 
@@ -47,7 +47,7 @@ To compensate for the deletion of cased tokens, which now forces the model to ex
  the model has been further pre-trained on the italian split of the [Wikipedia](https://huggingface.co/datasets/wikipedia) dataset, using the <b>whole word masking [3]</b> technique to make it more robust
  to the new uncased representations.
 
- The resulting model has 16M parameters, a vocabulary of 14.610 tokens, and a size of 66MB, which makes it <b>85% lighter</b> than a typical mono-lingual BERT model and
+ The resulting model has 17M parameters, a vocabulary of 14.610 tokens, and a size of 67MB, which makes it <b>85% lighter</b> than a typical mono-lingual BERT model and
  75% lighter than a standard mono-lingual DistilBERT model.
 
 
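The "85% lighter" and "75% lighter" figures in the README can be sanity-checked with a little arithmetic. The sketch below is only a rough check under stated assumptions: the reference parameter counts (about 110M for a BERT-base model and about 66M for DistilBERT) are commonly cited figures rather than values taken from this README, and the README does not say whether "lighter" refers to parameter count or file size. The helper names `percent_lighter` and `fp32_size_mb` are hypothetical.

```python
# Assumed reference sizes (commonly cited figures, NOT from the README).
BERT_BASE_PARAMS = 110e6
DISTILBERT_PARAMS = 66e6

# Parameter count stated in the updated README.
MODEL_PARAMS = 17e6


def percent_lighter(model_params: float, reference_params: float) -> float:
    """Relative reduction in parameter count, as a percentage."""
    return 100.0 * (1.0 - model_params / reference_params)


def fp32_size_mb(params: float) -> float:
    """Approximate weight size in MB, assuming 4 bytes per parameter (float32)."""
    return params * 4 / 1e6


vs_bert = percent_lighter(MODEL_PARAMS, BERT_BASE_PARAMS)
vs_distilbert = percent_lighter(MODEL_PARAMS, DISTILBERT_PARAMS)
size_mb = fp32_size_mb(MODEL_PARAMS)

print(f"vs BERT-base:  {vs_bert:.1f}% fewer parameters")
print(f"vs DistilBERT: {vs_distilbert:.1f}% fewer parameters")
print(f"float32 size:  {size_mb:.0f} MB")
```

Under these assumptions the reductions come out just under 85% and 75%, and the float32 estimate lands within about 1 MB of the stated 67MB, which is consistent with the parameter count being rounded to 17M.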