Update README.md
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@ widget:
 
 <h3>Introduction</h3>
 
-This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>
+This model is a <b>lightweight</b> and uncased version of <b>MiniLM</b> <b>[1]</b> for the <b>italian</b> language. Its <b>17M parameters</b> and <b>67MB</b> size make it
 <b>85% lighter</b> than a typical mono-lingual BERT model. It is ideal when memory consumption and execution speed are critical while maintaining high-quality results.
 
 
@@ -47,7 +47,7 @@ To compensate for the deletion of cased tokens, which now forces the model to ex
 the model has been further pre-trained on the italian split of the [Wikipedia](https://huggingface.co/datasets/wikipedia) dataset, using the <b>whole word masking [3]</b> technique to make it more robust
 to the new uncased representations.
 
-The resulting model has
+The resulting model has 17M parameters, a vocabulary of 14.610 tokens, and a size of 67MB, which makes it <b>85% lighter</b> than a typical mono-lingual BERT model and
 75% lighter than a standard mono-lingual DistilBERT model.
 
 
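For context, the <b>whole word masking</b> further pre-training mentioned in the updated text is available off the shelf in the 🤗 transformers library via `DataCollatorForWholeWordMask`. The sketch below is illustrative only: the model ID is a placeholder (the actual repository name is not shown in this diff), and the masking probability is the conventional BERT default, not a value taken from this README.

```python
# Illustrative sketch of a whole-word-masking pre-training setup.
# NOTE: "your-org/minilm-uncased-italian" is a placeholder model ID and
# mlm_probability=0.15 is the conventional BERT default (an assumption),
# not values taken from this README.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForWholeWordMask,
)

MODEL_ID = "your-org/minilm-uncased-italian"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

# With whole word masking, whenever one WordPiece of a word is selected
# for masking, every sub-word piece of that word is masked with it, so
# the model must predict the full word from context rather than recover
# it from its own unmasked fragments.
collator = DataCollatorForWholeWordMask(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
)

# The tokenizer of an uncased model lowercases its input, so cased and
# uncased spellings map to the same token ids.
example = tokenizer("La capitale d'Italia è Roma.")
batch = collator([example])
print(batch["input_ids"].shape, batch["labels"].shape)
```

The resulting `batch` can be fed directly to a `Trainer` for the masked-language-modeling pre-training step the README describes; only the collator choice distinguishes whole word masking from standard token-level masking.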