cservan/malbert-base-cased-32k

@@ -41,6 +41,7 @@ This model has the following configuration:
 - 768 hidden dimension
 - 12 attention heads
 - 11M parameters
 ## Intended uses & limitations
@@ -54,46 +55,6 @@ generation you should look at model like GPT2.
 ### How to use
-You can use this model directly with a pipeline for masked language modeling:
-```python
->>> from transformers import pipeline
->>> unmasker = pipeline('fill-mask', model='cservan/malbert-base-cased-32k')
->>> unmasker("Hello I'm a [MASK] model.")
-[
-  {
-    "sequence": "paris est la capitale de la france.",
-    "score": 0.6231236457824707,
-    "token": 3043,
-    "token_str": "france"
-  },
-  {
-    "sequence": "paris est la capitale de la region.",
-    "score": 0.2993471622467041,
-    "token": 10531,
-    "token_str": "region"
-  },
-  {
-    "sequence": "paris est la capitale de la societe.",
-    "score": 0.02028230018913746,
-    "token": 24622,
-    "token_str": "societe"
-  },
-  {
-    "sequence": "paris est la capitale de la bretagne.",
-    "score": 0.012089950032532215,
-    "token": 24987,
-    "token_str": "bretagne"
-  },
-  {
-    "sequence": "paris est la capitale de la chine.",
-    "score": 0.010002839379012585,
-    "token": 14860,
-    "token_str": "chine"
-  }
-]
-```
 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
@@ -149,25 +110,35 @@ When fine-tuned on downstream tasks, the ALBERT models achieve the following res
 Slot-filling:
-|                | mALBERT-base | mALBERT-base-cased
-|----------------|---------------|--------------------
-| MEDIA          | 81.76 (0.59)  | 85.09 (0.14)
-|
 ### BibTeX entry and citation info
 ```bibtex
-@inproceedings{cattan2021fralbert,
-  author    = {Oralie Cattan and
-               Christophe Servan and
                Sophie Rosset},
-  booktitle = {Recent Advances in Natural Language Processing, RANLP 2021},
-  title     = {{On the Usability of Transformers-based models for a French Question-Answering task}},
-  year      = {2021},
-  address   = {Online},
-  month     = sep,
 }
 ```
-Link to the paper: [PDF](https://hal.archives-ouvertes.fr/hal-03336060)

 - 768 hidden dimension
 - 12 attention heads
 - 11M parameters
+- 32k vocabulary size
 ## Intended uses & limitations
 ### How to use
 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
 Slot-filling:
+|Models ⧹ Tasks |  MMNLU |  MultiATIS++ |  CoNLL2003 |  MultiCoNER |  SNIPS |  MEDIA |
+|---------------|--------------|--------------|--------------|--------------|--------------|--------------|
+|EnALBERT |  N/A |  N/A |  89.67 (0.34) |  42.36 (0.22) |  95.95 (0.13) |  N/A |
+|FrALBERT |  N/A |  N/A |  N/A |  N/A |  N/A |  81.76 (0.59) |
+|mALBERT-128k |  65.81 (0.11) |  89.14 (0.15) |  88.27 (0.24) |  46.01 (0.18) |  91.60 (0.31) |  83.15 (0.38) |
+|mALBERT-64k  |  65.29 (0.14) |  88.88 (0.14) |  86.44 (0.37) |  44.70 (0.27) |  90.84 (0.47) |  82.30 (0.19) |
+|mALBERT-32k  |  64.83 (0.22) |  88.60 (0.27) |  84.96 (0.41) |  44.13 (0.39) |  89.89 (0.68) |  82.04 (0.28) |
+Classification task:
+|Models ⧹ Tasks | MMNLU | MultiATIS++ | SNIPS | SST2 |
+|---------------|--------------|--------------|--------------|--------------|
+|mALBERT-128k | 72.35 (0.09) | 90.58 (0.98) | 96.84 (0.49) | 34.66 (1.46) |
+|mALBERT-64k  | 71.26 (0.11) | 90.97 (0.70) | 96.53 (0.44) | 34.64 (1.02) |
+|mALBERT-32k  | 70.76 (0.11) | 90.55 (0.98) | 96.49 (0.45) | 34.18 (1.64) |
 ### BibTeX entry and citation info
 ```bibtex
+@inproceedings{servan2024mALBERT,
+  author    = {Christophe Servan and
+               Sahar Ghannay and
                Sophie Rosset},
+  booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+  title     = {{mALBERT: Is a Compact Multilingual BERT Model Still Worth It?}},
+  year      = {2024},
+  address   = {Torino, Italy},
+  month     = may,
 }
 ```
+Link to the paper: [PDF](https://hal.science/hal-04520797)
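
As a reading aid for the slot-filling table above, the following minimal sketch computes how many points mALBERT-32k trails mALBERT-128k on each task. The scores are copied from the table; the value in parentheses is left out, and the names `TASKS`, `SCORES`, and `score_gap` are illustrative assumptions, not part of the model card or of any library.

```python
# Slot-filling scores copied from the table above.
# The value in parentheses next to each score is omitted here.
TASKS = ["MMNLU", "MultiATIS++", "CoNLL2003", "MultiCoNER", "SNIPS", "MEDIA"]
SCORES = {
    "mALBERT-128k": [65.81, 89.14, 88.27, 46.01, 91.60, 83.15],
    "mALBERT-64k": [65.29, 88.88, 86.44, 44.70, 90.84, 82.30],
    "mALBERT-32k": [64.83, 88.60, 84.96, 44.13, 89.89, 82.04],
}


def score_gap(large, small):
    """Return the per-task difference (large minus small), rounded to 2 decimals."""
    return {
        task: round(a - b, 2)
        for task, a, b in zip(TASKS, SCORES[large], SCORES[small])
    }


# Hypothetical helper usage: compare the largest and smallest vocabularies.
gap = score_gap("mALBERT-128k", "mALBERT-32k")
for task, delta in gap.items():
    print(f"{task:12s} {delta:+.2f}")
```

On these numbers the gap is largest on CoNLL2003 (3.31 points) and smallest on MultiATIS++ (0.54 points). The same helper can be called with `"mALBERT-64k"` as either argument to compare the intermediate vocabulary size.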