cservan committed on
Commit 3c0fe0d · 1 Parent(s): 05a2574

Add README

Files changed (1)
  1. README.md +24 -53
README.md CHANGED
@@ -41,6 +41,7 @@ This model has the following configuration:
  - 768 hidden dimension
  - 12 attention heads
  - 11M parameters
+ - 32k vocabulary size
 
  ## Intended uses & limitations
 
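The configuration values added in this hunk can be checked directly against the hosted checkpoint. A minimal sketch, assuming the `transformers` `AutoConfig` API and the repo id `cservan/malbert-base-cased-32k` used elsewhere in this README:

```python
from transformers import AutoConfig

# Fetch only the configuration (no weights) for the checkpoint this README describes.
config = AutoConfig.from_pretrained("cservan/malbert-base-cased-32k")

print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
print(config.vocab_size)           # expected: 32000, the "32k" added in this commit
```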
@@ -54,46 +55,6 @@ generation you should look at model like GPT2.
 
  ### How to use
 
- You can use this model directly with a pipeline for masked language modeling:
- 
- ```python
- >>> from transformers import pipeline
- >>> unmasker = pipeline('fill-mask', model='cservan/malbert-base-cased-32k')
- >>> unmasker("Paris est la capitale de la [MASK].")
- [
-   {
-     "sequence": "paris est la capitale de la france.",
-     "score": 0.6231236457824707,
-     "token": 3043,
-     "token_str": "france"
-   },
-   {
-     "sequence": "paris est la capitale de la region.",
-     "score": 0.2993471622467041,
-     "token": 10531,
-     "token_str": "region"
-   },
-   {
-     "sequence": "paris est la capitale de la societe.",
-     "score": 0.02028230018913746,
-     "token": 24622,
-     "token_str": "societe"
-   },
-   {
-     "sequence": "paris est la capitale de la bretagne.",
-     "score": 0.012089950032532215,
-     "token": 24987,
-     "token_str": "bretagne"
-   },
-   {
-     "sequence": "paris est la capitale de la chine.",
-     "score": 0.010002839379012585,
-     "token": 14860,
-     "token_str": "chine"
-   }
- ]
- ```
- 
  Here is how to use this model to get the features of a given text in PyTorch:
 
  ```python
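The hunk context stops at the opening fence, so the PyTorch feature-extraction snippet itself is not visible in this diff. For orientation, a minimal sketch of what such usage typically looks like, assuming the standard `transformers` `AutoTokenizer`/`AutoModel` API rather than the README's exact code:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the tokenizer and the pretrained encoder.
tokenizer = AutoTokenizer.from_pretrained("cservan/malbert-base-cased-32k")
model = AutoModel.from_pretrained("cservan/malbert-base-cased-32k")

# Encode a sentence and run it through the encoder.
text = "Paris est la capitale de la France."
encoded_input = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output = model(**encoded_input)

# One contextual vector per input token: (batch, sequence_length, 768).
print(output.last_hidden_state.shape)
```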
@@ -149,25 +110,35 @@ When fine-tuned on downstream tasks, the ALBERT models achieve the following results:
 
  Slot-filling:
 
- |       | mALBERT-base | mALBERT-base-cased |
- |-------|--------------|--------------------|
- | MEDIA | 81.76 (0.59) | 85.09 (0.14)       |
- |
+ |Models ⧹ Tasks | MMNLU        | MultiATIS++  | CoNLL2003    | MultiCoNER   | SNIPS        | MEDIA        |
+ |---------------|--------------|--------------|--------------|--------------|--------------|--------------|
+ |EnALBERT       | N/A          | N/A          | 89.67 (0.34) | 42.36 (0.22) | 95.95 (0.13) | N/A          |
+ |FrALBERT       | N/A          | N/A          | N/A          | N/A          | N/A          | 81.76 (0.59) |
+ |mALBERT-128k   | 65.81 (0.11) | 89.14 (0.15) | 88.27 (0.24) | 46.01 (0.18) | 91.60 (0.31) | 83.15 (0.38) |
+ |mALBERT-64k    | 65.29 (0.14) | 88.88 (0.14) | 86.44 (0.37) | 44.70 (0.27) | 90.84 (0.47) | 82.30 (0.19) |
+ |mALBERT-32k    | 64.83 (0.22) | 88.60 (0.27) | 84.96 (0.41) | 44.13 (0.39) | 89.89 (0.68) | 82.04 (0.28) |
+ 
+ Classification task:
 
+ |Models ⧹ Tasks | MMNLU        | MultiATIS++  | SNIPS        | SST2         |
+ |---------------|--------------|--------------|--------------|--------------|
+ |mALBERT-128k   | 72.35 (0.09) | 90.58 (0.98) | 96.84 (0.49) | 34.66 (1.46) |
+ |mALBERT-64k    | 71.26 (0.11) | 90.97 (0.70) | 96.53 (0.44) | 34.64 (1.02) |
+ |mALBERT-32k    | 70.76 (0.11) | 90.55 (0.98) | 96.49 (0.45) | 34.18 (1.64) |
 
  ### BibTeX entry and citation info
 
  ```bibtex
- @inproceedings{cattan2021fralbert,
-   author    = {Oralie Cattan and
-                Christophe Servan and
+ @inproceedings{servan2024mALBERT,
+   author    = {Christophe Servan and
+                Sahar Ghannay and
                 Sophie Rosset},
-   booktitle = {Recent Advances in Natural Language Processing, RANLP 2021},
-   title     = {{On the Usability of Transformers-based models for a French Question-Answering task}},
-   year      = {2021},
-   address   = {Online},
-   month     = sep,
+   booktitle = {The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+   title     = {{mALBERT: Is a Compact Multilingual BERT Model Still Worth It?}},
+   year      = {2024},
+   address   = {Torino, Italy},
+   month     = may,
   }
  ```
 
- Link to the paper: [PDF](https://hal.archives-ouvertes.fr/hal-03336060)
+ Link to the paper: [PDF](https://hal.science/hal-04520797)
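The slot-filling numbers above come from fine-tuning each checkpoint as a token classifier. A minimal sketch of attaching such a head, assuming the `transformers` `AutoModelForTokenClassification` API; the label count is a placeholder, not the paper's actual setup:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Placeholder label inventory; each benchmark (MEDIA, MultiATIS++, ...) defines its own slot set.
num_labels = 9

tokenizer = AutoTokenizer.from_pretrained("cservan/malbert-base-cased-32k")
# Adds a randomly initialised token-classification head on top of the pretrained encoder;
# the head is then trained on the downstream slot-filling data.
model = AutoModelForTokenClassification.from_pretrained(
    "cservan/malbert-base-cased-32k",
    num_labels=num_labels,
)
```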
 