|
--- |
|
tags: |
|
- mms |
|
language: |
|
- ab |
|
- af |
|
- ak |
|
- am |
|
- ar |
|
- as |
|
- av |
|
- ay |
|
- az |
|
- ba |
|
- bm |
|
- be |
|
- bn |
|
- bi |
|
- bo |
|
- sh |
|
- br |
|
- bg |
|
- ca |
|
- cs |
|
- ce |
|
- cv |
|
- ku |
|
- cy |
|
- da |
|
- de |
|
- dv |
|
- dz |
|
- el |
|
- en |
|
- eo |
|
- et |
|
- eu |
|
- ee |
|
- fo |
|
- fa |
|
- fj |
|
- fi |
|
- fr |
|
- fy |
|
- ff |
|
- ga |
|
- gl |
|
- gn |
|
- gu |
|
- zh |
|
- ht |
|
- ha |
|
- he |
|
- hi |
|
- sh |
|
- hu |
|
- hy |
|
- ig |
|
- ia |
|
- ms |
|
- is |
|
- it |
|
- jv |
|
- ja |
|
- kn |
|
- ka |
|
- kk |
|
- kr |
|
- km |
|
- ki |
|
- rw |
|
- ky |
|
- ko |
|
- kv |
|
- lo |
|
- la |
|
- lv |
|
- ln |
|
- lt |
|
- lb |
|
- lg |
|
- mh |
|
- ml |
|
- mr |
|
- ms |
|
- mk |
|
- mg |
|
- mt |
|
- mn |
|
- mi |
|
- my |
|
- zh |
|
- nl |
|
- 'no' |
|
- 'no' |
|
- ne |
|
- ny |
|
- oc |
|
- om |
|
- or |
|
- os |
|
- pa |
|
- pl |
|
- pt |
|
- ms |
|
- ps |
|
- qu |
|
- ro |
|
- rn |
|
- ru |
|
- sg |
|
- sk |
|
- sl |
|
- sm |
|
- sn |
|
- sd |
|
- so |
|
- es |
|
- sq |
|
- su |
|
- sv |
|
- sw |
|
- ta |
|
- tt |
|
- te |
|
- tg |
|
- tl |
|
- th |
|
- ti |
|
- ts |
|
- tr |
|
- uk |
|
- ms |
|
- vi |
|
- wo |
|
- xh |
|
- ms |
|
- yo |
|
- ms |
|
- zu |
|
- za |
|
license: cc-by-nc-4.0 |
|
datasets: |
|
- google/fleurs |
|
metrics: |
|
- acc |
|
--- |
|
|
|
# Massively Multilingual Speech (MMS) - Finetuned LID |
|
|
|
This checkpoint is a model fine-tuned for speech language identification (LID) and is part of Facebook's [Massively Multilingual Speech project](https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/).

This checkpoint is based on the [Wav2Vec2 architecture](https://huggingface.co/docs/transformers/model_doc/wav2vec2) and classifies raw audio input into a probability distribution over 2048 output classes, each representing a language.

The checkpoint consists of **1 billion parameters** and has been fine-tuned from [facebook/mms-1b](https://huggingface.co/facebook/mms-1b) on 2048 languages.
|
|
|
## Table Of Contents
|
|
|
- [Example](#example) |
|
- [Supported Languages](#supported-languages) |
|
- [Model details](#model-details) |
|
- [Additional links](#additional-links) |
|
|
|
## Example |
|
|
|
This MMS checkpoint can be used with [Transformers](https://github.com/huggingface/transformers) to identify the spoken language of an audio sample. It can recognize the [following 2048 languages](#supported-languages).
|
|
|
Let's look at a simple example. |
|
|
|
First, we install transformers and some other libraries:

```
pip install torch accelerate torchaudio datasets
pip install --upgrade transformers
```
|
|
|
**Note**: To use MMS you need at least `transformers >= 4.30` installed. If version `4.30` is not yet available [on PyPI](https://pypi.org/project/transformers/), make sure to install `transformers` from source:

```
pip install git+https://github.com/huggingface/transformers.git
```
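The version requirement above can also be checked programmatically before loading the model. The following is a minimal sketch using only the Python standard library; the helper names `parse_release` and `meets_minimum` are hypothetical and are not part of Transformers.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric part of a version string as a tuple of ints.

    For example, '4.31.0.dev0' yields (4, 31, 0); suffixes such as '.dev0' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: tuple = (4, 30)) -> bool:
    """Check whether an installed version is at least the required minimum."""
    return parse_release(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if meets_minimum(installed) else "too old for MMS"
        print(f"transformers {installed}: {status}")
```

Development builds installed from source (for example `4.30.0.dev0`) pass this check because only the numeric release segment is compared.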
|
|
|
Next, we load a couple of audio samples via `datasets`. Make sure that the audio data is sampled at 16,000 Hz (16 kHz).

```py
from datasets import load_dataset, Audio

# English
stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="test", streaming=True)
stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
en_sample = next(iter(stream_data))["audio"]["array"]

# Arabic
stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "ar", split="test", streaming=True)
stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
ar_sample = next(iter(stream_data))["audio"]["array"]
```
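Because the model expects 16 kHz input, it can help to verify the sampling rate of a decoded sample before running inference. The sketch below assumes the decoded audio is a dictionary with `array` and `sampling_rate` keys, as returned by the `audio` column of a `datasets` example; the helper name `check_audio` is hypothetical.

```python
TARGET_SAMPLING_RATE = 16_000  # Hz, the rate expected by this checkpoint


def check_audio(audio: dict, expected_rate: int = TARGET_SAMPLING_RATE) -> float:
    """Validate the sampling rate of a decoded audio dict and return its duration in seconds."""
    rate = audio["sampling_rate"]
    if rate != expected_rate:
        raise ValueError(
            f"expected audio at {expected_rate} Hz but got {rate} Hz; resample it first"
        )
    return len(audio["array"]) / rate


# Toy example with two seconds of silence instead of a real recording.
toy_audio = {"array": [0.0] * 32_000, "sampling_rate": 16_000}
duration = check_audio(toy_audio)
```

With a real dataset, the same function could be applied to `next(iter(stream_data))["audio"]` before extracting the `array` field.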
|
|
|
Next, we load the model and processor:

```py
from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

model_id = "facebook/mms-lid-2048"

processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)
```
|
|
|
Now we process the audio data and pass the processed audio data to the model to classify it into a language, just as we usually do for Wav2Vec2 audio classification models such as [ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition](https://huggingface.co/ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition).

```py
# English
inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs).logits

lang_id = torch.argmax(outputs, dim=-1)[0].item()
detected_lang = model.config.id2label[lang_id]
# 'eng'

# Arabic
inputs = processor(ar_sample, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs).logits

lang_id = torch.argmax(outputs, dim=-1)[0].item()
detected_lang = model.config.id2label[lang_id]
# 'ara'
```
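Taking the `argmax` returns only the single most likely language. To inspect how confident the model is, the logits can be turned into probabilities with a softmax and ranked. The following is a minimal sketch in plain Python, assuming the logits for one sample have been converted to a list (for example via `outputs[0].tolist()`) and that `id2label` maps integer class ids to language codes like `model.config.id2label`. The helper names and the toy values are illustrative only.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1 (numerically stable)."""
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_languages(logits, id2label, k=3):
    """Return the k most probable (language code, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[index], probs[index]) for index in ranked[:k]]


# Toy logits for three classes; a real model output has 2048 entries.
toy_logits = [2.0, 0.5, -1.0]
toy_id2label = {0: "eng", 1: "ara", 2: "fra"}

for code, prob in top_k_languages(toy_logits, toy_id2label, k=2):
    print(f"{code}: {prob:.3f}")
```

When working directly with tensors, `torch.softmax(outputs, dim=-1)` followed by `torch.topk` would be the more direct route; the plain-Python version is used here only to keep the sketch dependency-free.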
|
|
|
To see all the supported languages of a checkpoint, you can print out the language ids as follows:

```py
model.config.id2label.values()
```
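The same mapping can be used to check whether a particular language is covered before running inference. Below is a minimal sketch, assuming `id2label` is a dictionary from integer class ids to ISO 639-3 codes such as `model.config.id2label`; the helper names and the toy mapping are hypothetical.

```python
def supported_codes(id2label: dict) -> set:
    """Collect all language codes the checkpoint can predict."""
    return set(id2label.values())


def find_label_id(code: str, id2label: dict):
    """Return the class id for a language code, or None if it is not supported."""
    label2id = {label: index for index, label in id2label.items()}
    return label2id.get(code)


# Toy mapping standing in for the full 2048-entry mapping of the checkpoint.
toy_id2label = {0: "ara", 1: "cmn", 2: "eng"}

print("eng" in supported_codes(toy_id2label))
print(find_label_id("cmn", toy_id2label))
print(find_label_id("xyz", toy_id2label))
```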
|
|
|
For more details about the architecture, please have a look at [the official docs](https://huggingface.co/docs/transformers/main/en/model_doc/mms).
|
|
|
## Supported Languages |
|
|
|
This model supports 2048 languages. Click the following to toggle all supported languages of this checkpoint in [ISO 639-3 code](https://en.wikipedia.org/wiki/ISO_639-3).

You can find more details about the languages and their ISO 639-3 codes in the [MMS Language Coverage Overview](https://dl.fbaipublicfiles.com/mms/misc/language_coverage_mms.html).
|
<details> |
|
<summary>Click to toggle</summary> |
|
|
|
- ara |
|
- cmn |
|
- eng |
|
- spa |
|
- fra |
|
- mlg |
|
- swe |
|
- ful |
|
- por |
|
- vie |
|
- sun |
|
- zlm |
|
- ben |
|
- kor |
|
- tuk |
|
- hin |
|
- asm |
|
- ind |
|
- urd |
|
- swh |
|
- aze |
|
- hau |
|
- som |
|
- mon |
|
- tel |
|
- bod |
|
- rus |
|
- tat |
|
- tgl |
|
- slv |
|
- tur |
|
- mar |
|
- heb |
|
- tha |
|
- ron |
|
- yor |
|
- bel |
|
- mal |
|
- cat |
|
- amh |
|
- bul |
|
- hat |
|
- mkd |
|
- pol |
|
- nld |
|
- hun |
|
- tam |
|
- hrv |
|
- fas |
|
- afr |
|
- nya |
|
- cym |
|
- isl |
|
- orm |
|
- kmr |
|
- lin |
|
- jav |
|
- snd |
|
- nob |
|
- uzb |
|
- bos |
|
- deu |
|
- lit |
|
- mya |
|
- lat |
|
- grn |
|
- kaz |
|
- npi |
|
- kik |
|
- ell |
|
- sqi |
|
- yue |
|
- cak |
|
- hye |
|
- kat |
|
- kan |
|
- jpn |
|
- pan |
|
- lav |
|
- guj |
|
- ces |
|
- tgk |
|
- khm |
|
- bak |
|
- ukr |
|
- che |
|
- fao |
|
- mam |
|
- xog |
|
- glg |
|
- ltz |
|
- quc |
|
- aka |
|
- lao |
|
- crh |
|
- sna |
|
- mlt |
|
- poh |
|
- sin |
|
- cfm |
|
- ixl |
|
- aiw |
|
- mri |
|
- tuv |
|
- gag |
|
- pus |
|
- ita |
|
- srp |
|
- lug |
|
- eus |
|
- nno |
|
- nhx |
|
- gur |
|
- ory |
|
- luo |
|
- sxn |
|
- xsm |
|
- cmo |
|
- kbp |
|
- slk |
|
- ewe |
|
- dtp |
|
- fin |
|
- acr |
|
- quy |
|
- saq |
|
- quh |
|
- rif |
|
- bre |
|
- bqc |
|
- tzj |
|
- mos |
|
- bwq |
|
- yao |
|
- cac |
|
- xon |
|
- new |
|
- yid |
|
- hne |
|
- dan |
|
- hus |
|
- dyu |
|
- uig |
|
- pse |
|
- bam |
|
- bus |
|
- ttq |
|
- ngl |
|
- est |
|
- tso |
|
- gng |
|
- seh |
|
- wlx |
|
- sck |
|
- rjs |
|
- ntm |
|
- lok |
|
- tcc |
|
- mup |
|
- dga |
|
- lis |
|
- kru |
|
- cnh |
|
- bxk |
|
- mnk |
|
- amf |
|
- guh |
|
- rmc |
|
- rel |
|
- zne |
|
- teo |
|
- mzi |
|
- tpi |
|
- ycl |
|
- xsr |
|
- ddn |
|
- thl |
|
- wal |
|
- ctg |
|
- onb |
|
- gbo |
|
- vmw |
|
- beh |
|
- mip |
|
- lnd |
|
- khg |
|
- bfz |
|
- ifa |
|
- rol |
|
- nzi |
|
- ceb |
|
- kml |
|
- sxb |
|
- nym |
|
- acn |
|
- bfo |
|
- mhy |
|
- adx |
|
- mqj |
|
- bbc |
|
- pmf |
|
- dsh |
|
- bfy |
|
- sid |
|
- bno |
|
- bfa |
|
- pxm |
|
- sda |
|
- oku |
|
- mbu |
|
- qxl |
|
- ndv |
|
- nmz |
|
- tzh |
|
- box |
|
- iri |
|
- nxq |
|
- ayr |
|
- bgq |
|
- bbo |
|
- gof |
|
- bmq |
|
- kdt |
|
- cla |
|
- asa |
|
- lew |
|
- war |
|
- kfx |
|
- zpu |
|
- xal |
|
- fon |
|
- maj |
|
- mag |
|
- kle |
|
- hlb |
|
- any |
|
- poe |
|
- pil |
|
- rej |
|
- lbw |
|
- bdu |
|
- dgi |
|
- mgo |
|
- mkl |
|
- mco |
|
- maa |
|
- btd |
|
- kcg |
|
- tng |
|
- pls |
|
- kdl |
|
- tzo |
|
- pap |
|
- lns |
|
- kyb |
|
- ksb |
|
- akp |
|
- zar |
|
- gil |
|
- blt |
|
- ctd |
|
- mhx |
|
- gud |
|
- hnn |
|
- kek |
|
- mxt |
|
- frd |
|
- krc |
|
- suz |
|
- ava |
|
- mcp |
|
- hyw |
|
- hlt |
|
- dnw |
|
- udm |
|
- xed |
|
- kpv |
|
- bkd |
|
- xnj |
|
- atb |
|
- cwe |
|
- nog |
|
- kij |
|
- mqn |
|
- btx |
|
- ife |
|
- bgw |
|
- trs |
|
- kjh |
|
- chv |
|
- ati |
|
- ybb |
|
- did |
|
- gau |
|
- dnj |
|
- kbo |
|
- cle |
|
- crs |
|
- nhy |
|
- yba |
|
- zpz |
|
- yka |
|
- dgk |
|
- mgd |
|
- lon |
|
- cab |
|
- muy |
|
- taq |
|
- tlj |
|
- sne |
|
- smo |
|
- nsu |
|
- nin |
|
- cnl |
|
- btt |
|
- tly |
|
- mge |
|
- prk |
|
- ium |
|
- zpt |
|
- eka |
|
- mfk |
|
- akb |
|
- mxb |
|
- cso |
|
- kak |
|
- yre |
|
- obo |
|
- tgj |
|
- abi |
|
- yas |
|
- men |
|
- nga |
|
- blh |
|
- kdc |
|
- cmr |
|
- bom |
|
- zpg |
|
- yea |
|
- ubl |
|
- hwc |
|
- xtm |
|
- mhr |
|
- avn |
|
- log |
|
- xsb |
|
- kri |
|
- idd |
|
- mnw |
|
- plw |
|
- nuj |
|
- ted |
|
- sbp |
|
- knb |
|
- kwf |
|
- rkt |
|
- mib |
|
- miy |
|
- lsi |
|
- zaj |
|
- mih |
|
- myv |
|
- luc |
|
- tob |
|
- mpm |
|
- kne |
|
- asg |
|
- pps |
|
- flr |
|
- trn |
|
- xmm |
|
- poi |
|
- qxr |
|
- zmz |
|
- kqe |
|
- sjm |
|
- kmd |
|
- mim |
|
- knj |
|
- gqr |
|
- suc |
|
- med |
|
- tbl |
|
- mto |
|
- kzf |
|
- bdh |
|
- zpc |
|
- hoc |
|
- krs |
|
- snp |
|
- wsg |
|
- zaq |
|
- gwr |
|
- yaz |
|
- cgc |
|
- azg |
|
- sil |
|
- mil |
|
- kir |
|
- dav |
|
- xtd |
|
- pis |
|
- qvh |
|
- mai |
|
- prt |
|
- tlb |
|
- kin |
|
- ami |
|
- cok |
|
- san |
|
- lif |
|
- atq |
|
- iba |
|
- knk |
|
- rub |
|
- zga |
|
- jun |
|
- yal |
|
- run |
|
- tye |
|
- ngu |
|
- nij |
|
- pkb |
|
- gux |
|
- dig |
|
- gog |
|
- gbm |
|
- nhe |
|
- hnj |
|
- ubu |
|
- nyy |
|
- tir |
|
- kdj |
|
- awa |
|
- bcc |
|
- sus |
|
- nan |
|
- kno |
|
- nyn |
|
- nyf |
|
- dnt |
|
- grt |
|
- mdy |
|
- hak |
|
- ses |
|
- suk |
|
- bem |
|
- keo |
|
- guk |
|
- lam |
|
- kue |
|
- khq |
|
- kus |
|
- lsm |
|
- bwu |
|
- dug |
|
- sbd |
|
- kdh |
|
- sah |
|
- mur |
|
- shn |
|
- spy |
|
- cko |
|
- aha |
|
- mfz |
|
- rmy |
|
- nim |
|
- gjn |
|
- kde |
|
- bsq |
|
- spp |
|
- kqn |
|
- zyb |
|
- oci |
|
- nnw |
|
- cly |
|
- rim |
|
- oss |
|
- bru |
|
- dag |
|
- ade |
|
- gum |
|
- law |
|
- tem |
|
- kaa |
|
- raw |
|
- kff |
|
- lhu |
|
- taj |
|
- dyo |
|
- hui |
|
- kbr |
|
- mpg |
|
- guc |
|
- niy |
|
- nus |
|
- mzj |
|
- tbz |
|
- bib |
|
- quz |
|
- mev |
|
- ptu |
|
- lef |
|
- mfi |
|
- bky |
|
- mdm |
|
- mgh |
|
- bim |
|
- mnb |
|
- fij |
|
- maw |
|
- dip |
|
- qul |
|
- bgc |
|
- mxv |
|
- thf |
|
- bud |
|
- dzo |
|
- lom |
|
- ztq |
|
- mfq |
|
- ach |
|
- las |
|
- nia |
|
- tbt |
|
- dgo |
|
- zab |
|
- dik |
|
- pbb |
|
- kac |
|
- dop |
|
- pcm |
|
- shk |
|
- xnr |
|
- zpo |
|
- ktb |
|
- bba |
|
- sba |
|
- myb |
|
- quw |
|
- emp |
|
- ctu |
|
- gbk |
|
- guw |
|
- nst |
|
- cnt |
|
- ilo |
|
- cme |
|
- srx |
|
- qvm |
|
- mhi |
|
- mzw |
|
- zao |
|
- set |
|
- csk |
|
- wol |
|
- nnb |
|
- zas |
|
- zaw |
|
- mgq |
|
- yam |
|
- sig |
|
- kam |
|
- biv |
|
- laj |
|
- otq |
|
- pce |
|
- mwv |
|
- mak |
|
- kfb |
|
- alz |
|
- dwr |
|
- hif |
|
- kao |
|
- mor |
|
- lme |
|
- nav |
|
- lob |
|
- cax |
|
- cdj |
|
- knf |
|
- mad |
|
- kfy |
|
- alt |
|
- tgw |
|
- wwa |
|
- ljp |
|
- myk |
|
- sag |
|
- kbq |
|
- jiv |
|
- mxq |
|
- ahk |
|
- kab |
|
- mie |
|
- car |
|
- nfr |
|
- mfe |
|
- cni |
|
- led |
|
- mbb |
|
- twu |
|
- nag |
|
- cya |
|
- kum |
|
- tsz |
|
- cco |
|
- mnf |
|
- nhu |
|
- mzm |
|
- trq |
|
- ken |
|
- ker |
|
- bpr |
|
- cou |
|
- kyq |
|
- xpe |
|
- zpl |
|
- enb |
|
- zad |
|
- bcl |
|
- bex |
|
- sas |
|
- ruf |
|
- srn |
|
- gor |
|
- tik |
|
- xtn |
|
- gmv |
|
- kez |
|
- kss |
|
- old |
|
- nod |
|
- kxm |
|
- lia |
|
- izr |
|
- ozm |
|
- bfd |
|
- acf |
|
- thk |
|
- mah |
|
- sgw |
|
- daa |
|
- ifb |
|
- jmc |
|
- nyo |
|
- myx |
|
- zai |
|
- nhw |
|
- ncu |
|
- nhi |
|
- adj |
|
- wba |
|
- lgg |
|
- irk |
|
- tca |
|
- mjl |
|
- ote |
|
- kpz |
|
- bdq |
|
- jam |
|
- agr |
|
- zpi |
|
- sml |
|
- mvp |
|
- kxc |
|
- bsc |
|
- hay |
|
- dyi |
|
- ilb |
|
- itv |
|
- hil |
|
- bkv |
|
- poy |
|
- cuk |
|
- miz |
|
- kdi |
|
- zpm |
|
- adh |
|
- npl |
|
- mrw |
|
- lee |
|
- bss |
|
- pam |
|
- aaz |
|
- kqy |
|
- key |
|
- cpa |
|
- kkj |
|
- tap |
|
- sbl |
|
- qvw |
|
- yua |
|
- ziw |
|
- xrb |
|
- mcu |
|
- sur |
|
- heh |
|
- lwo |
|
- gej |
|
- ace |
|
- zos |
|
- agd |
|
- bci |
|
- cce |
|
- toc |
|
- mbt |
|
- shi |
|
- tll |
|
- kjb |
|
- toi |
|
- pbi |
|
- ann |
|
- krl |
|
- vmy |
|
- bst |
|
- gkn |
|
- nwb |
|
- pag |
|
- jbu |
|
- klu |
|
- gso |
|
- kyu |
|
- mio |
|
- ngp |
|
- zaa |
|
- eza |
|
- omi |
|
- izz |
|
- loq |
|
- pww |
|
- miq |
|
- min |
|
- cuc |
|
- bav |
|
- bzj |
|
- jac |
|
- gbi |
|
- pko |
|
- dts |
|
- gxx |
|
- haw |
|
- ood |
|
- qxh |
|
- bts |
|
- crn |
|
- krj |
|
- umb |
|
- sgj |
|
- zty |
|
- kki |
|
- qwh |
|
- kub |
|
- ndj |
|
- hns |
|
- chz |
|
- ksp |
|
- qvn |
|
- gde |
|
- mfy |
|
- bjv |
|
- rng |
|
- mif |
|
- wmw |
|
- ndp |
|
- mir |
|
- bps |
|
- jnj |
|
- ifu |
|
- iqw |
|
- djk |
|
- gvl |
|
- kdn |
|
- mzk |
|
- toh |
|
- qxn |
|
- nnq |
|
- rmo |
|
- ncj |
|
- nyu |
|
- mrj |
|
- wob |
|
- ifk |
|
- mog |
|
- hig |
|
- maz |
|
- ban |
|
- srm |
|
- mas |
|
- mda |
|
- nse |
|
- gym |
|
- hno |
|
- bgd |
|
- tac |
|
- bxg |
|
- qvs |
|
- nch |
|
- ibg |
|
- mey |
|
- zae |
|
- neb |
|
- ldi |
|
- qvz |
|
- zca |
|
- jvn |
|
- kwi |
|
- ndz |
|
- mza |
|
- qve |
|
- qvc |
|
- caa |
|
- wbi |
|
- alw |
|
- azz |
|
- tos |
|
- qxo |
|
- ibo |
|
- mkw |
|
- avu |
|
- otn |
|
- stb |
|
- kby |
|
- xho |
|
- bcq |
|
- pae |
|
- lnl |
|
- guz |
|
- ksw |
|
- syl |
|
- tyv |
|
- zul |
|
- lai |
|
- mww |
|
- loz |
|
- beq |
|
- mer |
|
- arn |
|
- bza |
|
- lun |
|
- lbj |
|
- bto |
|
- mnh |
|
- pov |
|
- nbw |
|
- ckb |
|
- epo |
|
- sfw |
|
- knc |
|
- tzm |
|
- top |
|
- lus |
|
- ige |
|
- tum |
|
- gvr |
|
- csh |
|
- xdy |
|
- bho |
|
- abk |
|
- ijc |
|
- nso |
|
- vai |
|
- neq |
|
- gkp |
|
- dje |
|
- bev |
|
- jen |
|
- lub |
|
- ndc |
|
- lrc |
|
- qug |
|
- bax |
|
- bum |
|
- srr |
|
- tiv |
|
- sea |
|
- maf |
|
- pci |
|
- xkl |
|
- rhg |
|
- bft |
|
- ngc |
|
- lua |
|
- kck |
|
- awn |
|
- lag |
|
- ada |
|
- soe |
|
- swk |
|
- mni |
|
- pdt |
|
- ebu |
|
- bwr |
|
- etu |
|
- krw |
|
- gaa |
|
- mkn |
|
- gle |
|
- mug |
|
- kqs |
|
- ida |
|
- kvj |
|
- trc |
|
- zza |
|
- nzb |
|
- mcn |
|
- lol |
|
- lic |
|
- zpq |
|
- skr |
|
- rml |
|
- ggu |
|
- hdy |
|
- ktu |
|
- mgw |
|
- lmp |
|
- mfa |
|
- ijn |
|
- mwm |
|
- vmk |
|
- mua |
|
- ngb |
|
- dur |
|
- nup |
|
- tsc |
|
- bkm |
|
- kpm |
|
- idu |
|
- ksf |
|
- kea |
|
- urh |
|
- mro |
|
- ego |
|
- gya |
|
- kfc |
|
- nnc |
|
- mrt |
|
- ndi |
|
- ogo |
|
- tui |
|
- bhi |
|
- bzw |
|
- elm |
|
- okr |
|
- its |
|
- adi |
|
- kng |
|
- mhw |
|
- mgr |
|
- ast |
|
- igb |
|
- kfi |
|
- dzg |
|
- mzl |
|
- ncl |
|
- kmb |
|
- sat |
|
- unr |
|
- bhb |
|
- glk |
|
- iso |
|
- sef |
|
- bin |
|
- sgc |
|
- coh |
|
- dua |
|
- giz |
|
- tod |
|
- dks |
|
- kaj |
|
- wlo |
|
- ady |
|
- emk |
|
- suj |
|
- lzz |
|
- snf |
|
- tvs |
|
- jra |
|
- zav |
|
- bbj |
|
- mhu |
|
- kel |
|
- njz |
|
- tuy |
|
- efi |
|
- lgm |
|
- lue |
|
- tke |
|
- igl |
|
- nde |
|
- tsn |
|
- gom |
|
- nyd |
|
- trp |
|
- kjl |
|
- haq |
|
- byv |
|
- ven |
|
- fan |
|
- ble |
|
- jmx |
|
- byd |
|
- toq |
|
- bvu |
|
- sdr |
|
- wes |
|
- her |
|
- swb |
|
- bcp |
|
- dde |
|
- haj |
|
- ktz |
|
- qxu |
|
- rmn |
|
- sou |
|
- sot |
|
- rag |
|
- glv |
|
- bjg |
|
- mve |
|
- kha |
|
- mjt |
|
- jmd |
|
- mwn |
|
- wof |
|
- oki |
|
- nnh |
|
- kjc |
|
- sep |
|
- gno |
|
- mix |
|
- trd |
|
- sco |
|
- evn |
|
- brv |
|
- kjg |
|
- tkr |
|
- mfv |
|
- div |
|
- rki |
|
- fmu |
|
- eyo |
|
- aoz |
|
- mhs |
|
- hvn |
|
- chf |
|
- mym |
|
- lbx |
|
- mjx |
|
- mtd |
|
- lrm |
|
- hni |
|
- pmy |
|
- lbm |
|
- akh |
|
- rgs |
|
- lwg |
|
- nuz |
|
- khw |
|
- the |
|
- pof |
|
- wci |
|
- tpe |
|
- bqi |
|
- bjn |
|
- ccp |
|
- cto |
|
- abt |
|
- nos |
|
- tog |
|
- llc |
|
- zac |
|
- tet |
|
- kuj |
|
- tab |
|
- tcz |
|
- zin |
|
- ajg |
|
- bkx |
|
- imo |
|
- iru |
|
- knx |
|
- knu |
|
- nyk |
|
- ymm |
|
- xmc |
|
- bgz |
|
- ina |
|
- mau |
|
- cnk |
|
- loe |
|
- ztg |
|
- esg |
|
- thq |
|
- snk |
|
- nza |
|
- srb |
|
- blo |
|
- otd |
|
- pht |
|
- blr |
|
- scg |
|
- zam |
|
- lla |
|
- xta |
|
- ssy |
|
- rah |
|
- pbo |
|
- ctp |
|
- kpo |
|
- pnb |
|
- mki |
|
- zpv |
|
- bha |
|
- maq |
|
- tth |
|
- eto |
|
- atd |
|
- bhw |
|
- gwn |
|
- phr |
|
- mxx |
|
- mui |
|
- sdq |
|
- xsq |
|
- tkt |
|
- tsj |
|
- uki |
|
- mgp |
|
- mvv |
|
- enq |
|
- bxr |
|
- qxp |
|
- tdt |
|
- olu |
|
- bji |
|
- ton |
|
- knl |
|
- pdu |
|
- pwo |
|
- kei |
|
- zgb |
|
- bug |
|
- sie |
|
- gah |
|
- jml |
|
- kmw |
|
- mrr |
|
- oyb |
|
- ria |
|
- shr |
|
- vah |
|
- djo |
|
- krn |
|
- khb |
|
- tpx |
|
- kas |
|
- hii |
|
- bun |
|
- jab |
|
- hmd |
|
- dhw |
|
- lir |
|
- dhn |
|
- ssw |
|
- iii |
|
- kca |
|
- peg |
|
- agx |
|
- kib |
|
- bap |
|
- brx |
|
- bmb |
|
- nbe |
|
- dar |
|
- anu |
|
- kmc |
|
- ksd |
|
- lep |
|
- zyn |
|
- rwr |
|
- pcc |
|
- hmt |
|
- kxv |
|
- dta |
|
- sdo |
|
- hea |
|
- aso |
|
- lri |
|
- cdm |
|
- mji |
|
- dib |
|
- ewo |
|
- yom |
|
- cch |
|
- kfq |
|
- bzf |
|
- shj |
|
- yiz |
|
- kai |
|
- afe |
|
- ish |
|
- wbr |
|
- kgp |
|
- mrd |
|
- thr |
|
- pmi |
|
- sip |
|
- xtl |
|
- ekg |
|
- ygr |
|
- kwv |
|
- bas |
|
- kfk |
|
- njb |
|
- zzj |
|
- rab |
|
- lot |
|
- bzy |
|
- stt |
|
- afu |
|
- dhd |
|
- mjc |
|
- gol |
|
- twh |
|
- bfb |
|
- tdf |
|
- wbm |
|
- blk |
|
- kge |
|
- swv |
|
- cua |
|
- tpu |
|
- bwx |
|
- kjp |
|
- mgm |
|
- wtm |
|
- xuj |
|
- nbu |
|
- tjg |
|
- les |
|
- gju |
|
- kwl |
|
- cgk |
|
- zpj |
|
- ysn |
|
- haz |
|
- niq |
|
- yig |
|
- sfm |
|
- mtr |
|
- ttr |
|
- wlv |
|
- mfc |
|
- dwz |
|
- sya |
|
- uth |
|
- tes |
|
- lar |
|
- aii |
|
- bde |
|
- say |
|
- hmo |
|
- meu |
|
- shy |
|
- mde |
|
- mke |
|
- tic |
|
- dao |
|
- ywq |
|
- grv |
|
- gjk |
|
- ztp |
|
- mks |
|
- mbz |
|
- tsg |
|
- dob |
|
- lpo |
|
- qud |
|
- gdb |
|
- kbd |
|
- mrg |
|
- xub |
|
- kun |
|
- slr |
|
- ica |
|
- sjp |
|
- tld |
|
- mql |
|
- sif |
|
- uss |
|
- nmf |
|
- soa |
|
- kbl |
|
- bns |
|
- byn |
|
- mdd |
|
- mdr |
|
- tcy |
|
- cnb |
|
- xtc |
|
- tar |
|
- tan |
|
- lbe |
|
- aks |
|
- mjg |
|
- puu |
|
- noe |
|
- kft |
|
- grj |
|
- ruk |
|
- bcs |
|
- msi |
|
- tcu |
|
- sly |
|
- hmr |
|
- lnu |
|
- mlm |
|
- brh |
|
- nbl |
|
- ott |
|
- wbl |
|
- lax |
|
- ort |
|
- hms |
|
- zpa |
|
- juk |
|
- nku |
|
- bge |
|
- rog |
|
- anr |
|
- poc |
|
- prp |
|
- wuu |
|
- gry |
|
- kex |
|
- hsn |
|
- zlj |
|
- kfp |
|
- bca |
|
- aar |
|
- brt |
|
- khr |
|
- swi |
|
- nto |
|
- xkf |
|
- pwr |
|
- tyz |
|
- kua |
|
- bgp |
|
- xwe |
|
- gec |
|
- bli |
|
- lhi |
|
- bww |
|
- hia |
|
- mxy |
|
- msm |
|
- tdd |
|
- roh |
|
- ahr |
|
- lro |
|
- jer |
|
- der |
|
- mng |
|
- apt |
|
- jib |
|
- cta |
|
- zom |
|
- keu |
|
- tyr |
|
- ebo |
|
- anm |
|
- bda |
|
- zyj |
|
- ssb |
|
- bra |
|
- lea |
|
- chq |
|
- nbm |
|
- kad |
|
- ysp |
|
- abs |
|
- esk |
|
- nhp |
|
- bhd |
|
- sce |
|
- bbk |
|
- xkb |
|
- lch |
|
- mdv |
|
- sss |
|
- kvx |
|
- dai |
|
- jio |
|
- hmg |
|
- okv |
|
- zyg |
|
- lmn |
|
- diu |
|
- tcf |
|
- dub |
|
- lkt |
|
- tuz |
|
- kxp |
|
- sgh |
|
- tts |
|
- qvi |
|
- pmj |
|
- duh |
|
- xwl |
|
- lkr |
|
- kif |
|
- koi |
|
- bkr |
|
- zak |
|
- hre |
|
- hmj |
|
- nbr |
|
- vav |
|
- tvd |
|
- yes |
|
- nbc |
|
- ncq |
|
- vas |
|
- bkc |
|
- xbr |
|
- bdv |
|
- lbo |
|
- dcc |
|
- sbx |
|
- ssi |
|
- bqv |
|
- ctl |
|
- scl |
|
- skn |
|
- lez |
|
- tkb |
|
- bdi |
|
- dbm |
|
- buu |
|
- bfr |
|
- yiq |
|
- bew |
|
- cqd |
|
- wew |
|
- bfm |
|
- luj |
|
- mkz |
|
- kgj |
|
- dso |
|
- mse |
|
- doz |
|
- gru |
|
- ich |
|
- mig |
|
- anp |
|
- ayb |
|
- cjk |
|
- wti |
|
- kga |
|
- noi |
|
- ndr |
|
- ldb |
|
- ymk |
|
- gwd |
|
- ktv |
|
- arg |
|
- bjj |
|
- nqg |
|
- fie |
|
- tis |
|
- pca |
|
- bwo |
|
- zdj |
|
- qxs |
|
- bef |
|
- mqu |
|
- nzy |
|
- drg |
|
- kmy |
|
- wja |
|
- arh |
|
- drs |
|
- pll |
|
- jeh |
|
- kwc |
|
- bol |
|
- cdh |
|
- yeu |
|
- tig |
|
- muo |
|
- byc |
|
- nnp |
|
- xty |
|
- kwn |
|
- dio |
|
- gby |
|
- ibb |
|
- mjs |
|
- pua |
|
- sme |
|
- gdf |
|
- otx |
|
- ekr |
|
- aoe |
|
- res |
|
- brf |
|
- vmz |
|
- sbn |
|
- brb |
|
- vmc |
|
- nut |
|
- gas |
|
- mfn |
|
- ywl |
|
- plc |
|
- thz |
|
- mfd |
|
- adl |
|
- bej |
|
- sen |
|
- mgb |
|
- liq |
|
- tpl |
|
- tek |
|
- rin |
|
- chw |
|
- cjm |
|
- mjw |
|
- rnd |
|
- kix |
|
- bsp |
|
- ynq |
|
- ldm |
|
- sym |
|
- amu |
|
- stj |
|
- yrk |
|
- cyo |
|
- isi |
|
- naq |
|
- bau |
|
- bsh |
|
- pbm |
|
- crw |
|
- nja |
|
- dgh |
|
- bdl |
|
- ags |
|
- int |
|
- bpn |
|
- tvu |
|
- mxp |
|
- bsf |
|
- mxs |
|
- twx |
|
- itd |
|
- gel |
|
- hmz |
|
- nma |
|
- pck |
|
- sng |
|
- nlv |
|
- fvr |
|
- blf |
|
- khy |
|
- kfr |
|
- tku |
|
- mgc |
|
- ciw |
|
- rue |
|
- lky |
|
- zln |
|
- tlp |
|
- zkd |
|
- ukw |
|
- tdg |
|
- bhq |
|
- pym |
|
- mlq |
|
- snm |
|
- wni |
|
- mdt |
|
- wlc |
|
- jum |
|
- cde |
|
- kvr |
|
- mus |
|
- tmn |
|
- pmx |
|
- mlf |
|
- btg |
|
- rar |
|
- nri |
|
- osi |
|
- jax |
|
- dsq |
|
- hoj |
|
- pch |
|
- jit |
|
- for |
|
- kgo |
|
- tji |
|
- zpx |
|
- bpy |
|
- wle |
|
- wyy |
|
- cdo |
|
- nbh |
|
- isd |
|
- nhn |
|
- sjo |
|
- kvq |
|
- vmx |
|
- jad |
|
- cdr |
|
- ijj |
|
- bgn |
|
- bcy |
|
- bhh |
|
- qvj |
|
- nix |
|
- xkv |
|
- slp |
|
- kza |
|
- bmi |
|
- rbb |
|
- mck |
|
- rmt |
|
- dox |
|
- kal |
|
- bri |
|
- ets |
|
- ccl |
|
- djm |
|
- nak |
|
- png |
|
- bgs |
|
- pha |
|
- cpx |
|
- nih |
|
- how |
|
- nxd |
|
- hbb |
|
- ior |
|
- mmd |
|
- hrm |
|
- bze |
|
- cov |
|
- bfs |
|
- bfq |
|
- mdj |
|
- mmz |
|
- tkd |
|
- wow |
|
- czt |
|
- iry |
|
- nyi |
|
- ogc |
|
- tvn |
|
- mzb |
|
- gdl |
|
- cdi |
|
- ktp |
|
- khc |
|
- wbq |
|
- atu |
|
- rir |
|
- mls |
|
- anc |
|
- mmc |
|
- bnx |
|
- goa |
|
- bet |
|
- mfb |
|
- zmb |
|
- btm |
|
- hml |
|
- ikw |
|
- zoc |
|
- afo |
|
- mxa |
|
- mvz |
|
- ccg |
|
- rad |
|
- xom |
|
- ngi |
|
- aug |
|
- skt |
|
- ibl |
|
- pem |
|
- byo |
|
- nka |
|
- akw |
|
- jya |
|
- agc |
|
- njo |
|
- mxl |
|
- hwo |
|
- ged |
|
- aal |
|
- gro |
|
- mdu |
|
- vkl |
|
- mrh |
|
- swj |
|
- bip |
|
- kfh |
|
- mbi |
|
- nbi |
|
- gra |
|
- zpn |
|
- jog |
|
- pnz |
|
- nxg |
|
- sse |
|
- njm |
|
- rkm |
|
- bjt |
|
- mgg |
|
- cbk |
|
- prx |
|
- bil |
|
- mkf |
|
- nba |
|
- ddg |
|
- pow |
|
- abr |
|
- ver |
|
- caq |
|
- mgi |
|
- trf |
|
- sed |
|
- cvn |
|
- nbv |
|
- hnd |
|
- liw |
|
- max |
|
- sad |
|
- hav |
|
- ntk |
|
- kxx |
|
- klg |
|
- bhp |
|
- dri |
|
- kny |
|
- bag |
|
- zts |
|
- pwn |
|
- yer |
|
- daq |
|
- kfo |
|
- org |
|
- gvf |
|
- xkg |
|
- yif |
|
- tfi |
|
- chr |
|
- bje |
|
- sez |
|
- zag |
|
- kfa |
|
- mut |
|
- mta |
|
- cld |
|
- kjs |
|
- buo |
|
- opa |
|
- hac |
|
- mqg |
|
- gmz |
|
- glw |
|
- mqx |
|
- wgi |
|
- czh |
|
- diw |
|
- bdm |
|
- bbu |
|
- ahg |
|
- sop |
|
- gqa |
|
- nmc |
|
- nap |
|
- ndo |
|
- gcf |
|
- gbr |
|
- she |
|
- bxb |
|
- kqo |
|
- yun |
|
- mfm |
|
- ryu |
|
- kfm |
|
- bvm |
|
- gow |
|
- jgk |
|
- odk |
|
- syb |
|
- ggg |
|
- yix |
|
- sbk |
|
- slx |
|
- iyx |
|
- vmm |
|
- mbd |
|
- sxw |
|
- gew |
|
- xmg |
|
- tru |
|
- lse |
|
- tay |
|
- wji |
|
- jns |
|
- kyk |
|
- mfo |
|
- kdq |
|
- kfz |
|
- aqg |
|
- iti |
|
- wem |
|
- ghl |
|
- uuu |
|
- itt |
|
- zaf |
|
- mqh |
|
- xti |
|
- ots |
|
- dtm |
|
- yaf |
|
- tsw |
|
- mtu |
|
- gdx |
|
- smy |
|
- nzm |
|
- anw |
|
- adz |
|
- ank |
|
- tuq |
|
- otm |
|
- kip |
|
- hch |
|
- src |
|
- xnz |
|
- sti |
|
- ebr |
|
- wss |
|
- sct |
|
- vmp |
|
- sdh |
|
- vls |
|
- rwk |
|
- dbd |
|
- meh |
|
- kmk |
|
- tma |
|
- bux |
|
- bvi |
|
- ala |
|
- ahs |
|
- mhk |
|
- gid |
|
- yns |
|
- kzc |
|
- mku |
|
- whg |
|
- akl |
|
- bqx |
|
- iko |
|
- krh |
|
- bcz |
|
- dkx |
|
- zpr |
|
- mii |
|
- yim |
|
- mne |
|
- tny |
|
- saz |
|
- zrg |
|
- gab |
|
- ttj |
|
- ckl |
|
- dak |
|
- pdc |
|
- ogb |
|
- bni |
|
- rcf |
|
- nhg |
|
- ike |
|
- snq |
|
- bja |
|
- kot |
|
- kqk |
|
- orx |
|
- fay |
|
- tiy |
|
- pmm |
|
- epi |
|
- hol |
|
- bif |
|
- ilp |
|
- pbv |
|
- trv |
|
- lrl |
|
- nph |
|
- sgd |
|
- scn |
|
- mtb |
|
- tou |
|
- bez |
|
- cgg |
|
- yax |
|
- hgm |
|
- cte |
|
- akf |
|
- mdn |
|
- bzx |
|
- pcl |
|
- sgr |
|
- mdh |
|
- wbj |
|
- ctz |
|
- nsa |
|
- buf |
|
- lna |
|
- gcr |
|
- njh |
|
- shc |
|
- iby |
|
- toj |
|
- pac |
|
- ifm |
|
- gul |
|
- xmf |
|
- sev |
|
- cos |
|
- ngz |
|
- nyw |
|
- plv |
|
- ity |
|
- qus |
|
- zpy |
|
- mkb |
|
- mye |
|
- nre |
|
- bsy |
|
- ksv |
|
- ekp |
|
- agb |
|
- dis |
|
- kjt |
|
- bou |
|
- mwe |
|
- lki |
|
- luz |
|
- nlj |
|
- kkh |
|
- aba |
|
- mbf |
|
- pfe |
|
- ijs |
|
- abu |
|
- tsa |
|
- nyj |
|
- pos |
|
- nkw |
|
- brl |
|
- kmz |
|
- lik |
|
- stv |
|
- knn |
|
- tkq |
|
- yog |
|
- mtq |
|
- tdc |
|
- bgi |
|
- yhd |
|
- ema |
|
- daw |
|
- mnp |
|
- chk |
|
- zmq |
|
- aee |
|
- zoh |
|
- lum |
|
- nds |
|
- bnn |
|
- soz |
|
- oyd |
|
- tul |
|
- gla |
|
- bjo |
|
- bar |
|
- unx |
|
- bks |
|
- moy |
|
- axk |
|
- mzn |
|
- mbs |
|
- puo |
|
- lal |
|
- plk |
|
- ral |
|
- zmp |
|
- jaf |
|
- ivv |
|
- ndh |
|
- oks |
|
- mzv |
|
- lad |
|
- mdw |
|
- cja |
|
- diz |
|
- psi |
|
- bgx |
|
- pon |
|
- sro |
|
- gad |
|
- blm |
|
- kfu |
|
- zpw |
|
- etx |
|
- end |
|
- sby |
|
- msk |
|
- nkh |
|
- gsw |
|
- chj |
|
- mbo |
|
- jge |
|
- vmj |
|
- tft |
|
- cma |
|
- zpe |
|
- zpd |
|
- har |
|
- fry |
|
- gbv |
|
- clu |
|
- bta |
|
- wbk |
|
- nzk |
|
- psh |
|
- zat |
|
- ngj |
|
- agi |
|
- suq |
|
- djc |
|
|
|
</details> |
|
|
|
## Model details |
|
|
|
- **Developed by:** Vineel Pratap et al. |
|
- **Model type:** Multilingual speech language identification (LID) model
|
- **Language(s):** 2048 languages, see [supported languages](#supported-languages) |
|
- **License:** CC-BY-NC 4.0 license |
|
- **Num parameters**: 1 billion |
|
- **Audio sampling rate**: 16,000 Hz (16 kHz)
|
- **Cite as:** |
|
|
|
      @article{pratap2023mms,
        title={Scaling Speech Technology to 1,000+ Languages},
        author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
        journal={arXiv},
        year={2023}
      }
|
|
|
## Additional Links |
|
|
|
- [Blog post](https://ai.facebook.com/blog/multilingual-model-speech-recognition/) |
|
- [Transformers documentation](https://huggingface.co/docs/transformers/main/en/model_doc/mms)
|
- [Paper](https://arxiv.org/abs/2305.13516) |
|
- [GitHub Repository](https://github.com/facebookresearch/fairseq/tree/main/examples/mms#asr) |
|
- [Other **MMS** checkpoints](https://huggingface.co/models?other=mms) |
|
- MMS base checkpoints:
  - [facebook/mms-1b](https://huggingface.co/facebook/mms-1b)
  - [facebook/mms-300m](https://huggingface.co/facebook/mms-300m)
|
- [Official Space](https://huggingface.co/spaces/facebook/MMS) |
|
|