speechbrainteam
committed on
Commit
·
56fc82e
1
Parent(s):
8d05a34
Update README.md
README.md
CHANGED
@@ -33,9 +33,9 @@ The performance of the model is the following:
 ## Pipeline description
 
 This ASR system is composed of 2 different but linked blocks:
-
+- Tokenizer (unigram) that transforms words into subword units and trained with
 the train transcriptions (train.tsv) of CommonVoice (IT).
-
+- Acoustic model (CRDNN + CTC/Attention). The CRDNN architecture is made of
 N blocks of convolutional neural networks with normalization and pooling on the
 frequency domain. Then, a bidirectional LSTM is connected to a final DNN to obtain
 the final acoustic representation that is given to the CTC and attention decoders.
@@ -75,7 +75,7 @@ The SpeechBrain team does not provide any warranty on the performance achieved b
 year = {2021},
 publisher = {GitHub},
 journal = {GitHub repository},
-howpublished = {
+howpublished = {\url{https://github.com/speechbrain/speechbrain}},
 }
 ```
 
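The tokenizer bullet in the changed README describes a unigram model that turns words into subword units. As a rough illustration of that idea only, the sketch below picks the most probable split of a word under a unigram model using dynamic programming. The vocabulary, its probabilities, and the names `TOY_VOCAB` and `unigram_segment` are hypothetical and chosen for the example; this is not the tokenizer shipped with the model, which is trained on the CommonVoice (IT) transcriptions.

```python
import math

# Hypothetical toy vocabulary: subword piece -> probability.
# These values are made up for illustration and are not normalized.
TOY_VOCAB = {
    "buon": 0.10,
    "giorno": 0.08,
    "gior": 0.02,
    "no": 0.05,
    "b": 0.01, "u": 0.01, "o": 0.01, "n": 0.01,
    "g": 0.01, "i": 0.01, "r": 0.01,
}


def unigram_segment(word, vocab):
    """Return the most probable split of `word` into pieces from `vocab`."""
    n = len(word)
    # best[i] = (log-probability, start index of the last piece) for word[:i]
    best = [(-math.inf, -1)] * (n + 1)
    best[0] = (0.0, -1)
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in vocab and best[start][0] > -math.inf:
                score = best[start][0] + math.log(vocab[piece])
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[n][0] == -math.inf:
        raise ValueError(f"cannot segment {word!r} with this vocabulary")
    # Walk the back-pointers from the end of the word to recover the pieces.
    pieces = []
    end = n
    while end > 0:
        start = best[end][1]
        pieces.append(word[start:end])
        end = start
    return pieces[::-1]


print(unigram_segment("buongiorno", TOY_VOCAB))
```

Under a unigram model each piece is scored independently, so the best segmentation is simply the one whose piece probabilities have the largest product. Longer pieces with reasonable probability tend to win over many short ones.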