---
datasets:
- mozilla-foundation/common_voice_16_1
language:
- ta
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---
This model is fine-tuned on the Tamil subset of Common Voice 16.1. The transcripts were preprocessed with Epitran, which transliterates text into IPA. The Epitran language code `tam-Taml` was used to generate the phoneme list below, which covers the distinctions of Tamil phonetics:

* Vowels:
  * Monophthongs: 'a', 'aː', 'e', 'eː', 'i', 'iː', 'o', 'oː', 'u', 'uː'
  * Diphthongs: 'aj', 'aʋ'
 
* Consonants:
  * Nasals: 'm', 'n', 'n̪', 'ŋ', 'ɲ', 'ɳ'
  * Stops: 'p', 't̪', 'ʈ', 'k'
  * Affricates: 't͡ʃ', 'd͡ʒ'
  * Fricatives: 'ʋ', 's', 'ʂ', 'ʃ', 'h'
  * Approximants: 'j', 'ɻ', 'ɾ', 'l', 'ɭ'
  * Consonant cluster: 'kʂ'
* Special Symbols: '்' (the pulli, which denotes the absence of the inherent vowel)
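
As an illustration of how the inventory above might be used, the following minimal sketch splits an IPA string into phoneme tokens with greedy longest-match segmentation. The function name `segment`, the choice of greedy matching (so that 'aj', 'aʋ', and 'kʂ' win over their single-symbol parts), and the example string are assumptions for illustration only; this is not necessarily how the model's own tokenizer was built. The pulli sign is left out because it is a script mark rather than an IPA symbol.

```python
import unicodedata

# Phoneme inventory copied from the list above (pulli sign omitted).
PHONEMES = [
    "a", "aː", "e", "eː", "i", "iː", "o", "oː", "u", "uː",  # monophthongs
    "aj", "aʋ",                                              # diphthongs
    "m", "n", "n̪", "ŋ", "ɲ", "ɳ",                            # nasals
    "p", "t̪", "ʈ", "k",                                      # stops
    "t͡ʃ", "d͡ʒ",                                              # affricates
    "ʋ", "s", "ʂ", "ʃ", "h",                                 # fricatives
    "j", "ɻ", "ɾ", "l", "ɭ",                                 # approximants
    "kʂ",                                                    # consonant cluster
]

# Normalize so that combining marks compare consistently.
INVENTORY = {unicodedata.normalize("NFD", p) for p in PHONEMES}
MAX_LEN = max(len(p) for p in INVENTORY)


def segment(ipa: str) -> list[str]:
    """Split an IPA string into phonemes, preferring the longest match.

    Hypothetical helper: raises ValueError on any symbol that is not
    covered by the inventory.
    """
    text = unicodedata.normalize("NFD", ipa)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(MAX_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in INVENTORY:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"no phoneme matches at position {i}: {text[i]!r}")
    return tokens


if __name__ == "__main__":
    # Illustrative IPA string, not taken from the dataset.
    print(segment("t̪amiɻ"))
```

Greedy matching is the simplest policy that keeps multi-symbol phonemes such as 't͡ʃ' and 'n̪' intact; a string where 'a' is followed by a separate 'j' would still be read as the diphthong 'aj', so a different policy may be needed if that distinction matters.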