suzii/vi-whisper-large-v3-turbo-v1

Automatic Speech Recognition

suzii committed on Jan 9

Commit 1e9075e · verified · 1 Parent(s): 5128dd8

Update README.md

Files changed (1)

README.md +2 -41

@@ -7,56 +7,17 @@ This project involves fine-tuning the Whisper-V3-Turbo model to improve its perf
 The training data comes from various Vietnamese speech corpora. Below is a list of datasets used for training:
 1. **capleaf/viVoice**
-   - Path: `capleaf/viVoice`
-   - Mode: `3`
-   - Split: `train`
 2. **NhutP/VSV-1100**
-   - Path: `NhutP/VSV-1100`
-   - Mode: `1`
-   - Split: `train`
 3. **doof-ferb/fpt_fosd**
-   - Path: `doof-ferb/fpt_fosd`
-   - Mode: `0`
-   - Split: `train`
 4. **doof-ferb/infore1_25hours**
-   - Path: `doof-ferb/infore1_25hours`
-   - Mode: `0`
-   - Split: `train`
 5. **google/fleurs (vi_vn)**
-   - Path: `google/fleurs`
-   - Name: `vi_vn`
-   - Mode: `1`
-   - Split: `train`
 6. **doof-ferb/LSVSC**
-   - Path: `doof-ferb/LSVSC`
-   - Mode: `1`
-   - Split: `train`
 7. **quocanh34/viet_vlsp**
-   - Path: `quocanh34/viet_vlsp`
-   - Mode: `0`
-   - Split: `train`
 8. **linhtran92/viet_youtube_asr_corpus_v2**
-   - Path: `linhtran92/viet_youtube_asr_corpus_v2`
-   - Mode: `1`
-   - Split: `train`
 9. **doof-ferb/infore2_audiobooks**
-   - Path: `doof-ferb/infore2_audiobooks`
-   - Mode: `0`
-   - Split: `train`
-10. **linhtran92/viet_bud500**
-    - Path: `linhtran92/viet_bud500`
-    - Mode: `0`
-    - Split: `train`
+10. **linhtran92/viet_bud500**
+11.
 ## Model
 The model used in this project is the **Whisper-V3-Turbo**. Whisper is a multilingual ASR model trained on a large and diverse dataset. The version used here has been fine-tuned specifically for the Vietnamese language.
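
Reviewer note: the `Path`, `Name`, and `Split` fields removed by this commit line up with the `path`, `name`, and `split` arguments of `datasets.load_dataset`. The sketch below records them in one place so the information is not lost. It is a minimal illustration, not the author's training code: `DatasetSpec`, `DATASET_SPECS`, `load_kwargs`, and `load_all` are hypothetical names, and the `Mode` field is left out because the README does not explain what it means. Some of these repositories may be gated or need extra arguments, so `load_all` may not work unchanged.

```python
from typing import Dict, List, Optional, TypedDict


class DatasetSpec(TypedDict):
    """One dataset entry, using the fields shown in the removed README lines."""
    path: str
    name: Optional[str]  # config name; only google/fleurs specifies one
    split: str


# Values copied from the Path / Name / Split lines removed in this commit.
# "Mode" is intentionally omitted: its meaning is not documented in the README.
DATASET_SPECS: List[DatasetSpec] = [
    {"path": "capleaf/viVoice", "name": None, "split": "train"},
    {"path": "NhutP/VSV-1100", "name": None, "split": "train"},
    {"path": "doof-ferb/fpt_fosd", "name": None, "split": "train"},
    {"path": "doof-ferb/infore1_25hours", "name": None, "split": "train"},
    {"path": "google/fleurs", "name": "vi_vn", "split": "train"},
    {"path": "doof-ferb/LSVSC", "name": None, "split": "train"},
    {"path": "quocanh34/viet_vlsp", "name": None, "split": "train"},
    {"path": "linhtran92/viet_youtube_asr_corpus_v2", "name": None, "split": "train"},
    {"path": "doof-ferb/infore2_audiobooks", "name": None, "split": "train"},
    {"path": "linhtran92/viet_bud500", "name": None, "split": "train"},
]


def load_kwargs(spec: DatasetSpec) -> Dict[str, str]:
    """Build keyword arguments for datasets.load_dataset from one entry."""
    kwargs = {"path": spec["path"], "split": spec["split"]}
    if spec["name"] is not None:
        kwargs["name"] = spec["name"]
    return kwargs


def load_all(streaming: bool = True) -> list:
    """Load every listed dataset (assumes the `datasets` package and Hub access)."""
    # Imported lazily so the spec table can be inspected without the dependency.
    from datasets import load_dataset

    return [load_dataset(**load_kwargs(s), streaming=streaming) for s in DATASET_SPECS]
```

Calling `load_kwargs(DATASET_SPECS[4])` gives the arguments for the FLEURS Vietnamese config; `streaming=True` in `load_all` avoids downloading each corpus in full before iterating.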