BSC-LT/wavenext-encodec

wetdog committed on Sep 12, 2024

Commit 82b8e88 · verified · 1 Parent(s): d6ec671

Update README.md

Files changed (1)

README.md +16 -9

README.md CHANGED

@@ -3,6 +3,13 @@ license: apache-2.0
 language:
 - en
 - ca
 ---
 # Wavenext-encodec
@@ -84,7 +91,7 @@ The model was trained on 4 speech datasets
 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-The model was trained for 1M steps and 183 epochs with a batch size of 16 for stability. We used a cosine scheduler with an initial learning rate of 1e-4.
 #### Training Hyperparameters
@@ -101,15 +108,15 @@ The model was trained for 1M steps and 183 epochs with a batch size of 16 for st
 <!-- This section describes the evaluation protocols and provides the results. -->
-Evaluation was done using the metrics on the original repo, after 183 epochs we achieve:
-* val_loss: 3.79
-* f1_score: 0.94
-* mel_loss: 0.27
-* periodicity_loss: 0.128
-* pesq_score: 3.27
-* pitch_loss: 31.33
-* utmos_score: 3.20
 ## Citation

 language:
 - en
 - ca
+datasets:
+- mythicinfinity/libritts_r
+- projecte-aina/festcat_trimmed_denoised
+- projecte-aina/openslr-slr69-ca-trimmed-denoised
+- keithito/lj_speech
+base_model:
+- facebook/encodec_24khz
 ---
 # Wavenext-encodec
 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+The model was trained for 1M steps and 99 epochs with a batch size of 16 for stability. We used a cosine scheduler with an initial learning rate of 1e-4.
 #### Training Hyperparameters
 <!-- This section describes the evaluation protocols and provides the results. -->
+Evaluation was done using the metrics from the original vocos repo. Note that these metrics are calculated using the codecs corresponding to a bandwidth of 1.5 kbps. After 99 epochs we achieve:
+* val_loss: 5.52
+* f1_score: 0.93
+* mel_loss: 0.53
+* periodicity_loss: 0.14
+* pesq_score: 2.12
+* pitch_loss: 47.73
+* utmos_score: 2.89
 ## Citation
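
The training procedure in this diff mentions a cosine scheduler with an initial learning rate of 1e-4 over 1M steps. As a rough illustration only, the sketch below computes a plain cosine decay in pure Python. It assumes no warmup and a decay to a floor of zero; the README states neither, and the function name `cosine_lr` and the `min_lr` parameter are hypothetical, not taken from the training code.

```python
import math


def cosine_lr(step, total_steps=1_000_000, base_lr=1e-4, min_lr=0.0):
    """Cosine-decayed learning rate at a given step.

    Assumptions (not confirmed by the README): no warmup phase, and the
    rate decays from base_lr down to min_lr over total_steps.
    """
    # Clamp so steps outside [0, total_steps] stay on the curve's endpoints.
    step = min(max(step, 0), total_steps)
    # Cosine factor goes from 1 at step 0 to 0 at total_steps.
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for s in (0, 500_000, 1_000_000):
        print(s, cosine_lr(s))
```

Under these assumptions the rate starts at 1e-4, passes through roughly half that value at the midpoint, and reaches the floor at the final step. A real run may differ if the scheduler used warmup or a non-zero minimum.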