Commit 82b8e88 by wetdog · verified · 1 Parent(s): d6ec671

Update README.md

Files changed (1)
  1. README.md +16 -9
README.md CHANGED
@@ -3,6 +3,13 @@ license: apache-2.0
  language:
  - en
  - ca
  ---

  # Wavenext-encodec
@@ -84,7 +91,7 @@ The model was trained on 4 speech datasets
  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
- The model was trained for 1M steps and 183 epochs with a batch size of 16 for stability. We used a cosine scheduler with an initial learning rate of 1e-4.


  #### Training Hyperparameters
@@ -101,15 +108,15 @@ The model was trained for 1M steps and 183 epochs with a batch size of 16 for st

  <!-- This section describes the evaluation protocols and provides the results. -->

- Evaluation was done using the metrics from the original repo. After 183 epochs we achieve:

- * val_loss: 3.79
- * f1_score: 0.94
- * mel_loss: 0.27
- * periodicity_loss: 0.128
- * pesq_score: 3.27
- * pitch_loss: 31.33
- * utmos_score: 3.20


  ## Citation
 
  language:
  - en
  - ca
+ datasets:
+ - mythicinfinity/libritts_r
+ - projecte-aina/festcat_trimmed_denoised
+ - projecte-aina/openslr-slr69-ca-trimmed-denoised
+ - keithito/lj_speech
+ base_model:
+ - facebook/encodec_24khz
  ---

  # Wavenext-encodec
 
  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ The model was trained for 1M steps and 99 epochs with a batch size of 16 for stability. We used a cosine scheduler with an initial learning rate of 1e-4.


  #### Training Hyperparameters
 

  <!-- This section describes the evaluation protocols and provides the results. -->

+ Evaluation was done using the metrics from the original Vocos repo. Note that these metrics are calculated using the codecs corresponding to a bandwidth of 1.5 kbps. After 99 epochs we achieve:

+ * val_loss: 5.52
+ * f1_score: 0.93
+ * mel_loss: 0.53
+ * periodicity_loss: 0.14
+ * pesq_score: 2.12
+ * pitch_loss: 47.73
+ * utmos_score: 2.89


  ## Citation
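
The Training Procedure text in the diff above mentions a cosine scheduler with an initial learning rate of 1e-4 over 1M steps. As a rough illustration only, the sketch below computes a plain cosine-annealed learning rate from those two stated values. The final learning rate (`MIN_LR`), the absence of warmup, and the helper name `cosine_lr` are assumptions made for this example; the README does not specify them, and the actual training code may differ.

```python
import math

INITIAL_LR = 1e-4        # stated in the README
TOTAL_STEPS = 1_000_000  # stated in the README ("1M steps")
MIN_LR = 0.0             # assumption: the README gives no final learning rate


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              initial_lr: float = INITIAL_LR,
              min_lr: float = MIN_LR) -> float:
    """Learning rate at `step` under plain cosine annealing (no warmup assumed)."""
    # Clamp the step into [0, total_steps] so out-of-range steps stay well defined.
    progress = min(max(step, 0), total_steps) / total_steps
    # Half a cosine period: factor goes from 1 at progress=0 to 0 at progress=1.
    factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (initial_lr - min_lr) * factor


if __name__ == "__main__":
    for step in (0, 250_000, 500_000, 750_000, 1_000_000):
        print(f"step {step:>9,d}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the rate starts at 1e-4, passes through half that value at the midpoint of training, and decays to `MIN_LR` at the final step.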