Added model
- README.md +195 -0
- data/lang_phone/L.pt +3 -0
- data/lang_phone/L_disambig.pt +3 -0
- data/lang_phone/Linv.pt +3 -0
- data/lang_phone/lexicon.txt +37 -0
- data/lang_phone/lexicon_disambig.txt +37 -0
- data/lang_phone/tokens.txt +39 -0
- data/lang_phone/words.txt +41 -0
- exp-causal/ctc-decoding/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/ctc-decoding/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/ctc-decoding/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-10-57-38 +7 -0
- exp-causal/ctc-decoding/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-10-59-38 +37 -0
- exp-causal/ctc-decoding/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/ctc-decoding/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/ctc-decoding/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +2 -0
- exp-causal/ctc-decoding/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +2 -0
- exp-causal/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +0 -0
- exp-causal/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +0 -0
- exp-causal/fast_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model-2024-03-07-08-39-05 +66 -0
- exp-causal/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +0 -0
- exp-causal/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +0 -0
- exp-causal/fast_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +2 -0
- exp-causal/fast_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt +2 -0
- exp-causal/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +0 -0
- exp-causal/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +0 -0
- exp-causal/greedy_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model-2024-03-07-08-38-15 +46 -0
- exp-causal/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +0 -0
- exp-causal/greedy_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +0 -0
- exp-causal/greedy_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +2 -0
- exp-causal/greedy_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt +2 -0
- exp-causal/jit_script_chunk_32_left_128.pt +3 -0
- exp-causal/modified_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +0 -0
- exp-causal/modified_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +0 -0
- exp-causal/modified_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model-2024-03-07-08-41-07 +56 -0
- exp-causal/modified_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +0 -0
- exp-causal/modified_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +0 -0
- exp-causal/modified_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +2 -0
- exp-causal/modified_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt +2 -0
- exp-causal/pretrained.pt +3 -0
- exp-causal/streaming/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +0 -0
- exp-causal/streaming/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +0 -0
- exp-causal/streaming/fast_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model-2024-03-07-08-56-46 +154 -0
- exp-causal/streaming/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +0 -0
- exp-causal/streaming/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +0 -0
- exp-causal/streaming/fast_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +2 -0
- exp-causal/streaming/fast_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt +2 -0
- exp-causal/streaming/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/streaming/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
- exp-causal/streaming/greedy_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-08-54-56 +154 -0
- exp-causal/streaming/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt +0 -0
README.md
CHANGED
@@ -1,3 +1,198 @@
---
language: sw
license: apache-2.0
tags:
- icefall
- phoneme-recognition
- automatic-speech-recognition
datasets:
- bookbot/ALFFA_swahili
- bookbot/fleurs_sw
- bookbot/common_voice_16_1_sw
---

# Pruned Stateless Zipformer RNN-T Streaming Robust SW

Pruned Stateless Zipformer RNN-T Streaming Robust SW is an automatic speech recognition model trained on the following datasets:

- [ALFFA Swahili](https://huggingface.co/datasets/bookbot/ALFFA_swahili)
- [FLEURS Swahili](https://huggingface.co/datasets/bookbot/fleurs_sw)
- [Common Voice 16.1 Swahili](https://huggingface.co/datasets/bookbot/common_voice_16_1_sw)

Instead of being trained to predict sequences of words, this model was trained to predict sequences of phonemes, e.g. `["w", "ɑ", "ʃ", "i", "ɑ"]`. Therefore, the model's [vocabulary](https://huggingface.co/bookbot/zipformer-streaming-robust-sw/blob/main/data/lang_phone/tokens.txt) contains the IPA phonemes found in [gruut](https://github.com/rhasspy/gruut).

This model was trained using the [icefall](https://github.com/k2-fsa/icefall) framework. All training was done on a Scaleway RENDER-S VM with an NVIDIA H100 GPU. All scripts used for training can be found in the [Files and versions](https://huggingface.co/bookbot/zipformer-streaming-robust-sw/tree/main) tab, as well as the training metrics logged via [TensorBoard](https://huggingface.co/bookbot/zipformer-streaming-robust-sw/tensorboard).

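The vocabulary file follows icefall's `tokens.txt` convention: one `<token> <id>` pair per line. A minimal Python sketch of loading it into lookup tables — the `SAMPLE` string below is a short hypothetical excerpt, not the full 39-entry file:

```python
# Parse an icefall-style tokens.txt into token <-> id mappings.
# SAMPLE is an excerpt for illustration; in practice read the file from
# data/lang_phone/tokens.txt.

SAMPLE = """<eps> 0
s 1
ð 2
<UNK> 19
ɑ 37
#0 38"""

def load_tokens(text: str):
    token2id, id2token = {}, {}
    for line in text.splitlines():
        # Split once from the right: the id is the last field.
        token, idx = line.rsplit(maxsplit=1)
        token2id[token] = int(idx)
        id2token[int(idx)] = token
    return token2id, id2token

token2id, id2token = load_tokens(SAMPLE)
print(token2id["ɑ"])   # -> 37
print(id2token[0])     # -> <eps>
```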
## Evaluation Results

### Simulated Streaming

```sh
for m in greedy_search fast_beam_search modified_beam_search; do
  ./zipformer/decode.py \
    --epoch 40 \
    --avg 7 \
    --causal 1 \
    --chunk-size 32 \
    --left-context-frames 128 \
    --exp-dir zipformer/exp-causal \
    --use-transducer True --use-ctc True \
    --decoding-method $m
done
```

```sh
./zipformer/ctc_decode.py \
  --epoch 40 \
  --avg 7 \
  --causal 1 \
  --chunk-size 32 \
  --left-context-frames 128 \
  --exp-dir zipformer/exp-causal \
  --decoding-method ctc-decoding \
  --use-transducer True --use-ctc True
```

The model achieves the following phoneme error rates on the different test sets:

| Decoding             | Common Voice 16.1 | FLEURS |
| -------------------- | :---------------: | :----: |
| Greedy Search        |       7.71        |  6.58  |
| Modified Beam Search |       7.53        |  6.40  |
| Fast Beam Search     |       7.73        |  6.61  |
| CTC Greedy Search    |       7.78        |  6.72  |

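The phoneme error rates above are standard edit-distance error rates. As a sanity check, a minimal Python sketch recomputes the two CTC-decoding numbers from the insertion/deletion/substitution counts printed in this repo's decode logs:

```python
# Recompute PER (%) from error counts, as reported by icefall's utils.py:
#   PER = (insertions + deletions + substitutions) / reference length.
# Counts below are taken from the CTC decoding log in this repo.

def per(ins: int, dels: int, subs: int, ref_len: int) -> float:
    return round(100 * (ins + dels + subs) / ref_len, 2)

print(per(1757, 1036, 1344, 61587))      # test-fleurs       -> 6.72
print(per(12396, 20104, 16115, 624874))  # test-commonvoice  -> 7.78
```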
### Chunk-wise Streaming

```sh
for m in greedy_search fast_beam_search modified_beam_search; do
  ./zipformer/streaming_decode.py \
    --epoch 40 \
    --avg 7 \
    --causal 1 \
    --chunk-size 32 \
    --left-context-frames 128 \
    --exp-dir zipformer/exp-causal \
    --use-transducer True --use-ctc True \
    --decoding-method $m \
    --num-decode-streams 1000
done
```

The model achieves the following phoneme error rates on the different test sets:

| Decoding             | Common Voice 16.1 | FLEURS |
| -------------------- | :---------------: | :----: |
| Greedy Search        |       7.75        |  6.59  |
| Modified Beam Search |       7.57        |  6.37  |
| Fast Beam Search     |       7.72        |  6.44  |

## Usage

### Download Pre-trained Model

```sh
cd egs/bookbot_sw/ASR
mkdir tmp
cd tmp
git lfs install
git clone https://huggingface.co/bookbot/zipformer-streaming-robust-sw/
```

### Inference

To decode with greedy search, run:

```sh
./zipformer/jit_pretrained_streaming.py \
  --nn-model-filename ./tmp/zipformer-streaming-robust-sw/exp-causal/jit_script_chunk_32_left_128.pt \
  --tokens ./tmp/zipformer-streaming-robust-sw/data/lang_phone/tokens.txt \
  ./tmp/zipformer-streaming-robust-sw/test_waves/sample1.wav
```

<details>
<summary>Decoding Output</summary>

```
2024-03-07 11:07:41,231 INFO [jit_pretrained_streaming.py:184] device: cuda:0
2024-03-07 11:07:41,865 INFO [jit_pretrained_streaming.py:197] Constructing Fbank computer
2024-03-07 11:07:41,866 INFO [jit_pretrained_streaming.py:200] Reading sound files: ./tmp/zipformer-streaming-robust-sw/test_waves/sample1.wav
2024-03-07 11:07:41,866 INFO [jit_pretrained_streaming.py:205] torch.Size([125568])
2024-03-07 11:07:41,866 INFO [jit_pretrained_streaming.py:207] Decoding started
2024-03-07 11:07:41,866 INFO [jit_pretrained_streaming.py:212] chunk_length: 64
2024-03-07 11:07:41,866 INFO [jit_pretrained_streaming.py:213] T: 77
2024-03-07 11:07:41,876 INFO [jit_pretrained_streaming.py:229] 0/130368
2024-03-07 11:07:41,877 INFO [jit_pretrained_streaming.py:229] 4000/130368
2024-03-07 11:07:41,878 INFO [jit_pretrained_streaming.py:229] 8000/130368
2024-03-07 11:07:41,879 INFO [jit_pretrained_streaming.py:229] 12000/130368
2024-03-07 11:07:42,103 INFO [jit_pretrained_streaming.py:229] 16000/130368
2024-03-07 11:07:42,104 INFO [jit_pretrained_streaming.py:229] 20000/130368
2024-03-07 11:07:42,126 INFO [jit_pretrained_streaming.py:229] 24000/130368
2024-03-07 11:07:42,127 INFO [jit_pretrained_streaming.py:229] 28000/130368
2024-03-07 11:07:42,128 INFO [jit_pretrained_streaming.py:229] 32000/130368
2024-03-07 11:07:42,151 INFO [jit_pretrained_streaming.py:229] 36000/130368
2024-03-07 11:07:42,152 INFO [jit_pretrained_streaming.py:229] 40000/130368
2024-03-07 11:07:42,175 INFO [jit_pretrained_streaming.py:229] 44000/130368
2024-03-07 11:07:42,176 INFO [jit_pretrained_streaming.py:229] 48000/130368
2024-03-07 11:07:42,177 INFO [jit_pretrained_streaming.py:229] 52000/130368
2024-03-07 11:07:42,200 INFO [jit_pretrained_streaming.py:229] 56000/130368
2024-03-07 11:07:42,201 INFO [jit_pretrained_streaming.py:229] 60000/130368
2024-03-07 11:07:42,224 INFO [jit_pretrained_streaming.py:229] 64000/130368
2024-03-07 11:07:42,226 INFO [jit_pretrained_streaming.py:229] 68000/130368
2024-03-07 11:07:42,226 INFO [jit_pretrained_streaming.py:229] 72000/130368
2024-03-07 11:07:42,250 INFO [jit_pretrained_streaming.py:229] 76000/130368
2024-03-07 11:07:42,251 INFO [jit_pretrained_streaming.py:229] 80000/130368
2024-03-07 11:07:42,252 INFO [jit_pretrained_streaming.py:229] 84000/130368
2024-03-07 11:07:42,275 INFO [jit_pretrained_streaming.py:229] 88000/130368
2024-03-07 11:07:42,276 INFO [jit_pretrained_streaming.py:229] 92000/130368
2024-03-07 11:07:42,299 INFO [jit_pretrained_streaming.py:229] 96000/130368
2024-03-07 11:07:42,300 INFO [jit_pretrained_streaming.py:229] 100000/130368
2024-03-07 11:07:42,301 INFO [jit_pretrained_streaming.py:229] 104000/130368
2024-03-07 11:07:42,325 INFO [jit_pretrained_streaming.py:229] 108000/130368
2024-03-07 11:07:42,326 INFO [jit_pretrained_streaming.py:229] 112000/130368
2024-03-07 11:07:42,349 INFO [jit_pretrained_streaming.py:229] 116000/130368
2024-03-07 11:07:42,350 INFO [jit_pretrained_streaming.py:229] 120000/130368
2024-03-07 11:07:42,351 INFO [jit_pretrained_streaming.py:229] 124000/130368
2024-03-07 11:07:42,373 INFO [jit_pretrained_streaming.py:229] 128000/130368
2024-03-07 11:07:42,374 INFO [jit_pretrained_streaming.py:259] ./tmp/zipformer-streaming-robust-sw/test_waves/sample1.wav
2024-03-07 11:07:42,374 INFO [jit_pretrained_streaming.py:260] ʃiɑ|ɑᵐɓɑɔ|wɑnɑiʃi|hɑsɑ|kɑtikɑ|ɛnɛɔ|lɑ|mɑʃɑɾiki|kɑtikɑ|ufɑlmɛ|huɔ|wɛnjɛ|utɑʄiɾi|wɑ|mɑfutɑ
2024-03-07 11:07:42,374 INFO [jit_pretrained_streaming.py:262] Decoding Done
```

</details>

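In the decoding output above, `|` is a regular entry in `data/lang_phone/tokens.txt` that marks word boundaries, so the hypothesis string can be split back into per-word phoneme strings. A minimal sketch, using a shortened version of the decoded output:

```python
# Split a phoneme-level hypothesis into words at the "|" boundary token.
# hyp is a shortened excerpt of the decoder output shown above.

hyp = "ʃiɑ|ɑᵐɓɑɔ|wɑnɑiʃi|hɑsɑ|kɑtikɑ|ɛnɛɔ|lɑ|mɑʃɑɾiki"

words = hyp.split("|")
print(len(words))  # -> 8
print(words[0])    # -> ʃiɑ
```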
## Training procedure

### Install icefall

```sh
git clone https://github.com/bookbot-hive/icefall
cd icefall
export PYTHONPATH=`pwd`:$PYTHONPATH
```

### Prepare Data

```sh
cd egs/bookbot_sw/ASR
./prepare.sh
```

### Train

```sh
export CUDA_VISIBLE_DEVICES="0"
./zipformer/train.py \
  --num-epochs 40 \
  --use-fp16 1 \
  --exp-dir zipformer/exp-causal \
  --causal 1 \
  --max-duration 800 \
  --use-transducer True --use-ctc True
```

## Frameworks

- [k2](https://github.com/k2-fsa/k2)
- [icefall](https://github.com/bookbot-hive/icefall)
- [lhotse](https://github.com/bookbot-hive/lhotse)
data/lang_phone/L.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:521562864ec9620dcf30c713f16614a861b4570d6f633e1c5a006b8743a3a304
size 1679

data/lang_phone/L_disambig.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a2fb6bfaace3c1d9b8c0472e64a5621422eb0222ec4917875bde509e5ace233a
size 1715

data/lang_phone/Linv.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:29794d6988b503cfcec0bd6e7dcbe1f0450442c31e162820214429accafaaa3d
size 1691
data/lang_phone/lexicon.txt
ADDED
@@ -0,0 +1,37 @@
f f
h h
i i
j j
k k
l l
m m
n n
p p
s s
t t
t͡ʃ t͡ʃ
u u
v v
w w
x x
z z
| |
ð ð
ɑ ɑ
ɓ ɓ
ɔ ɔ
ɗ ɗ
ɛ ɛ
ɠ ɠ
ɣ ɣ
ɾ ɾ
ʃ ʃ
ʄ ʄ
θ θ
ᵐɓ ᵐɓ
ᵑg ᵑg
ᶬv ᶬv
ⁿz ⁿz
ⁿɗ ⁿɗ
ⁿɗ͡ʒ ⁿɗ͡ʒ
<UNK> <UNK>

data/lang_phone/lexicon_disambig.txt
ADDED
@@ -0,0 +1,37 @@
f f
h h
i i
j j
k k
l l
m m
n n
p p
s s
t t
t͡ʃ t͡ʃ
u u
v v
w w
x x
z z
| |
ð ð
ɑ ɑ
ɓ ɓ
ɔ ɔ
ɗ ɗ
ɛ ɛ
ɠ ɠ
ɣ ɣ
ɾ ɾ
ʃ ʃ
ʄ ʄ
θ θ
ᵐɓ ᵐɓ
ᵑg ᵑg
ᶬv ᶬv
ⁿz ⁿz
ⁿɗ ⁿɗ
ⁿɗ͡ʒ ⁿɗ͡ʒ
<UNK> <UNK>
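Both lexicon files use icefall's plain-text lexicon format: one entry per line, a word followed by its pronunciation. In this phone-level lexicon each "word" maps to itself. A minimal parser sketch; `SAMPLE` is a small hypothetical excerpt:

```python
# Parse an icefall-style lexicon.txt: "word phone1 phone2 ..." per line.
# SAMPLE is an excerpt for illustration; in practice read
# data/lang_phone/lexicon.txt.

SAMPLE = """f f
t͡ʃ t͡ʃ
<UNK> <UNK>"""

def load_lexicon(text: str) -> dict:
    lexicon = {}
    for line in text.splitlines():
        word, *phones = line.split()
        lexicon[word] = phones
    return lexicon

lex = load_lexicon(SAMPLE)
print(lex["t͡ʃ"])  # -> ['t͡ʃ']
```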
data/lang_phone/tokens.txt
ADDED
@@ -0,0 +1,39 @@
<eps> 0
s 1
ð 2
ᵑg 3
ᶬv 4
ʃ 5
ɔ 6
x 7
t 8
ɛ 9
v 10
ⁿɗ͡ʒ 11
f 12
n 13
| 14
ⁿz 15
k 16
h 17
t͡ʃ 18
<UNK> 19
ɗ 20
z 21
m 22
ʄ 23
ɠ 24
θ 25
j 26
ᵐɓ 27
u 28
ɣ 29
ɓ 30
i 31
l 32
ɾ 33
ⁿɗ 34
w 35
p 36
ɑ 37
#0 38
data/lang_phone/words.txt
ADDED
@@ -0,0 +1,41 @@
<eps> 0
<UNK> 1
f 2
h 3
i 4
j 5
k 6
l 7
m 8
n 9
p 10
s 11
t 12
t͡ʃ 13
u 14
v 15
w 16
x 17
z 18
| 19
ð 20
ɑ 21
ɓ 22
ɔ 23
ɗ 24
ɛ 25
ɠ 26
ɣ 27
ɾ 28
ʃ 29
ʄ 30
θ 31
ᵐɓ 32
ᵑg 33
ᶬv 34
ⁿz 35
ⁿɗ 36
ⁿɗ͡ʒ 37
#0 38
<s> 39
</s> 40
exp-causal/ctc-decoding/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff

exp-causal/ctc-decoding/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff
exp-causal/ctc-decoding/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-10-57-38
ADDED
@@ -0,0 +1,7 @@
2024-03-07 10:57:38,784 INFO [ctc_decode.py:631] Decoding started
2024-03-07 10:57:38,784 INFO [ctc_decode.py:637] Device: cuda:0
2024-03-07 10:57:38,784 INFO [ctc_decode.py:638] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-dirty', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'frame_shift_ms': 10, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'context_size': 2, 'decoding_method': 'ctc-decoding', 'num_paths': 100, 'nbest_scale': 1.0, 'hlg_scale': 0.6, 'lm_dir': PosixPath('data/lm'), 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-causal/ctc-decoding'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model'}
2024-03-07 10:57:38,784 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 10:57:38,948 INFO [ctc_decode.py:713] About to create model
2024-03-07 10:57:39,170 INFO [ctc_decode.py:780] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 10:57:39,809 INFO [ctc_decode.py:797] Number of model parameters: 65182863
exp-causal/ctc-decoding/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-10-59-38
ADDED
@@ -0,0 +1,37 @@
2024-03-07 10:59:38,354 INFO [ctc_decode.py:621] Decoding started
2024-03-07 10:59:38,354 INFO [ctc_decode.py:627] Device: cuda:0
2024-03-07 10:59:38,354 INFO [ctc_decode.py:628] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-dirty', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'frame_shift_ms': 10, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'context_size': 2, 'decoding_method': 'ctc-decoding', 'num_paths': 100, 'nbest_scale': 1.0, 'hlg_scale': 0.6, 'lm_dir': PosixPath('data/lm'), 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-causal/ctc-decoding'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model'}
2024-03-07 10:59:38,355 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 10:59:38,522 INFO [ctc_decode.py:701] About to create model
2024-03-07 10:59:38,744 INFO [ctc_decode.py:756] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 10:59:39,391 INFO [ctc_decode.py:772] Number of model parameters: 65182863
2024-03-07 10:59:39,392 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 10:59:39,392 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 10:59:39,392 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 10:59:39,392 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 10:59:39,992 INFO [ctc_decode.py:542] batch 0/?, cuts processed until now is 11
2024-03-07 10:59:44,584 INFO [ctc_decode.py:556] The transcripts are stored in zipformer/exp-causal/ctc-decoding/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 10:59:44,625 INFO [utils.py:656] [test-fleurs-ctc-decoding] %WER 6.72% [4137 / 61587, 1757 ins, 1036 del, 1344 sub ]
2024-03-07 10:59:44,719 INFO [ctc_decode.py:565] Wrote detailed error stats to zipformer/exp-causal/ctc-decoding/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 10:59:44,719 INFO [ctc_decode.py:579]
For test-fleurs, WER of different settings are:
ctc-decoding	6.72	best for test-fleurs

2024-03-07 10:59:45,379 INFO [ctc_decode.py:542] batch 0/?, cuts processed until now is 28
2024-03-07 10:59:52,644 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4930, 1.6467, 1.4691, 1.4399, 1.6736, 1.6094, 1.5041, 1.7942],
       device='cuda:0')
2024-03-07 10:59:53,068 INFO [ctc_decode.py:542] batch 100/?, cuts processed until now is 3210
2024-03-07 10:59:57,567 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.6248, 2.7026, 2.6528, 1.9603], device='cuda:0')
2024-03-07 10:59:58,159 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1807, 3.0239, 2.3377, 2.7052], device='cuda:0')
2024-03-07 10:59:58,511 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.7430, 1.7438, 1.6404, 1.8630, 1.9495, 1.9508, 2.0193, 1.6035],
       device='cuda:0')
2024-03-07 11:00:00,641 INFO [ctc_decode.py:542] batch 200/?, cuts processed until now is 6582
2024-03-07 11:00:08,194 INFO [ctc_decode.py:542] batch 300/?, cuts processed until now is 9972
2024-03-07 11:00:13,903 INFO [ctc_decode.py:556] The transcripts are stored in zipformer/exp-causal/ctc-decoding/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 11:00:14,287 INFO [utils.py:656] [test-commonvoice-ctc-decoding] %WER 7.78% [48615 / 624874, 12396 ins, 20104 del, 16115 sub ]
2024-03-07 11:00:15,129 INFO [ctc_decode.py:565] Wrote detailed error stats to zipformer/exp-causal/ctc-decoding/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 11:00:15,130 INFO [ctc_decode.py:579]
For test-commonvoice, WER of different settings are:
ctc-decoding	7.78	best for test-commonvoice

2024-03-07 11:00:15,130 INFO [ctc_decode.py:806] Done!
exp-causal/ctc-decoding/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff

exp-causal/ctc-decoding/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff
exp-causal/ctc-decoding/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
ctc-decoding	7.78
exp-causal/ctc-decoding/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
ctc-decoding	6.72
exp-causal/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff

exp-causal/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff
exp-causal/fast_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model-2024-03-07-08-39-05
ADDED
@@ -0,0 +1,66 @@
2024-03-07 08:39:05,727 INFO [decode.py:764] Decoding started
2024-03-07 08:39:05,727 INFO [decode.py:770] Device: cuda:0
2024-03-07 08:39:05,727 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:39:05,728 INFO [decode.py:778] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-clean', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'fast_beam_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_shallow_fusion': False, 'lm_type': 'rnn', 'lm_scale': 0.3, 'tokens_ngram': 3, 'backoff_id': 500, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': 
2024-03-07 08:39:05,727 INFO [decode.py:764] Decoding started
2024-03-07 08:39:05,727 INFO [decode.py:770] Device: cuda:0
2024-03-07 08:39:05,727 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:39:05,728 INFO [decode.py:778] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-clean', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'fast_beam_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_shallow_fusion': False, 'lm_type': 'rnn', 'lm_scale': 0.3, 'tokens_ngram': 3, 'backoff_id': 500, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': 
True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'lm_vocab_size': 500, 'lm_epoch': 7, 'lm_avg': 1, 'lm_exp_dir': None, 'rnn_lm_embedding_dim': 2048, 'rnn_lm_hidden_dim': 2048, 'rnn_lm_num_layers': 3, 'rnn_lm_tie_weights': True, 'transformer_lm_exp_dir': None, 'transformer_lm_dim_feedforward': 2048, 'transformer_lm_encoder_dim': 768, 'transformer_lm_embedding_dim': 768, 'transformer_lm_nhead': 8, 'transformer_lm_num_layers': 16, 'transformer_lm_tie_weights': True, 'res_dir': PosixPath('zipformer/exp-causal/fast_beam_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:39:05,728 INFO [decode.py:780] About to create model
2024-03-07 08:39:05,976 INFO [decode.py:847] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 08:39:06,839 INFO [decode.py:908] Number of model parameters: 65182863
2024-03-07 08:39:06,839 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:39:06,839 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:39:06,839 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:39:06,839 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:39:07,885 INFO [decode.py:651] batch 0/?, cuts processed until now is 11
2024-03-07 08:39:09,245 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.2843, 3.2412, 2.7568, 2.6566], device='cuda:0')
2024-03-07 08:39:17,182 INFO [decode.py:651] batch 20/?, cuts processed until now is 270
2024-03-07 08:39:26,436 INFO [decode.py:651] batch 40/?, cuts processed until now is 487
2024-03-07 08:39:26,499 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:39:26,539 INFO [utils.py:656] [test-fleurs-beam_20.0_max_contexts_8_max_states_64] %WER 6.61% [4072 / 61587, 1548 ins, 1229 del, 1295 sub ]
2024-03-07 08:39:26,632 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:39:26,633 INFO [decode.py:690]
For test-fleurs, WER of different settings are:
beam_20.0_max_contexts_8_max_states_64	6.61	best for test-fleurs

2024-03-07 08:39:27,522 INFO [decode.py:651] batch 0/?, cuts processed until now is 28
2024-03-07 08:39:31,747 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.3396, 2.5719, 2.6157, 2.8669], device='cuda:0')
2024-03-07 08:39:32,952 INFO [decode.py:651] batch 20/?, cuts processed until now is 628
2024-03-07 08:39:38,196 INFO [decode.py:651] batch 40/?, cuts processed until now is 1253
2024-03-07 08:39:43,193 INFO [decode.py:651] batch 60/?, cuts processed until now is 1940
2024-03-07 08:39:45,893 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1773, 2.6102, 2.7401, 2.4829], device='cuda:0')
2024-03-07 08:39:46,485 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.7060, 2.8868, 2.8695, 2.0508], device='cuda:0')
2024-03-07 08:39:48,212 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.4578, 4.5392, 4.1744, 4.1141], device='cuda:0')
2024-03-07 08:39:48,733 INFO [decode.py:651] batch 80/?, cuts processed until now is 2513
2024-03-07 08:39:49,794 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1706, 2.6006, 2.7836, 2.5051], device='cuda:0')
2024-03-07 08:39:51,031 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.5083, 4.5611, 4.1912, 4.1362], device='cuda:0')
2024-03-07 08:39:52,617 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.1998, 2.3700, 2.7924, 2.7577], device='cuda:0')
2024-03-07 08:39:53,711 INFO [decode.py:651] batch 100/?, cuts processed until now is 3210
2024-03-07 08:39:59,086 INFO [decode.py:651] batch 120/?, cuts processed until now is 3814
2024-03-07 08:40:04,040 INFO [decode.py:651] batch 140/?, cuts processed until now is 4529
2024-03-07 08:40:08,988 INFO [decode.py:651] batch 160/?, cuts processed until now is 5256
2024-03-07 08:40:09,916 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.5227, 4.0025, 4.1148, 4.3108], device='cuda:0')
2024-03-07 08:40:11,126 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4430, 1.6413, 1.6686, 1.8403, 1.5950, 1.9201, 2.0724, 1.6969],
device='cuda:0')
2024-03-07 08:40:14,084 INFO [decode.py:651] batch 180/?, cuts processed until now is 5927
2024-03-07 08:40:18,458 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.8823, 3.2936, 3.0880, 3.2756], device='cuda:0')
2024-03-07 08:40:19,214 INFO [decode.py:651] batch 200/?, cuts processed until now is 6582
2024-03-07 08:40:22,724 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.5198, 1.6440, 1.5930, 1.8840, 1.5783, 1.8650, 2.0609, 1.6564],
device='cuda:0')
2024-03-07 08:40:24,453 INFO [decode.py:651] batch 220/?, cuts processed until now is 7221
2024-03-07 08:40:29,626 INFO [decode.py:651] batch 240/?, cuts processed until now is 7878
2024-03-07 08:40:33,319 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.3668, 2.7808, 2.7429, 2.0601], device='cuda:0')
2024-03-07 08:40:34,832 INFO [decode.py:651] batch 260/?, cuts processed until now is 8528
2024-03-07 08:40:39,681 INFO [decode.py:651] batch 280/?, cuts processed until now is 9263
2024-03-07 08:40:44,602 INFO [decode.py:651] batch 300/?, cuts processed until now is 9972
2024-03-07 08:40:49,952 INFO [decode.py:651] batch 320/?, cuts processed until now is 10574
2024-03-07 08:40:51,376 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.0097, 2.9823, 2.5350, 2.4735], device='cuda:0')
2024-03-07 08:40:54,529 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4511, 1.6635, 1.4021, 1.3549, 1.6671, 1.8209, 1.7242, 1.7077],
device='cuda:0')
2024-03-07 08:40:54,975 INFO [decode.py:651] batch 340/?, cuts processed until now is 11255
2024-03-07 08:41:00,108 INFO [decode.py:651] batch 360/?, cuts processed until now is 11900
2024-03-07 08:41:03,743 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:41:04,136 INFO [utils.py:656] [test-commonvoice-beam_20.0_max_contexts_8_max_states_64] %WER 7.73% [48322 / 624874, 9678 ins, 23552 del, 15092 sub ]
2024-03-07 08:41:04,996 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:41:04,997 INFO [decode.py:690]
For test-commonvoice, WER of different settings are:
beam_20.0_max_contexts_8_max_states_64	7.73	best for test-commonvoice

2024-03-07 08:41:04,997 INFO [decode.py:944] Done!

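Each `%WER` line in these logs reports total errors over reference words, broken down into insertion, deletion, and substitution counts. A minimal sketch (not the icefall implementation; function name is illustrative) that reproduces the headline figure from the bracketed counts:

```python
def wer_percent(ins: int, dels: int, subs: int, ref_words: int) -> float:
    """Word error rate in percent: (insertions + deletions + substitutions) / reference words."""
    errors = ins + dels + subs
    return round(100.0 * errors / ref_words, 2)

# Counts from the test-fleurs fast_beam_search line:
# %WER 6.61% [4072 / 61587, 1548 ins, 1229 del, 1295 sub ]
print(wer_percent(1548, 1229, 1295, 61587))  # → 6.61
```

Note that WER can exceed 100% when insertions are numerous, since errors are not bounded by the reference length.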
exp-causal/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/fast_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_20.0_max_contexts_8_max_states_64	7.73

exp-causal/fast_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_20.0_max_contexts_8_max_states_64	6.61

exp-causal/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/greedy_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model-2024-03-07-08-38-15
ADDED
@@ -0,0 +1,46 @@
2024-03-07 08:38:15,365 INFO [decode.py:764] Decoding started
2024-03-07 08:38:15,365 INFO [decode.py:770] Device: cuda:0
2024-03-07 08:38:15,366 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:38:15,369 INFO [decode.py:778] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-clean', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'greedy_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_shallow_fusion': False, 'lm_type': 'rnn', 'lm_scale': 0.3, 'tokens_ngram': 3, 'backoff_id': 500, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': 
True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'lm_vocab_size': 500, 'lm_epoch': 7, 'lm_avg': 1, 'lm_exp_dir': None, 'rnn_lm_embedding_dim': 2048, 'rnn_lm_hidden_dim': 2048, 'rnn_lm_num_layers': 3, 'rnn_lm_tie_weights': True, 'transformer_lm_exp_dir': None, 'transformer_lm_dim_feedforward': 2048, 'transformer_lm_encoder_dim': 768, 'transformer_lm_embedding_dim': 768, 'transformer_lm_nhead': 8, 'transformer_lm_num_layers': 16, 'transformer_lm_tie_weights': True, 'res_dir': PosixPath('zipformer/exp-causal/greedy_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:38:15,369 INFO [decode.py:780] About to create model
2024-03-07 08:38:15,616 INFO [decode.py:847] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 08:38:16,521 INFO [decode.py:908] Number of model parameters: 65182863
2024-03-07 08:38:16,521 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:38:16,521 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:38:16,522 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:38:16,522 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:38:17,254 INFO [decode.py:651] batch 0/?, cuts processed until now is 11
2024-03-07 08:38:21,746 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.6919, 1.8132, 1.6442, 1.9278, 2.0388, 1.9352, 2.1089, 1.7078],
device='cuda:0')
2024-03-07 08:38:23,390 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.4804, 4.7386, 4.4316, 4.4934], device='cuda:0')
2024-03-07 08:38:23,574 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/greedy_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
2024-03-07 08:38:23,615 INFO [utils.py:656] [test-fleurs-greedy_search] %WER 6.58% [4054 / 61587, 1612 ins, 1154 del, 1288 sub ]
2024-03-07 08:38:23,708 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
2024-03-07 08:38:23,708 INFO [decode.py:690]
For test-fleurs, WER of different settings are:
greedy_search	6.58	best for test-fleurs

2024-03-07 08:38:24,424 INFO [decode.py:651] batch 0/?, cuts processed until now is 28
2024-03-07 08:38:26,567 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.7743, 3.4609, 3.1923, 3.4857], device='cuda:0')
2024-03-07 08:38:29,582 INFO [decode.py:651] batch 50/?, cuts processed until now is 1611
2024-03-07 08:38:34,671 INFO [decode.py:651] batch 100/?, cuts processed until now is 3210
2024-03-07 08:38:38,604 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.8230, 2.9017, 2.8725, 2.2266], device='cuda:0')
2024-03-07 08:38:39,685 INFO [decode.py:651] batch 150/?, cuts processed until now is 4896
2024-03-07 08:38:44,671 INFO [decode.py:651] batch 200/?, cuts processed until now is 6582
2024-03-07 08:38:49,727 INFO [decode.py:651] batch 250/?, cuts processed until now is 8173
2024-03-07 08:38:50,536 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.2378, 3.0733, 2.4014, 2.7650], device='cuda:0')
2024-03-07 08:38:50,823 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.5433, 1.8018, 1.6968, 1.5210, 1.7601, 1.7079, 1.5636, 1.8637],
device='cuda:0')
2024-03-07 08:38:51,210 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.0157, 2.9472, 2.5216, 2.4810], device='cuda:0')
2024-03-07 08:38:54,561 INFO [decode.py:651] batch 300/?, cuts processed until now is 9972
2024-03-07 08:38:55,938 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.0921, 4.2424, 3.6664, 3.7351], device='cuda:0')
2024-03-07 08:38:59,584 INFO [decode.py:651] batch 350/?, cuts processed until now is 11592
2024-03-07 08:39:00,118 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.5326, 4.0145, 4.0918, 4.3280], device='cuda:0')
2024-03-07 08:39:02,056 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
2024-03-07 08:39:02,445 INFO [utils.py:656] [test-commonvoice-greedy_search] %WER 7.71% [48192 / 624874, 10414 ins, 21811 del, 15967 sub ]
2024-03-07 08:39:03,298 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
2024-03-07 08:39:03,298 INFO [decode.py:690]
For test-commonvoice, WER of different settings are:
greedy_search	7.71	best for test-commonvoice

2024-03-07 08:39:03,299 INFO [decode.py:944] Done!

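The "Calculating the averaged model over epoch range from 33 (excluded) to 40" lines reflect `--epoch 40 --avg 7 --use-averaged-model`: the decoded model is an average of the parameters seen over epochs 34 through 40. icefall derives this from running parameter averages stored in the checkpoints, but the net effect can be sketched as a plain uniform average (toy sketch only, with plain Python lists standing in for tensors; names are illustrative):

```python
def average_checkpoints(state_dicts):
    """Uniformly average matching parameter vectors across checkpoints."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

# Toy "checkpoints" standing in for the epoch-34..40 models (just 2 here):
ckpt_a = {"encoder.weight": [1.0, 3.0]}
ckpt_b = {"encoder.weight": [3.0, 5.0]}
print(average_checkpoints([ckpt_a, ckpt_b]))  # → {'encoder.weight': [2.0, 4.0]}
```

Averaging the last several epochs typically smooths out per-epoch noise and gives a small WER improvement over any single checkpoint.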
exp-causal/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/greedy_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/greedy_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
greedy_search	7.71

exp-causal/greedy_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-context-2-max-sym-per-frame-1-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
greedy_search	6.58

exp-causal/jit_script_chunk_32_left_128.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1f7af6590fca59f1026cad0a0a2abee9332f146142d8e5c6d0b4975b6ce35f97
size 263594678

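The `.pt` checkpoints in this commit are stored via Git LFS, so the repo itself only holds a three-line pointer file (`version`, `oid`, `size`, each a space-separated key/value pair). A small sketch of reading such a pointer (the parsing helper is illustrative, not part of Git LFS tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file: one 'key value' pair per line."""
    out = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        out[key] = value
    return out

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1f7af6590fca59f1026cad0a0a2abee9332f146142d8e5c6d0b4975b6ce35f97
size 263594678"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # → 263594678
```

The `oid` is the SHA-256 of the real file content, which Git LFS fetches from the LFS store on checkout; `size` is its byte count (~263 MB for the jit-scripted streaming model here).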
exp-causal/modified_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/modified_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/modified_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model-2024-03-07-08-41-07
ADDED
@@ -0,0 +1,56 @@
2024-03-07 08:41:07,436 INFO [decode.py:764] Decoding started
2024-03-07 08:41:07,436 INFO [decode.py:770] Device: cuda:0
2024-03-07 08:41:07,437 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:41:07,437 INFO [decode.py:778] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-clean', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'modified_beam_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_shallow_fusion': False, 'lm_type': 'rnn', 'lm_scale': 0.3, 'tokens_ngram': 3, 'backoff_id': 500, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 
'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'lm_vocab_size': 500, 'lm_epoch': 7, 'lm_avg': 1, 'lm_exp_dir': None, 'rnn_lm_embedding_dim': 2048, 'rnn_lm_hidden_dim': 2048, 'rnn_lm_num_layers': 3, 'rnn_lm_tie_weights': True, 'transformer_lm_exp_dir': None, 'transformer_lm_dim_feedforward': 2048, 'transformer_lm_encoder_dim': 768, 'transformer_lm_embedding_dim': 768, 'transformer_lm_nhead': 8, 'transformer_lm_num_layers': 16, 'transformer_lm_tie_weights': True, 'res_dir': PosixPath('zipformer/exp-causal/modified_beam_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:41:07,438 INFO [decode.py:780] About to create model
2024-03-07 08:41:07,691 INFO [decode.py:847] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 08:41:08,529 INFO [decode.py:908] Number of model parameters: 65182863
2024-03-07 08:41:08,529 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:41:08,529 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:41:08,530 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:41:08,530 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:41:10,007 INFO [decode.py:651] batch 0/?, cuts processed until now is 11
2024-03-07 08:41:24,636 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.9650, 3.6353, 3.4530, 3.6656], device='cuda:0')
2024-03-07 08:41:27,015 INFO [decode.py:651] batch 20/?, cuts processed until now is 270
2024-03-07 08:41:41,955 INFO [decode.py:651] batch 40/?, cuts processed until now is 487
2024-03-07 08:41:42,017 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/modified_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
2024-03-07 08:41:42,058 INFO [utils.py:656] [test-fleurs-beam_size_4] %WER 6.40% [3942 / 61587, 1687 ins, 950 del, 1305 sub ]
2024-03-07 08:41:42,153 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/modified_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
2024-03-07 08:41:42,153 INFO [decode.py:690]
For test-fleurs, WER of different settings are:
beam_size_4	6.4	best for test-fleurs

2024-03-07 08:41:43,500 INFO [decode.py:651] batch 0/?, cuts processed until now is 28
2024-03-07 08:41:57,852 INFO [decode.py:651] batch 20/?, cuts processed until now is 628
2024-03-07 08:42:07,812 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.1801, 2.3482, 2.7994, 2.7716], device='cuda:0')
2024-03-07 08:42:12,022 INFO [decode.py:651] batch 40/?, cuts processed until now is 1253
2024-03-07 08:42:21,068 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.6309, 1.6829, 1.5267, 1.7630, 1.8531, 1.8195, 1.8756, 1.6031],
device='cuda:0')
2024-03-07 08:42:25,949 INFO [decode.py:651] batch 60/?, cuts processed until now is 1940
2024-03-07 08:42:36,123 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.2243, 3.0776, 2.3565, 2.7223], device='cuda:0')
2024-03-07 08:42:40,387 INFO [decode.py:651] batch 80/?, cuts processed until now is 2513
2024-03-07 08:42:54,535 INFO [decode.py:651] batch 100/?, cuts processed until now is 3210
2024-03-07 08:43:08,702 INFO [decode.py:651] batch 120/?, cuts processed until now is 3814
2024-03-07 08:43:22,478 INFO [decode.py:651] batch 140/?, cuts processed until now is 4529
2024-03-07 08:43:30,734 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.2739, 2.6799, 2.8831, 2.6336], device='cuda:0')
2024-03-07 08:43:36,235 INFO [decode.py:651] batch 160/?, cuts processed until now is 5256
2024-03-07 08:43:38,339 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.0916, 2.5364, 2.6418, 2.3536], device='cuda:0')
2024-03-07 08:43:50,472 INFO [decode.py:651] batch 180/?, cuts processed until now is 5927
2024-03-07 08:44:04,600 INFO [decode.py:651] batch 200/?, cuts processed until now is 6582
2024-03-07 08:44:18,698 INFO [decode.py:651] batch 220/?, cuts processed until now is 7221
2024-03-07 08:44:32,720 INFO [decode.py:651] batch 240/?, cuts processed until now is 7878
2024-03-07 08:44:46,812 INFO [decode.py:651] batch 260/?, cuts processed until now is 8528
2024-03-07 08:45:00,678 INFO [decode.py:651] batch 280/?, cuts processed until now is 9263
2024-03-07 08:45:02,883 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.2417, 2.2166, 2.4622, 2.6655], device='cuda:0')
2024-03-07 08:45:14,478 INFO [decode.py:651] batch 300/?, cuts processed until now is 9972
2024-03-07 08:45:28,719 INFO [decode.py:651] batch 320/?, cuts processed until now is 10574
2024-03-07 08:45:42,779 INFO [decode.py:651] batch 340/?, cuts processed until now is 11255
2024-03-07 08:45:56,922 INFO [decode.py:651] batch 360/?, cuts processed until now is 11900
2024-03-07 08:46:05,025 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/modified_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
2024-03-07 08:46:05,434 INFO [utils.py:656] [test-commonvoice-beam_size_4] %WER 7.53% [47039 / 624874, 11050 ins, 20152 del, 15837 sub ]
2024-03-07 08:46:06,304 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/modified_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
2024-03-07 08:46:06,304 INFO [decode.py:690]
For test-commonvoice, WER of different settings are:
beam_size_4	7.53	best for test-commonvoice

2024-03-07 08:46:06,304 INFO [decode.py:944] Done!

exp-causal/modified_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/modified_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
The diff for this file is too large to render.

exp-causal/modified_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_size_4	7.53

exp-causal/modified_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-modified_beam_search-beam-size-4-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_size_4	6.4

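Collecting the wer-summary files above into one place, a short sketch comparing the four offline decoding methods on both test sets (the numbers are taken directly from the logs in this commit; the helper is illustrative):

```python
# WERs (%) from the wer-summary files under exp-causal/
results = {
    "ctc-decoding":         {"test-commonvoice": 7.78, "test-fleurs": 6.72},
    "fast_beam_search":     {"test-commonvoice": 7.73, "test-fleurs": 6.61},
    "greedy_search":        {"test-commonvoice": 7.71, "test-fleurs": 6.58},
    "modified_beam_search": {"test-commonvoice": 7.53, "test-fleurs": 6.40},
}

def best_method(results: dict, test_set: str) -> str:
    """Return the decoding method with the lowest WER on the given test set."""
    return min(results, key=lambda m: results[m][test_set])

print(best_method(results, "test-fleurs"))  # → modified_beam_search
```

On these runs modified_beam_search gives the lowest WER on both sets, at the cost of noticeably longer decoding time than greedy_search (compare the batch timestamps in the two logs).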
exp-causal/pretrained.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1a57d5529d997fe029a18bccae562257f58bdc66d4eb2648dc4526c66618e8d8
size 261184016
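`exp-causal/pretrained.pt` is tracked with Git LFS, so the commit stores only the three-line pointer above (spec version, SHA-256 oid, byte size) rather than the ~261 MB checkpoint itself. A small sketch of reading such a pointer (`parse_lfs_pointer` is a hypothetical helper, not part of git-lfs):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1a57d5529d997fe029a18bccae562257f58bdc66d4eb2648dc4526c66618e8d8
size 261184016"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 261184016
```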
exp-causal/streaming/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.

exp-causal/streaming/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.
exp-causal/streaming/fast_beam_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model-2024-03-07-08-56-46
ADDED
@@ -0,0 +1,154 @@
2024-03-07 08:56:46,168 INFO [streaming_decode.py:723] Decoding started
2024-03-07 08:56:46,168 INFO [streaming_decode.py:729] Device: cuda:0
2024-03-07 08:56:46,168 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:56:46,170 INFO [streaming_decode.py:737] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-dirty', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'fast_beam_search', 'num_active_paths': 4, 'beam': 4, 'max_contexts': 4, 'max_states': 32, 'context_size': 2, 'num_decode_streams': 1000, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-causal/streaming/fast_beam_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:56:46,170 INFO [streaming_decode.py:739] About to create model
2024-03-07 08:56:46,400 INFO [streaming_decode.py:806] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 08:56:47,216 INFO [streaming_decode.py:828] Number of model parameters: 65182863
2024-03-07 08:56:47,216 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:56:47,216 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:56:47,217 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:56:47,217 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:56:47,250 INFO [streaming_decode.py:615] Cuts processed until now is 0.
2024-03-07 08:56:47,505 INFO [streaming_decode.py:615] Cuts processed until now is 100.
2024-03-07 08:56:47,761 INFO [streaming_decode.py:615] Cuts processed until now is 200.
2024-03-07 08:56:48,125 INFO [streaming_decode.py:615] Cuts processed until now is 300.
2024-03-07 08:56:48,388 INFO [streaming_decode.py:615] Cuts processed until now is 400.
2024-03-07 08:56:59,328 INFO [streaming_decode.py:660] The transcripts are stored in zipformer/exp-causal/streaming/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
2024-03-07 08:56:59,374 INFO [utils.py:656] [test-fleurs-beam_4_max_contexts_4_max_states_32] %WER 6.44% [3966 / 61587, 1562 ins, 1114 del, 1290 sub ]
2024-03-07 08:56:59,473 INFO [streaming_decode.py:671] Wrote detailed error stats to zipformer/exp-causal/streaming/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
2024-03-07 08:56:59,473 INFO [streaming_decode.py:685]
For test-fleurs, WER of different settings are:
beam_4_max_contexts_4_max_states_32	6.44	best for test-fleurs

2024-03-07 08:56:59,533 INFO [streaming_decode.py:615] Cuts processed until now is 0.
2024-03-07 08:56:59,987 INFO [streaming_decode.py:615] Cuts processed until now is 100.
2024-03-07 08:57:00,402 INFO [streaming_decode.py:615] Cuts processed until now is 200.
2024-03-07 08:57:00,805 INFO [streaming_decode.py:615] Cuts processed until now is 300.
2024-03-07 08:57:01,205 INFO [streaming_decode.py:615] Cuts processed until now is 400.
2024-03-07 08:57:01,610 INFO [streaming_decode.py:615] Cuts processed until now is 500.
2024-03-07 08:57:02,035 INFO [streaming_decode.py:615] Cuts processed until now is 600.
2024-03-07 08:57:02,444 INFO [streaming_decode.py:615] Cuts processed until now is 700.
2024-03-07 08:57:02,848 INFO [streaming_decode.py:615] Cuts processed until now is 800.
2024-03-07 08:57:03,267 INFO [streaming_decode.py:615] Cuts processed until now is 900.
2024-03-07 08:57:06,208 INFO [streaming_decode.py:615] Cuts processed until now is 1000.
2024-03-07 08:57:07,867 INFO [streaming_decode.py:615] Cuts processed until now is 1100.
2024-03-07 08:57:09,022 INFO [streaming_decode.py:615] Cuts processed until now is 1200.
2024-03-07 08:57:10,227 INFO [streaming_decode.py:615] Cuts processed until now is 1300.
2024-03-07 08:57:11,297 INFO [streaming_decode.py:615] Cuts processed until now is 1400.
2024-03-07 08:57:12,537 INFO [streaming_decode.py:615] Cuts processed until now is 1500.
2024-03-07 08:57:13,834 INFO [streaming_decode.py:615] Cuts processed until now is 1600.
2024-03-07 08:57:14,242 INFO [streaming_decode.py:615] Cuts processed until now is 1700.
2024-03-07 08:57:15,348 INFO [streaming_decode.py:615] Cuts processed until now is 1800.
2024-03-07 08:57:16,629 INFO [streaming_decode.py:615] Cuts processed until now is 1900.
2024-03-07 08:57:17,927 INFO [streaming_decode.py:615] Cuts processed until now is 2000.
2024-03-07 08:57:19,033 INFO [streaming_decode.py:615] Cuts processed until now is 2100.
2024-03-07 08:57:20,293 INFO [streaming_decode.py:615] Cuts processed until now is 2200.
2024-03-07 08:57:21,399 INFO [streaming_decode.py:615] Cuts processed until now is 2300.
2024-03-07 08:57:23,520 INFO [streaming_decode.py:615] Cuts processed until now is 2400.
2024-03-07 08:57:24,627 INFO [streaming_decode.py:615] Cuts processed until now is 2500.
2024-03-07 08:57:25,903 INFO [streaming_decode.py:615] Cuts processed until now is 2600.
2024-03-07 08:57:27,193 INFO [streaming_decode.py:615] Cuts processed until now is 2700.
2024-03-07 08:57:28,345 INFO [streaming_decode.py:615] Cuts processed until now is 2800.
2024-03-07 08:57:28,729 INFO [streaming_decode.py:615] Cuts processed until now is 2900.
2024-03-07 08:57:30,011 INFO [streaming_decode.py:615] Cuts processed until now is 3000.
2024-03-07 08:57:31,313 INFO [streaming_decode.py:615] Cuts processed until now is 3100.
2024-03-07 08:57:32,611 INFO [streaming_decode.py:615] Cuts processed until now is 3200.
2024-03-07 08:57:33,723 INFO [streaming_decode.py:615] Cuts processed until now is 3300.
2024-03-07 08:57:35,022 INFO [streaming_decode.py:615] Cuts processed until now is 3400.
2024-03-07 08:57:36,346 INFO [streaming_decode.py:615] Cuts processed until now is 3500.
2024-03-07 08:57:37,464 INFO [streaming_decode.py:615] Cuts processed until now is 3600.
2024-03-07 08:57:38,782 INFO [streaming_decode.py:615] Cuts processed until now is 3700.
2024-03-07 08:57:40,089 INFO [streaming_decode.py:615] Cuts processed until now is 3800.
2024-03-07 08:57:41,219 INFO [streaming_decode.py:615] Cuts processed until now is 3900.
2024-03-07 08:57:42,516 INFO [streaming_decode.py:615] Cuts processed until now is 4000.
2024-03-07 08:57:43,861 INFO [streaming_decode.py:615] Cuts processed until now is 4100.
2024-03-07 08:57:44,985 INFO [streaming_decode.py:615] Cuts processed until now is 4200.
2024-03-07 08:57:46,301 INFO [streaming_decode.py:615] Cuts processed until now is 4300.
2024-03-07 08:57:47,589 INFO [streaming_decode.py:615] Cuts processed until now is 4400.
2024-03-07 08:57:48,724 INFO [streaming_decode.py:615] Cuts processed until now is 4500.
2024-03-07 08:57:50,031 INFO [streaming_decode.py:615] Cuts processed until now is 4600.
2024-03-07 08:57:51,375 INFO [streaming_decode.py:615] Cuts processed until now is 4700.
2024-03-07 08:57:52,513 INFO [streaming_decode.py:615] Cuts processed until now is 4800.
2024-03-07 08:57:53,839 INFO [streaming_decode.py:615] Cuts processed until now is 4900.
2024-03-07 08:57:54,958 INFO [streaming_decode.py:615] Cuts processed until now is 5000.
2024-03-07 08:57:56,282 INFO [streaming_decode.py:615] Cuts processed until now is 5100.
2024-03-07 08:57:57,610 INFO [streaming_decode.py:615] Cuts processed until now is 5200.
2024-03-07 08:57:58,744 INFO [streaming_decode.py:615] Cuts processed until now is 5300.
2024-03-07 08:58:00,031 INFO [streaming_decode.py:615] Cuts processed until now is 5400.
2024-03-07 08:58:01,344 INFO [streaming_decode.py:615] Cuts processed until now is 5500.
2024-03-07 08:58:02,494 INFO [streaming_decode.py:615] Cuts processed until now is 5600.
2024-03-07 08:58:03,800 INFO [streaming_decode.py:615] Cuts processed until now is 5700.
2024-03-07 08:58:05,143 INFO [streaming_decode.py:615] Cuts processed until now is 5800.
2024-03-07 08:58:06,296 INFO [streaming_decode.py:615] Cuts processed until now is 5900.
2024-03-07 08:58:07,598 INFO [streaming_decode.py:615] Cuts processed until now is 6000.
2024-03-07 08:58:08,934 INFO [streaming_decode.py:615] Cuts processed until now is 6100.
2024-03-07 08:58:10,061 INFO [streaming_decode.py:615] Cuts processed until now is 6200.
2024-03-07 08:58:11,363 INFO [streaming_decode.py:615] Cuts processed until now is 6300.
2024-03-07 08:58:12,697 INFO [streaming_decode.py:615] Cuts processed until now is 6400.
2024-03-07 08:58:13,841 INFO [streaming_decode.py:615] Cuts processed until now is 6500.
2024-03-07 08:58:15,142 INFO [streaming_decode.py:615] Cuts processed until now is 6600.
2024-03-07 08:58:16,473 INFO [streaming_decode.py:615] Cuts processed until now is 6700.
2024-03-07 08:58:17,624 INFO [streaming_decode.py:615] Cuts processed until now is 6800.
2024-03-07 08:58:18,925 INFO [streaming_decode.py:615] Cuts processed until now is 6900.
2024-03-07 08:58:20,264 INFO [streaming_decode.py:615] Cuts processed until now is 7000.
2024-03-07 08:58:21,416 INFO [streaming_decode.py:615] Cuts processed until now is 7100.
2024-03-07 08:58:22,705 INFO [streaming_decode.py:615] Cuts processed until now is 7200.
2024-03-07 08:58:24,056 INFO [streaming_decode.py:615] Cuts processed until now is 7300.
2024-03-07 08:58:25,222 INFO [streaming_decode.py:615] Cuts processed until now is 7400.
2024-03-07 08:58:26,532 INFO [streaming_decode.py:615] Cuts processed until now is 7500.
2024-03-07 08:58:27,880 INFO [streaming_decode.py:615] Cuts processed until now is 7600.
2024-03-07 08:58:29,014 INFO [streaming_decode.py:615] Cuts processed until now is 7700.
2024-03-07 08:58:30,320 INFO [streaming_decode.py:615] Cuts processed until now is 7800.
2024-03-07 08:58:31,679 INFO [streaming_decode.py:615] Cuts processed until now is 7900.
2024-03-07 08:58:32,790 INFO [streaming_decode.py:615] Cuts processed until now is 8000.
2024-03-07 08:58:34,120 INFO [streaming_decode.py:615] Cuts processed until now is 8100.
2024-03-07 08:58:35,488 INFO [streaming_decode.py:615] Cuts processed until now is 8200.
2024-03-07 08:58:36,616 INFO [streaming_decode.py:615] Cuts processed until now is 8300.
2024-03-07 08:58:37,953 INFO [streaming_decode.py:615] Cuts processed until now is 8400.
2024-03-07 08:58:39,086 INFO [streaming_decode.py:615] Cuts processed until now is 8500.
2024-03-07 08:58:40,391 INFO [streaming_decode.py:615] Cuts processed until now is 8600.
2024-03-07 08:58:41,731 INFO [streaming_decode.py:615] Cuts processed until now is 8700.
2024-03-07 08:58:42,864 INFO [streaming_decode.py:615] Cuts processed until now is 8800.
2024-03-07 08:58:44,169 INFO [streaming_decode.py:615] Cuts processed until now is 8900.
2024-03-07 08:58:45,542 INFO [streaming_decode.py:615] Cuts processed until now is 9000.
2024-03-07 08:58:46,679 INFO [streaming_decode.py:615] Cuts processed until now is 9100.
2024-03-07 08:58:47,998 INFO [streaming_decode.py:615] Cuts processed until now is 9200.
2024-03-07 08:58:49,366 INFO [streaming_decode.py:615] Cuts processed until now is 9300.
2024-03-07 08:58:50,494 INFO [streaming_decode.py:615] Cuts processed until now is 9400.
2024-03-07 08:58:51,824 INFO [streaming_decode.py:615] Cuts processed until now is 9500.
2024-03-07 08:58:53,228 INFO [streaming_decode.py:615] Cuts processed until now is 9600.
2024-03-07 08:58:54,385 INFO [streaming_decode.py:615] Cuts processed until now is 9700.
2024-03-07 08:58:55,761 INFO [streaming_decode.py:615] Cuts processed until now is 9800.
2024-03-07 08:58:56,178 INFO [streaming_decode.py:615] Cuts processed until now is 9900.
2024-03-07 08:58:57,326 INFO [streaming_decode.py:615] Cuts processed until now is 10000.
2024-03-07 08:58:59,584 INFO [streaming_decode.py:615] Cuts processed until now is 10100.
2024-03-07 08:59:00,723 INFO [streaming_decode.py:615] Cuts processed until now is 10200.
2024-03-07 08:59:02,083 INFO [streaming_decode.py:615] Cuts processed until now is 10300.
2024-03-07 08:59:03,461 INFO [streaming_decode.py:615] Cuts processed until now is 10400.
2024-03-07 08:59:04,588 INFO [streaming_decode.py:615] Cuts processed until now is 10500.
2024-03-07 08:59:05,955 INFO [streaming_decode.py:615] Cuts processed until now is 10600.
2024-03-07 08:59:07,086 INFO [streaming_decode.py:615] Cuts processed until now is 10700.
2024-03-07 08:59:08,428 INFO [streaming_decode.py:615] Cuts processed until now is 10800.
2024-03-07 08:59:09,807 INFO [streaming_decode.py:615] Cuts processed until now is 10900.
2024-03-07 08:59:10,955 INFO [streaming_decode.py:615] Cuts processed until now is 11000.
2024-03-07 08:59:12,305 INFO [streaming_decode.py:615] Cuts processed until now is 11100.
2024-03-07 08:59:13,713 INFO [streaming_decode.py:615] Cuts processed until now is 11200.
2024-03-07 08:59:14,126 INFO [streaming_decode.py:615] Cuts processed until now is 11300.
2024-03-07 08:59:16,212 INFO [streaming_decode.py:615] Cuts processed until now is 11400.
2024-03-07 08:59:17,370 INFO [streaming_decode.py:615] Cuts processed until now is 11500.
2024-03-07 08:59:18,717 INFO [streaming_decode.py:615] Cuts processed until now is 11600.
2024-03-07 08:59:19,119 INFO [streaming_decode.py:615] Cuts processed until now is 11700.
2024-03-07 08:59:20,499 INFO [streaming_decode.py:615] Cuts processed until now is 11800.
2024-03-07 08:59:21,634 INFO [streaming_decode.py:615] Cuts processed until now is 11900.
2024-03-07 08:59:22,973 INFO [streaming_decode.py:615] Cuts processed until now is 12000.
2024-03-07 08:59:25,070 INFO [streaming_decode.py:615] Cuts processed until now is 12100.
2024-03-07 08:59:26,456 INFO [streaming_decode.py:615] Cuts processed until now is 12200.
2024-03-07 08:59:32,292 INFO [streaming_decode.py:660] The transcripts are stored in zipformer/exp-causal/streaming/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
2024-03-07 08:59:32,774 INFO [utils.py:656] [test-commonvoice-beam_4_max_contexts_4_max_states_32] %WER 7.72% [48248 / 624874, 9842 ins, 22929 del, 15477 sub ]
2024-03-07 08:59:33,754 INFO [streaming_decode.py:671] Wrote detailed error stats to zipformer/exp-causal/streaming/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
2024-03-07 08:59:33,754 INFO [streaming_decode.py:685]
For test-commonvoice, WER of different settings are:
beam_4_max_contexts_4_max_states_32	7.72	best for test-commonvoice

2024-03-07 08:59:33,754 INFO [streaming_decode.py:853] Done!
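The log line "Calculating the averaged model over epoch range from 33 (excluded) to 40" corresponds to decoding with parameters averaged over the last `--avg 7` checkpoints (`use_averaged_model=True`). As a rough illustration of the idea, the sketch below averages the entries of several state dicts elementwise; it uses plain Python lists as a stand-in for tensors, and icefall's actual implementation computes the average from running statistics stored in the checkpoints, so this is only an approximation of the technique, not the recipe's code:

```python
def average_state_dicts(state_dicts):
    """Elementwise average of per-key parameter lists across checkpoints."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

# Toy example: two "checkpoints", each with a single weight vector.
ckpts = [{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}]
print(average_state_dicts(ckpts)["w"])  # [2.0, 4.0]
```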
exp-causal/streaming/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.

exp-causal/streaming/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.
exp-causal/streaming/fast_beam_search/wer-summary-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_4_max_contexts_4_max_states_32	7.72
exp-causal/streaming/fast_beam_search/wer-summary-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-4-max-contexts-4-max-states-32-use-averaged-model.txt
ADDED
@@ -0,0 +1,2 @@
settings	WER
beam_4_max_contexts_4_max_states_32	6.44
exp-causal/streaming/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.

exp-causal/streaming/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED
The diff for this file is too large to render. See raw diff.
exp-causal/streaming/greedy_search/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-08-54-56
ADDED
@@ -0,0 +1,154 @@
2024-03-07 08:54:56,454 INFO [streaming_decode.py:723] Decoding started
2024-03-07 08:54:56,455 INFO [streaming_decode.py:729] Device: cuda:0
2024-03-07 08:54:56,455 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:54:56,457 INFO [streaming_decode.py:737] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-dirty', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'greedy_search', 'num_active_paths': 4, 'beam': 4, 'max_contexts': 4, 'max_states': 32, 'context_size': 2, 'num_decode_streams': 1000, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-causal/streaming/greedy_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:54:56,457 INFO [streaming_decode.py:739] About to create model
2024-03-07 08:54:56,690 INFO [streaming_decode.py:806] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 08:54:57,485 INFO [streaming_decode.py:828] Number of model parameters: 65182863
2024-03-07 08:54:57,485 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:54:57,485 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:54:57,486 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:54:57,486 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:54:57,520 INFO [streaming_decode.py:615] Cuts processed until now is 0.
2024-03-07 08:54:57,771 INFO [streaming_decode.py:615] Cuts processed until now is 100.
2024-03-07 08:54:58,025 INFO [streaming_decode.py:615] Cuts processed until now is 200.
2024-03-07 08:54:58,389 INFO [streaming_decode.py:615] Cuts processed until now is 300.
2024-03-07 08:54:58,649 INFO [streaming_decode.py:615] Cuts processed until now is 400.
2024-03-07 08:55:03,961 INFO [streaming_decode.py:660] The transcripts are stored in zipformer/exp-causal/streaming/greedy_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 08:55:04,004 INFO [utils.py:656] [test-fleurs-greedy_search] %WER 6.59% [4058 / 61587, 1608 ins, 1149 del, 1301 sub ]
2024-03-07 08:55:04,100 INFO [streaming_decode.py:671] Wrote detailed error stats to zipformer/exp-causal/streaming/greedy_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 08:55:04,101 INFO [streaming_decode.py:685]
For test-fleurs, WER of different settings are:
greedy_search	6.59	best for test-fleurs

2024-03-07 08:55:04,160 INFO [streaming_decode.py:615] Cuts processed until now is 0.
2024-03-07 08:55:04,586 INFO [streaming_decode.py:615] Cuts processed until now is 100.
2024-03-07 08:55:04,991 INFO [streaming_decode.py:615] Cuts processed until now is 200.
2024-03-07 08:55:05,379 INFO [streaming_decode.py:615] Cuts processed until now is 300.
2024-03-07 08:55:05,766 INFO [streaming_decode.py:615] Cuts processed until now is 400.
2024-03-07 08:55:06,160 INFO [streaming_decode.py:615] Cuts processed until now is 500.
2024-03-07 08:55:06,568 INFO [streaming_decode.py:615] Cuts processed until now is 600.
2024-03-07 08:55:06,968 INFO [streaming_decode.py:615] Cuts processed until now is 700.
2024-03-07 08:55:07,371 INFO [streaming_decode.py:615] Cuts processed until now is 800.
2024-03-07 08:55:07,784 INFO [streaming_decode.py:615] Cuts processed until now is 900.
2024-03-07 08:55:09,627 INFO [streaming_decode.py:615] Cuts processed until now is 1000.
2024-03-07 08:55:10,682 INFO [streaming_decode.py:615] Cuts processed until now is 1100.
2024-03-07 08:55:11,495 INFO [streaming_decode.py:615] Cuts processed until now is 1200.
2024-03-07 08:55:12,174 INFO [streaming_decode.py:615] Cuts processed until now is 1300.
2024-03-07 08:55:12,992 INFO [streaming_decode.py:615] Cuts processed until now is 1400.
2024-03-07 08:55:13,818 INFO [streaming_decode.py:615] Cuts processed until now is 1500.
2024-03-07 08:55:14,510 INFO [streaming_decode.py:615] Cuts processed until now is 1600.
2024-03-07 08:55:15,045 INFO [streaming_decode.py:615] Cuts processed until now is 1700.
2024-03-07 08:55:15,725 INFO [streaming_decode.py:615] Cuts processed until now is 1800.
2024-03-07 08:55:16,557 INFO [streaming_decode.py:615] Cuts processed until now is 1900.
2024-03-07 08:55:17,411 INFO [streaming_decode.py:615] Cuts processed until now is 2000.
2024-03-07 08:55:18,087 INFO [streaming_decode.py:615] Cuts processed until now is 2100.
2024-03-07 08:55:18,919 INFO [streaming_decode.py:615] Cuts processed until now is 2200.
2024-03-07 08:55:19,591 INFO [streaming_decode.py:615] Cuts processed until now is 2300.
2024-03-07 08:55:20,850 INFO [streaming_decode.py:615] Cuts processed until now is 2400.
2024-03-07 08:55:21,527 INFO [streaming_decode.py:615] Cuts processed until now is 2500.
2024-03-07 08:55:22,359 INFO [streaming_decode.py:615] Cuts processed until now is 2600.
2024-03-07 08:55:23,201 INFO [streaming_decode.py:615] Cuts processed until now is 2700.
2024-03-07 08:55:23,888 INFO [streaming_decode.py:615] Cuts processed until now is 2800.
2024-03-07 08:55:24,270 INFO [streaming_decode.py:615] Cuts processed until now is 2900.
2024-03-07 08:55:25,106 INFO [streaming_decode.py:615] Cuts processed until now is 3000.
2024-03-07 08:55:25,958 INFO [streaming_decode.py:615] Cuts processed until now is 3100.
2024-03-07 08:55:26,804 INFO [streaming_decode.py:615] Cuts processed until now is 3200.
2024-03-07 08:55:27,480 INFO [streaming_decode.py:615] Cuts processed until now is 3300.
2024-03-07 08:55:28,327 INFO [streaming_decode.py:615] Cuts processed until now is 3400.
2024-03-07 08:55:29,199 INFO [streaming_decode.py:615] Cuts processed until now is 3500.
2024-03-07 08:55:29,882 INFO [streaming_decode.py:615] Cuts processed until now is 3600.
2024-03-07 08:55:30,749 INFO [streaming_decode.py:615] Cuts processed until now is 3700.
2024-03-07 08:55:31,445 INFO [streaming_decode.py:615] Cuts processed until now is 3800.
2024-03-07 08:55:32,285 INFO [streaming_decode.py:615] Cuts processed until now is 3900.
2024-03-07 08:55:33,146 INFO [streaming_decode.py:615] Cuts processed until now is 4000.
2024-03-07 08:55:33,839 INFO [streaming_decode.py:615] Cuts processed until now is 4100.
2024-03-07 08:55:34,687 INFO [streaming_decode.py:615] Cuts processed until now is 4200.
2024-03-07 08:55:35,556 INFO [streaming_decode.py:615] Cuts processed until now is 4300.
2024-03-07 08:55:36,242 INFO [streaming_decode.py:615] Cuts processed until now is 4400.
2024-03-07 08:55:37,118 INFO [streaming_decode.py:615] Cuts processed until now is 4500.
2024-03-07 08:55:37,989 INFO [streaming_decode.py:615] Cuts processed until now is 4600.
2024-03-07 08:55:38,683 INFO [streaming_decode.py:615] Cuts processed until now is 4700.
2024-03-07 08:55:39,534 INFO [streaming_decode.py:615] Cuts processed until now is 4800.
2024-03-07 08:55:40,399 INFO [streaming_decode.py:615] Cuts processed until now is 4900.
2024-03-07 08:55:41,089 INFO [streaming_decode.py:615] Cuts processed until now is 5000.
2024-03-07 08:55:41,951 INFO [streaming_decode.py:615] Cuts processed until now is 5100.
2024-03-07 08:55:42,815 INFO [streaming_decode.py:615] Cuts processed until now is 5200.
2024-03-07 08:55:43,506 INFO [streaming_decode.py:615] Cuts processed until now is 5300.
2024-03-07 08:55:44,345 INFO [streaming_decode.py:615] Cuts processed until now is 5400.
2024-03-07 08:55:45,218 INFO [streaming_decode.py:615] Cuts processed until now is 5500.
2024-03-07 08:55:45,911 INFO [streaming_decode.py:615] Cuts processed until now is 5600.
2024-03-07 08:55:46,761 INFO [streaming_decode.py:615] Cuts processed until now is 5700.
2024-03-07 08:55:47,641 INFO [streaming_decode.py:615] Cuts processed until now is 5800.
2024-03-07 08:55:48,336 INFO [streaming_decode.py:615] Cuts processed until now is 5900.
2024-03-07 08:55:49,182 INFO [streaming_decode.py:615] Cuts processed until now is 6000.
2024-03-07 08:55:50,051 INFO [streaming_decode.py:615] Cuts processed until now is 6100.
2024-03-07 08:55:50,744 INFO [streaming_decode.py:615] Cuts processed until now is 6200.
2024-03-07 08:55:51,596 INFO [streaming_decode.py:615] Cuts processed until now is 6300.
2024-03-07 08:55:52,469 INFO [streaming_decode.py:615] Cuts processed until now is 6400.
2024-03-07 08:55:53,170 INFO [streaming_decode.py:615] Cuts processed until now is 6500.
2024-03-07 08:55:54,019 INFO [streaming_decode.py:615] Cuts processed until now is 6600.
2024-03-07 08:55:54,889 INFO [streaming_decode.py:615] Cuts processed until now is 6700.
2024-03-07 08:55:55,582 INFO [streaming_decode.py:615] Cuts processed until now is 6800.
2024-03-07 08:55:56,430 INFO [streaming_decode.py:615] Cuts processed until now is 6900.
2024-03-07 08:55:57,304 INFO [streaming_decode.py:615] Cuts processed until now is 7000.
2024-03-07 08:55:58,003 INFO [streaming_decode.py:615] Cuts processed until now is 7100.
2024-03-07 08:55:58,859 INFO [streaming_decode.py:615] Cuts processed until now is 7200.
2024-03-07 08:55:59,737 INFO [streaming_decode.py:615] Cuts processed until now is 7300.
2024-03-07 08:56:00,438 INFO [streaming_decode.py:615] Cuts processed until now is 7400.
2024-03-07 08:56:01,288 INFO [streaming_decode.py:615] Cuts processed until now is 7500.
2024-03-07 08:56:02,163 INFO [streaming_decode.py:615] Cuts processed until now is 7600.
2024-03-07 08:56:02,862 INFO [streaming_decode.py:615] Cuts processed until now is 7700.
2024-03-07 08:56:03,730 INFO [streaming_decode.py:615] Cuts processed until now is 7800.
2024-03-07 08:56:04,615 INFO [streaming_decode.py:615] Cuts processed until now is 7900.
2024-03-07 08:56:05,298 INFO [streaming_decode.py:615] Cuts processed until now is 8000.
2024-03-07 08:56:06,159 INFO [streaming_decode.py:615] Cuts processed until now is 8100.
2024-03-07 08:56:07,036 INFO [streaming_decode.py:615] Cuts processed until now is 8200.
|
107 |
+
2024-03-07 08:56:07,727 INFO [streaming_decode.py:615] Cuts processed until now is 8300.
|
108 |
+
2024-03-07 08:56:08,586 INFO [streaming_decode.py:615] Cuts processed until now is 8400.
|
109 |
+
2024-03-07 08:56:09,472 INFO [streaming_decode.py:615] Cuts processed until now is 8500.
|
110 |
+
2024-03-07 08:56:10,163 INFO [streaming_decode.py:615] Cuts processed until now is 8600.
|
111 |
+
2024-03-07 08:56:11,024 INFO [streaming_decode.py:615] Cuts processed until now is 8700.
|
112 |
+
2024-03-07 08:56:11,913 INFO [streaming_decode.py:615] Cuts processed until now is 8800.
|
113 |
+
2024-03-07 08:56:12,600 INFO [streaming_decode.py:615] Cuts processed until now is 8900.
|
114 |
+
2024-03-07 08:56:13,467 INFO [streaming_decode.py:615] Cuts processed until now is 9000.
|
115 |
+
2024-03-07 08:56:14,370 INFO [streaming_decode.py:615] Cuts processed until now is 9100.
|
116 |
+
2024-03-07 08:56:15,069 INFO [streaming_decode.py:615] Cuts processed until now is 9200.
|
117 |
+
2024-03-07 08:56:15,959 INFO [streaming_decode.py:615] Cuts processed until now is 9300.
|
118 |
+
2024-03-07 08:56:16,866 INFO [streaming_decode.py:615] Cuts processed until now is 9400.
|
119 |
+
2024-03-07 08:56:17,556 INFO [streaming_decode.py:615] Cuts processed until now is 9500.
|
120 |
+
2024-03-07 08:56:18,460 INFO [streaming_decode.py:615] Cuts processed until now is 9600.
|
121 |
+
2024-03-07 08:56:19,157 INFO [streaming_decode.py:615] Cuts processed until now is 9700.
|
122 |
+
2024-03-07 08:56:20,036 INFO [streaming_decode.py:615] Cuts processed until now is 9800.
|
123 |
+
2024-03-07 08:56:20,439 INFO [streaming_decode.py:615] Cuts processed until now is 9900.
|
124 |
+
2024-03-07 08:56:21,347 INFO [streaming_decode.py:615] Cuts processed until now is 10000.
|
125 |
+
2024-03-07 08:56:22,519 INFO [streaming_decode.py:615] Cuts processed until now is 10100.
|
126 |
+
2024-03-07 08:56:23,219 INFO [streaming_decode.py:615] Cuts processed until now is 10200.
|
127 |
+
2024-03-07 08:56:24,092 INFO [streaming_decode.py:615] Cuts processed until now is 10300.
|
128 |
+
2024-03-07 08:56:24,984 INFO [streaming_decode.py:615] Cuts processed until now is 10400.
|
129 |
+
2024-03-07 08:56:25,673 INFO [streaming_decode.py:615] Cuts processed until now is 10500.
|
130 |
+
2024-03-07 08:56:26,535 INFO [streaming_decode.py:615] Cuts processed until now is 10600.
|
131 |
+
2024-03-07 08:56:27,422 INFO [streaming_decode.py:615] Cuts processed until now is 10700.
|
132 |
+
2024-03-07 08:56:28,110 INFO [streaming_decode.py:615] Cuts processed until now is 10800.
|
133 |
+
2024-03-07 08:56:28,983 INFO [streaming_decode.py:615] Cuts processed until now is 10900.
|
134 |
+
2024-03-07 08:56:29,887 INFO [streaming_decode.py:615] Cuts processed until now is 11000.
|
135 |
+
2024-03-07 08:56:30,582 INFO [streaming_decode.py:615] Cuts processed until now is 11100.
|
136 |
+
2024-03-07 08:56:31,468 INFO [streaming_decode.py:615] Cuts processed until now is 11200.
|
137 |
+
2024-03-07 08:56:31,870 INFO [streaming_decode.py:615] Cuts processed until now is 11300.
|
138 |
+
2024-03-07 08:56:33,076 INFO [streaming_decode.py:615] Cuts processed until now is 11400.
|
139 |
+
2024-03-07 08:56:33,986 INFO [streaming_decode.py:615] Cuts processed until now is 11500.
|
140 |
+
2024-03-07 08:56:34,684 INFO [streaming_decode.py:615] Cuts processed until now is 11600.
|
141 |
+
2024-03-07 08:56:35,077 INFO [streaming_decode.py:615] Cuts processed until now is 11700.
|
142 |
+
2024-03-07 08:56:35,970 INFO [streaming_decode.py:615] Cuts processed until now is 11800.
|
143 |
+
2024-03-07 08:56:36,873 INFO [streaming_decode.py:615] Cuts processed until now is 11900.
|
144 |
+
2024-03-07 08:56:37,568 INFO [streaming_decode.py:615] Cuts processed until now is 12000.
|
145 |
+
2024-03-07 08:56:38,745 INFO [streaming_decode.py:615] Cuts processed until now is 12100.
|
146 |
+
2024-03-07 08:56:39,618 INFO [streaming_decode.py:615] Cuts processed until now is 12200.
|
147 |
+
2024-03-07 08:56:42,452 INFO [streaming_decode.py:660] The transcripts are stored in zipformer/exp-causal/streaming/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
|
148 |
+
2024-03-07 08:56:42,883 INFO [utils.py:656] [test-commonvoice-greedy_search] %WER 7.75% [48447 / 624874, 10530 ins, 21899 del, 16018 sub ]
|
149 |
+
2024-03-07 08:56:43,788 INFO [streaming_decode.py:671] Wrote detailed error stats to zipformer/exp-causal/streaming/greedy_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
|
150 |
+
2024-03-07 08:56:43,788 INFO [streaming_decode.py:685]
|
151 |
+
For test-commonvoice, WER of different settings are:
|
152 |
+
greedy_search 7.75 best for test-commonvoice
|
153 |
+
|
154 |
+
2024-03-07 08:56:43,788 INFO [streaming_decode.py:853] Done!
|
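The final WER line above is self-consistent and can be sanity-checked: word error rate is (insertions + deletions + substitutions) divided by the number of reference words. A minimal sketch, using the counts taken directly from the `utils.py:656` log line:

```python
# Counts copied from the log line:
# %WER 7.75% [48447 / 624874, 10530 ins, 21899 del, 16018 sub]
ins, dels, subs = 10530, 21899, 16018
ref_words = 624874

errors = ins + dels + subs          # total edit operations
wer = 100.0 * errors / ref_words    # WER as a percentage

print(f"{errors} errors, WER = {wer:.2f}%")  # 48447 errors, WER = 7.75%
```

This confirms the 48447-error count and the reported 7.75% for the greedy_search streaming decode.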
exp-causal/streaming/greedy_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
ADDED