zipformer-streaming-robust-sw/exp-causal/ctc-decoding/log-decode-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model-2024-03-07-10-59-38
2024-03-07 10:59:38,354 INFO [ctc_decode.py:621] Decoding started
2024-03-07 10:59:38,354 INFO [ctc_decode.py:627] Device: cuda:0
2024-03-07 10:59:38,354 INFO [ctc_decode.py:628] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-dirty', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'frame_shift_ms': 10, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'context_size': 2, 'decoding_method': 'ctc-decoding', 'num_paths': 100, 'nbest_scale': 1.0, 'hlg_scale': 0.6, 'lm_dir': PosixPath('data/lm'), 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 
'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'res_dir': PosixPath('zipformer/exp-causal/ctc-decoding'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model'}
2024-03-07 10:59:38,355 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 10:59:38,522 INFO [ctc_decode.py:701] About to create model
2024-03-07 10:59:38,744 INFO [ctc_decode.py:756] Calculating the averaged model over epoch range from 33 (excluded) to 40
2024-03-07 10:59:39,391 INFO [ctc_decode.py:772] Number of model parameters: 65182863
2024-03-07 10:59:39,392 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 10:59:39,392 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 10:59:39,392 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 10:59:39,392 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 10:59:39,992 INFO [ctc_decode.py:542] batch 0/?, cuts processed until now is 11
2024-03-07 10:59:44,584 INFO [ctc_decode.py:556] The transcripts are stored in zipformer/exp-causal/ctc-decoding/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 10:59:44,625 INFO [utils.py:656] [test-fleurs-ctc-decoding] %WER 6.72% [4137 / 61587, 1757 ins, 1036 del, 1344 sub ]
2024-03-07 10:59:44,719 INFO [ctc_decode.py:565] Wrote detailed error stats to zipformer/exp-causal/ctc-decoding/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 10:59:44,719 INFO [ctc_decode.py:579]
For test-fleurs, WER of different settings are:
ctc-decoding 6.72 best for test-fleurs
2024-03-07 10:59:45,379 INFO [ctc_decode.py:542] batch 0/?, cuts processed until now is 28
2024-03-07 10:59:52,644 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4930, 1.6467, 1.4691, 1.4399, 1.6736, 1.6094, 1.5041, 1.7942],
device='cuda:0')
2024-03-07 10:59:53,068 INFO [ctc_decode.py:542] batch 100/?, cuts processed until now is 3210
2024-03-07 10:59:57,567 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.6248, 2.7026, 2.6528, 1.9603], device='cuda:0')
2024-03-07 10:59:58,159 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1807, 3.0239, 2.3377, 2.7052], device='cuda:0')
2024-03-07 10:59:58,511 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.7430, 1.7438, 1.6404, 1.8630, 1.9495, 1.9508, 2.0193, 1.6035],
device='cuda:0')
2024-03-07 11:00:00,641 INFO [ctc_decode.py:542] batch 200/?, cuts processed until now is 6582
2024-03-07 11:00:08,194 INFO [ctc_decode.py:542] batch 300/?, cuts processed until now is 9972
2024-03-07 11:00:13,903 INFO [ctc_decode.py:556] The transcripts are stored in zipformer/exp-causal/ctc-decoding/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 11:00:14,287 INFO [utils.py:656] [test-commonvoice-ctc-decoding] %WER 7.78% [48615 / 624874, 12396 ins, 20104 del, 16115 sub ]
2024-03-07 11:00:15,129 INFO [ctc_decode.py:565] Wrote detailed error stats to zipformer/exp-causal/ctc-decoding/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-use-averaged-model.txt
2024-03-07 11:00:15,130 INFO [ctc_decode.py:579]
For test-commonvoice, WER of different settings are:
ctc-decoding 7.78 best for test-commonvoice
2024-03-07 11:00:15,130 INFO [ctc_decode.py:806] Done!