zipformer-streaming-robust-sw / exp-causal /fast_beam_search /log-decode-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model-2024-03-07-08-39-05
w11wo · Added Model · d4ce303
2024-03-07 08:39:05,727 INFO [decode.py:764] Decoding started
2024-03-07 08:39:05,727 INFO [decode.py:770] Device: cuda:0
2024-03-07 08:39:05,727 INFO [lexicon.py:168] Loading pre-compiled data/lang_phone/Linv.pt
2024-03-07 08:39:05,728 INFO [decode.py:778] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.4', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f6919c0ddb311bea7b53a50f3afdcb3c18b8ccc8', 'k2-git-date': 'Sat Feb 10 09:23:09 2024', 'lhotse-version': '1.22.0.dev+git.9355bd72.clean', 'torch-version': '2.0.0+cu118', 'torch-cuda-available': True, 'torch-cuda-version': '11.8', 'python-version': '3.1', 'icefall-git-branch': 'master', 'icefall-git-sha1': 'b35406b0-clean', 'icefall-git-date': 'Thu Mar 7 06:20:34 2024', 'icefall-path': '/root/icefall', 'k2-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/k2/__init__.py', 'lhotse-path': '/root/miniconda3/envs/icefall/lib/python3.10/site-packages/lhotse/__init__.py', 'hostname': 'bookbot-h100', 'IP address': '127.0.0.1'}, 'epoch': 40, 'iter': 0, 'avg': 7, 'use_averaged_model': True, 'exp_dir': PosixPath('zipformer/exp-causal'), 'lang_dir': PosixPath('data/lang_phone'), 'decoding_method': 'fast_beam_search', 'beam_size': 4, 'beam': 20.0, 'ngram_lm_scale': 0.01, 'max_contexts': 8, 'max_states': 64, 'context_size': 2, 'max_sym_per_frame': 1, 'num_paths': 200, 'nbest_scale': 0.5, 'use_shallow_fusion': False, 'lm_type': 'rnn', 'lm_scale': 0.3, 'tokens_ngram': 3, 'backoff_id': 500, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': True, 'chunk_size': '32', 'left_context_frames': '128', 'use_transducer': True, 'use_ctc': True, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'lm_vocab_size': 500, 'lm_epoch': 7, 'lm_avg': 1, 'lm_exp_dir': None, 'rnn_lm_embedding_dim': 2048, 'rnn_lm_hidden_dim': 2048, 'rnn_lm_num_layers': 3, 'rnn_lm_tie_weights': True, 'transformer_lm_exp_dir': None, 'transformer_lm_dim_feedforward': 2048, 'transformer_lm_encoder_dim': 768, 'transformer_lm_embedding_dim': 768, 'transformer_lm_nhead': 8, 'transformer_lm_num_layers': 16, 'transformer_lm_tie_weights': True, 'res_dir': PosixPath('zipformer/exp-causal/fast_beam_search'), 'suffix': 'epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model', 'blank_id': 0, 'unk_id': 19, 'vocab_size': 38}
2024-03-07 08:39:05,728 INFO [decode.py:780] About to create model
2024-03-07 08:39:05,976 INFO [decode.py:847] Calculating the averaged model over epoch range from 33 (excluded) to 40
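[Editor's note, not part of the log] The line above corresponds to `--epoch 40 --avg 7 --use-averaged-model`: the decoded model is an average over the epoch span (33, 40], i.e. 7 epochs. A minimal sketch of plain checkpoint averaging is below; icefall's `--use-averaged-model` variant actually combines the running `model_avg` states saved during training rather than averaging per-epoch checkpoints directly, and the helper name here is hypothetical:

```python
import torch


def average_state_dicts(state_dicts):
    """Element-wise mean of a list of model state_dicts (hypothetical helper)."""
    avg = {k: v.clone().float() for k, v in state_dicts[0].items()}
    for sd in state_dicts[1:]:
        for k in avg:
            avg[k] += sd[k].float()
    for k in avg:
        avg[k] /= len(state_dicts)
    return avg


# Toy demo: averaging two one-parameter "checkpoints".
ckpts = [{"w": torch.tensor([1.0, 2.0])}, {"w": torch.tensor([3.0, 4.0])}]
print(average_state_dicts(ckpts)["w"])  # tensor([2., 3.])
```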
2024-03-07 08:39:06,839 INFO [decode.py:908] Number of model parameters: 65182863
2024-03-07 08:39:06,839 INFO [multidataset.py:81] About to get FLEURS test cuts
2024-03-07 08:39:06,839 INFO [multidataset.py:83] Loading FLEURS in lazy mode
2024-03-07 08:39:06,839 INFO [multidataset.py:90] About to get Common Voice test cuts
2024-03-07 08:39:06,839 INFO [multidataset.py:92] Loading Common Voice in lazy mode
2024-03-07 08:39:07,885 INFO [decode.py:651] batch 0/?, cuts processed until now is 11
2024-03-07 08:39:09,245 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.2843, 3.2412, 2.7568, 2.6566], device='cuda:0')
2024-03-07 08:39:17,182 INFO [decode.py:651] batch 20/?, cuts processed until now is 270
2024-03-07 08:39:26,436 INFO [decode.py:651] batch 40/?, cuts processed until now is 487
2024-03-07 08:39:26,499 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/fast_beam_search/recogs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:39:26,539 INFO [utils.py:656] [test-fleurs-beam_20.0_max_contexts_8_max_states_64] %WER 6.61% [4072 / 61587, 1548 ins, 1229 del, 1295 sub ]
2024-03-07 08:39:26,632 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/fast_beam_search/errs-test-fleurs-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:39:26,633 INFO [decode.py:690]
For test-fleurs, WER of different settings are:
beam_20.0_max_contexts_8_max_states_64 6.61 best for test-fleurs
2024-03-07 08:39:27,522 INFO [decode.py:651] batch 0/?, cuts processed until now is 28
2024-03-07 08:39:31,747 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.3396, 2.5719, 2.6157, 2.8669], device='cuda:0')
2024-03-07 08:39:32,952 INFO [decode.py:651] batch 20/?, cuts processed until now is 628
2024-03-07 08:39:38,196 INFO [decode.py:651] batch 40/?, cuts processed until now is 1253
2024-03-07 08:39:43,193 INFO [decode.py:651] batch 60/?, cuts processed until now is 1940
2024-03-07 08:39:45,893 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1773, 2.6102, 2.7401, 2.4829], device='cuda:0')
2024-03-07 08:39:46,485 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.7060, 2.8868, 2.8695, 2.0508], device='cuda:0')
2024-03-07 08:39:48,212 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.4578, 4.5392, 4.1744, 4.1141], device='cuda:0')
2024-03-07 08:39:48,733 INFO [decode.py:651] batch 80/?, cuts processed until now is 2513
2024-03-07 08:39:49,794 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.1706, 2.6006, 2.7836, 2.5051], device='cuda:0')
2024-03-07 08:39:51,031 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.5083, 4.5611, 4.1912, 4.1362], device='cuda:0')
2024-03-07 08:39:52,617 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.1998, 2.3700, 2.7924, 2.7577], device='cuda:0')
2024-03-07 08:39:53,711 INFO [decode.py:651] batch 100/?, cuts processed until now is 3210
2024-03-07 08:39:59,086 INFO [decode.py:651] batch 120/?, cuts processed until now is 3814
2024-03-07 08:40:04,040 INFO [decode.py:651] batch 140/?, cuts processed until now is 4529
2024-03-07 08:40:08,988 INFO [decode.py:651] batch 160/?, cuts processed until now is 5256
2024-03-07 08:40:09,916 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([4.5227, 4.0025, 4.1148, 4.3108], device='cuda:0')
2024-03-07 08:40:11,126 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4430, 1.6413, 1.6686, 1.8403, 1.5950, 1.9201, 2.0724, 1.6969],
device='cuda:0')
2024-03-07 08:40:14,084 INFO [decode.py:651] batch 180/?, cuts processed until now is 5927
2024-03-07 08:40:18,458 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.8823, 3.2936, 3.0880, 3.2756], device='cuda:0')
2024-03-07 08:40:19,214 INFO [decode.py:651] batch 200/?, cuts processed until now is 6582
2024-03-07 08:40:22,724 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.5198, 1.6440, 1.5930, 1.8840, 1.5783, 1.8650, 2.0609, 1.6564],
device='cuda:0')
2024-03-07 08:40:24,453 INFO [decode.py:651] batch 220/?, cuts processed until now is 7221
2024-03-07 08:40:29,626 INFO [decode.py:651] batch 240/?, cuts processed until now is 7878
2024-03-07 08:40:33,319 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([2.3668, 2.7808, 2.7429, 2.0601], device='cuda:0')
2024-03-07 08:40:34,832 INFO [decode.py:651] batch 260/?, cuts processed until now is 8528
2024-03-07 08:40:39,681 INFO [decode.py:651] batch 280/?, cuts processed until now is 9263
2024-03-07 08:40:44,602 INFO [decode.py:651] batch 300/?, cuts processed until now is 9972
2024-03-07 08:40:49,952 INFO [decode.py:651] batch 320/?, cuts processed until now is 10574
2024-03-07 08:40:51,376 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([3.0097, 2.9823, 2.5350, 2.4735], device='cuda:0')
2024-03-07 08:40:54,529 INFO [zipformer.py:1858] name=None, attn_weights_entropy = tensor([1.4511, 1.6635, 1.4021, 1.3549, 1.6671, 1.8209, 1.7242, 1.7077],
device='cuda:0')
2024-03-07 08:40:54,975 INFO [decode.py:651] batch 340/?, cuts processed until now is 11255
2024-03-07 08:41:00,108 INFO [decode.py:651] batch 360/?, cuts processed until now is 11900
2024-03-07 08:41:03,743 INFO [decode.py:665] The transcripts are stored in zipformer/exp-causal/fast_beam_search/recogs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:41:04,136 INFO [utils.py:656] [test-commonvoice-beam_20.0_max_contexts_8_max_states_64] %WER 7.73% [48322 / 624874, 9678 ins, 23552 del, 15092 sub ]
2024-03-07 08:41:04,996 INFO [decode.py:676] Wrote detailed error stats to zipformer/exp-causal/fast_beam_search/errs-test-commonvoice-epoch-40-avg-7-chunk-32-left-context-128-beam-20.0-max-contexts-8-max-states-64-use-averaged-model.txt
2024-03-07 08:41:04,997 INFO [decode.py:690]
For test-commonvoice, WER of different settings are:
beam_20.0_max_contexts_8_max_states_64 7.73 best for test-commonvoice
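[Editor's note, not part of the log] The two `%WER` summary lines above follow the standard definition WER = (insertions + deletions + substitutions) / reference words. A small sketch reproducing both figures from the counts logged by `utils.py`:

```python
def wer(ins: int, dels: int, subs: int, ref_words: int) -> float:
    """Word error rate as a percentage: (ins + del + sub) / reference words."""
    return 100.0 * (ins + dels + subs) / ref_words


# Counts copied verbatim from the two summary lines in this log.
# test-fleurs: 4072 errors over 61587 reference words.
print(f"test-fleurs WER: {wer(1548, 1229, 1295, 61587):.2f}%")          # 6.61%
# test-commonvoice: 48322 errors over 624874 reference words.
print(f"test-commonvoice WER: {wer(9678, 23552, 15092, 624874):.2f}%")  # 7.73%
```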
2024-03-07 08:41:04,997 INFO [decode.py:944] Done!