2023-11-01 01:01:00,501 INFO [train.py:851] Training started
2023-11-01 01:01:00,502 INFO [train.py:870] Device: cuda:0
2023-11-01 01:01:00,503 INFO [train.py:871] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 100, 'reset_interval': 200, 'valid_interval': 10000, 'world_size': 1, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 20, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('exp/valle_dev'), 'optimizer_name': 'ScaledAdam', 'scheduler_name': 'Eden', 'base_lr': 0.005, 'warmup_steps': 200, 'seed': 42, 'inf_check': False, 'save_every_n': 10000, 'keep_last_k': 20, 'average_period': 0, 'accumulate_grad_steps': 1, 'dtype': 'float16', 'filter_min_duration': 0.0, 'filter_max_duration': 20.0, 'train_stage': 0, 'visualize': False, 'oom_check': True, 'train_dir': '/home/ubuntu/VALL-E-X/JS_Dataset/JS_Dataset/train_tune', 'valid_dir': '/home/ubuntu/VALL-E-X/JS_Dataset/JS_Dataset/valid_tune', 'model_name': 'VALL-E', 'decoder_dim': 1024, 'nhead': 16, 'num_decoder_layers': 12, 'scale_factor': 1.0, 'norm_first': True, 'add_prenet': False, 'prefix_mode': 0, 'share_embedding': True, 'prepend_bos': False, 'num_quantizers': 8, 'scaling_xformers': False}
2023-11-01 01:01:00,503 INFO [train.py:873] About to create model
2023-11-01 01:01:05,108 INFO [train.py:877] Number of model parameters: 370539524
2023-11-01 01:01:05,334 DEBUG [__init__.py:113] Building prefix dict from the default dictionary ...
2023-11-01 01:01:05,334 DEBUG [__init__.py:132] Loading model from cache /tmp/jieba.cache
2023-11-01 01:01:05,847 DEBUG [__init__.py:164] Loading model cost 0.513 seconds.
2023-11-01 01:01:05,847 DEBUG [__init__.py:166] Prefix dict has been built successfully.
2023-11-01 01:01:28,103 INFO [train.py:764] Epoch 1, batch 100, train_loss[loss=3.201, ArTop10Accuracy=0.7404, NarTop10Accuracy=0.6034, over 1306.00 frames. ], tot_loss[loss=3.398, ArTop10Accuracy=0.7056, NarTop10Accuracy=0.5532, over 476.97 frames. ], batch size: 3, lr: 3.75e-03, grad_scale: 1.0
2023-11-01 01:01:49,616 INFO [train.py:764] Epoch 1, batch 200, train_loss[loss=3.527, ArTop10Accuracy=0.6953, NarTop10Accuracy=0.5522, over 1234.00 frames. ], tot_loss[loss=3.408, ArTop10Accuracy=0.709, NarTop10Accuracy=0.552, over 749.45 frames. ], batch size: 3, lr: 5.00e-03, grad_scale: 1.0
2023-11-01 01:02:11,657 INFO [train.py:764] Epoch 1, batch 300, train_loss[loss=3.605, ArTop10Accuracy=0.7095, NarTop10Accuracy=0.4503, over 995.00 frames. ], tot_loss[loss=3.443, ArTop10Accuracy=0.7106, NarTop10Accuracy=0.5387, over 935.67 frames. ], batch size: 2, lr: 5.00e-03, grad_scale: 1.0
2023-11-01 01:02:33,731 INFO [train.py:764] Epoch 1, batch 400, train_loss[loss=3.412, ArTop10Accuracy=0.6864, NarTop10Accuracy=0.5319, over 1234.00 frames. ], tot_loss[loss=3.462, ArTop10Accuracy=0.7132, NarTop10Accuracy=0.5284, over 1040.70 frames. ], batch size: 3, lr: 4.99e-03, grad_scale: 2.0
2023-11-01 01:02:55,462 INFO [train.py:764] Epoch 1, batch 500, train_loss[loss=3.292, ArTop10Accuracy=0.727, NarTop10Accuracy=0.5829, over 1271.00 frames. ], tot_loss[loss=3.483, ArTop10Accuracy=0.7154, NarTop10Accuracy=0.5211, over 1094.80 frames. ], batch size: 3, lr: 4.99e-03, grad_scale: 2.0
2023-11-01 01:03:17,499 INFO [train.py:764] Epoch 1, batch 600, train_loss[loss=3.666, ArTop10Accuracy=0.7166, NarTop10Accuracy=0.4386, over 1496.00 frames. ], tot_loss[loss=3.46, ArTop10Accuracy=0.719, NarTop10Accuracy=0.5268, over 1141.86 frames. ], batch size: 3, lr: 4.98e-03, grad_scale: 2.0
2023-11-01 01:03:39,337 INFO [train.py:764] Epoch 1, batch 700, train_loss[loss=3.255, ArTop10Accuracy=0.6812, NarTop10Accuracy=0.6464, over 1010.00 frames. ], tot_loss[loss=3.468, ArTop10Accuracy=0.7186, NarTop10Accuracy=0.527, over 1167.56 frames. ], batch size: 2, lr: 4.98e-03, grad_scale: 2.0
2023-11-01 01:04:01,250 INFO [train.py:764] Epoch 1, batch 800, train_loss[loss=3.898, ArTop10Accuracy=0.6695, NarTop10Accuracy=0.4247, over 947.00 frames. ], tot_loss[loss=3.467, ArTop10Accuracy=0.7206, NarTop10Accuracy=0.527, over 1184.08 frames. ], batch size: 2, lr: 4.97e-03, grad_scale: 4.0
2023-11-01 01:04:23,115 INFO [train.py:764] Epoch 1, batch 900, train_loss[loss=3.424, ArTop10Accuracy=0.7506, NarTop10Accuracy=0.5276, over 1195.00 frames. ], tot_loss[loss=3.461, ArTop10Accuracy=0.7221, NarTop10Accuracy=0.5286, over 1188.23 frames. ], batch size: 3, lr: 4.96e-03, grad_scale: 4.0
2023-11-01 01:04:45,007 INFO [train.py:764] Epoch 1, batch 1000, train_loss[loss=3.606, ArTop10Accuracy=0.7147, NarTop10Accuracy=0.4905, over 1339.00 frames. ], tot_loss[loss=3.458, ArTop10Accuracy=0.7239, NarTop10Accuracy=0.5285, over 1190.57 frames. ], batch size: 3, lr: 4.95e-03, grad_scale: 4.0
2023-11-01 01:04:45,173 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 3.165e+01 4.527e+01 4.998e+01 5.591e+01 9.306e+01, threshold=9.997e+01, percent-clipped=0.0
2023-11-01 01:05:07,092 INFO [train.py:764] Epoch 1, batch 1100, train_loss[loss=3.756, ArTop10Accuracy=0.7031, NarTop10Accuracy=0.4571, over 1007.00 frames. ], tot_loss[loss=3.465, ArTop10Accuracy=0.7242, NarTop10Accuracy=0.5252, over 1194.69 frames. ], batch size: 2, lr: 4.94e-03, grad_scale: 4.0
2023-11-01 01:05:29,224 INFO [train.py:764] Epoch 1, batch 1200, train_loss[loss=3.244, ArTop10Accuracy=0.7549, NarTop10Accuracy=0.5967, over 1228.00 frames. ], tot_loss[loss=3.456, ArTop10Accuracy=0.7256, NarTop10Accuracy=0.5275, over 1198.98 frames. ], batch size: 3, lr: 4.93e-03, grad_scale: 8.0
2023-11-01 01:05:51,281 INFO [train.py:764] Epoch 1, batch 1300, train_loss[loss=3.368, ArTop10Accuracy=0.7277, NarTop10Accuracy=0.5906, over 1054.00 frames. ], tot_loss[loss=3.464, ArTop10Accuracy=0.7263, NarTop10Accuracy=0.526, over 1202.37 frames. ], batch size: 2, lr: 4.92e-03, grad_scale: 8.0
2023-11-01 01:06:13,118 INFO [train.py:764] Epoch 1, batch 1400, train_loss[loss=3.445, ArTop10Accuracy=0.7333, NarTop10Accuracy=0.5311, over 1301.00 frames. ], tot_loss[loss=3.474, ArTop10Accuracy=0.7277, NarTop10Accuracy=0.5206, over 1198.94 frames. ], batch size: 3, lr: 4.91e-03, grad_scale: 8.0
2023-11-01 01:06:35,108 INFO [train.py:764] Epoch 1, batch 1500, train_loss[loss=3.337, ArTop10Accuracy=0.754, NarTop10Accuracy=0.566, over 1236.00 frames. ], tot_loss[loss=3.46, ArTop10Accuracy=0.7282, NarTop10Accuracy=0.5245, over 1198.26 frames. ], batch size: 3, lr: 4.89e-03, grad_scale: 8.0
2023-11-01 01:06:57,122 INFO [train.py:764] Epoch 1, batch 1600, train_loss[loss=3.626, ArTop10Accuracy=0.7161, NarTop10Accuracy=0.4777, over 1201.00 frames. ], tot_loss[loss=3.464, ArTop10Accuracy=0.7262, NarTop10Accuracy=0.5244, over 1204.65 frames. ], batch size: 3, lr: 4.88e-03, grad_scale: 8.0
2023-11-01 01:07:19,077 INFO [train.py:764] Epoch 1, batch 1700, train_loss[loss=3.4, ArTop10Accuracy=0.7569, NarTop10Accuracy=0.499, over 650.00 frames. ], tot_loss[loss=3.46, ArTop10Accuracy=0.7276, NarTop10Accuracy=0.5238, over 1202.22 frames. ], batch size: 1, lr: 4.87e-03, grad_scale: 8.0
2023-11-01 01:07:40,966 INFO [train.py:764] Epoch 1, batch 1800, train_loss[loss=3.575, ArTop10Accuracy=0.7287, NarTop10Accuracy=0.4463, over 1253.00 frames. ], tot_loss[loss=3.46, ArTop10Accuracy=0.7279, NarTop10Accuracy=0.5244, over 1198.05 frames. ], batch size: 3, lr: 4.85e-03, grad_scale: 8.0
2023-11-01 01:08:02,894 INFO [train.py:764] Epoch 1, batch 1900, train_loss[loss=3.345, ArTop10Accuracy=0.8437, NarTop10Accuracy=0.5034, over 819.00 frames. ], tot_loss[loss=3.453, ArTop10Accuracy=0.7277, NarTop10Accuracy=0.5275, over 1198.59 frames. ], batch size: 1, lr: 4.83e-03, grad_scale: 8.0
2023-11-01 01:08:24,667 INFO [train.py:764] Epoch 1, batch 2000, train_loss[loss=3.321, ArTop10Accuracy=0.7224, NarTop10Accuracy=0.5903, over 1286.00 frames. ], tot_loss[loss=3.442, ArTop10Accuracy=0.7301, NarTop10Accuracy=0.53, over 1193.18 frames. ], batch size: 3, lr: 4.82e-03, grad_scale: 16.0
2023-11-01 01:08:24,840 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.935e+01 4.004e+01 4.278e+01 4.591e+01 1.283e+02, threshold=8.555e+01, percent-clipped=0.1
2023-11-01 01:08:46,652 INFO [train.py:764] Epoch 1, batch 2100, train_loss[loss=3.536, ArTop10Accuracy=0.7298, NarTop10Accuracy=0.5135, over 1114.00 frames. ], tot_loss[loss=3.459, ArTop10Accuracy=0.7287, NarTop10Accuracy=0.5254, over 1201.39 frames. ], batch size: 2, lr: 4.80e-03, grad_scale: 16.0
2023-11-01 01:09:08,857 INFO [train.py:764] Epoch 1, batch 2200, train_loss[loss=3.196, ArTop10Accuracy=0.8063, NarTop10Accuracy=0.545, over 1084.00 frames. ], tot_loss[loss=3.452, ArTop10Accuracy=0.73, NarTop10Accuracy=0.5287, over 1212.86 frames. ], batch size: 2, lr: 4.78e-03, grad_scale: 16.0
2023-11-01 01:09:30,803 INFO [train.py:764] Epoch 1, batch 2300, train_loss[loss=3.648, ArTop10Accuracy=0.7415, NarTop10Accuracy=0.4108, over 1478.00 frames. ], tot_loss[loss=3.457, ArTop10Accuracy=0.7292, NarTop10Accuracy=0.5254, over 1212.39 frames. ], batch size: 3, lr: 4.77e-03, grad_scale: 16.0
2023-11-01 01:09:52,725 INFO [train.py:764] Epoch 1, batch 2400, train_loss[loss=3.594, ArTop10Accuracy=0.7306, NarTop10Accuracy=0.4828, over 1325.00 frames. ], tot_loss[loss=3.43, ArTop10Accuracy=0.7311, NarTop10Accuracy=0.5345, over 1209.83 frames. ], batch size: 3, lr: 4.75e-03, grad_scale: 16.0
2023-11-01 01:10:14,502 INFO [train.py:764] Epoch 1, batch 2500, train_loss[loss=3.158, ArTop10Accuracy=0.7867, NarTop10Accuracy=0.6099, over 1280.00 frames. ], tot_loss[loss=3.419, ArTop10Accuracy=0.7338, NarTop10Accuracy=0.5354, over 1200.35 frames. ], batch size: 3, lr: 4.73e-03, grad_scale: 16.0
2023-11-01 01:10:36,448 INFO [train.py:764] Epoch 1, batch 2600, train_loss[loss=3.217, ArTop10Accuracy=0.7426, NarTop10Accuracy=0.6008, over 1321.00 frames. ], tot_loss[loss=3.417, ArTop10Accuracy=0.7343, NarTop10Accuracy=0.5362, over 1203.85 frames. ], batch size: 3, lr: 4.71e-03, grad_scale: 16.0
2023-11-01 01:10:58,352 INFO [train.py:764] Epoch 1, batch 2700, train_loss[loss=3.172, ArTop10Accuracy=0.748, NarTop10Accuracy=0.624, over 1480.00 frames. ], tot_loss[loss=3.406, ArTop10Accuracy=0.7351, NarTop10Accuracy=0.5391, over 1201.18 frames. ], batch size: 3, lr: 4.69e-03, grad_scale: 16.0
2023-11-01 01:11:20,304 INFO [train.py:764] Epoch 1, batch 2800, train_loss[loss=3.185, ArTop10Accuracy=0.734, NarTop10Accuracy=0.6313, over 1297.00 frames. ], tot_loss[loss=3.409, ArTop10Accuracy=0.7356, NarTop10Accuracy=0.5372, over 1201.74 frames. ], batch size: 3, lr: 4.67e-03, grad_scale: 16.0
2023-11-01 01:11:42,304 INFO [train.py:764] Epoch 1, batch 2900, train_loss[loss=3.039, ArTop10Accuracy=0.7433, NarTop10Accuracy=0.6679, over 1387.00 frames. ], tot_loss[loss=3.421, ArTop10Accuracy=0.7359, NarTop10Accuracy=0.5336, over 1199.06 frames. ], batch size: 3, lr: 4.65e-03, grad_scale: 16.0
2023-11-01 01:12:04,208 INFO [train.py:764] Epoch 1, batch 3000, train_loss[loss=3.229, ArTop10Accuracy=0.7407, NarTop10Accuracy=0.6181, over 1261.00 frames. ], tot_loss[loss=3.428, ArTop10Accuracy=0.7354, NarTop10Accuracy=0.5309, over 1193.63 frames. ], batch size: 3, lr: 4.63e-03, grad_scale: 16.0
2023-11-01 01:12:04,384 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.561e+01 3.787e+01 4.032e+01 4.340e+01 1.223e+02, threshold=8.065e+01, percent-clipped=0.1
2023-11-01 01:12:26,564 INFO [train.py:764] Epoch 1, batch 3100, train_loss[loss=3.249, ArTop10Accuracy=0.7327, NarTop10Accuracy=0.6701, over 1070.00 frames. ], tot_loss[loss=3.42, ArTop10Accuracy=0.7352, NarTop10Accuracy=0.5346, over 1195.52 frames. ], batch size: 2, lr: 4.61e-03, grad_scale: 16.0
2023-11-01 01:12:48,991 INFO [train.py:764] Epoch 1, batch 3200, train_loss[loss=3.381, ArTop10Accuracy=0.7176, NarTop10Accuracy=0.5645, over 1289.00 frames. ], tot_loss[loss=3.418, ArTop10Accuracy=0.7361, NarTop10Accuracy=0.5337, over 1210.89 frames. ], batch size: 3, lr: 4.59e-03, grad_scale: 16.0
2023-11-01 01:13:10,935 INFO [train.py:764] Epoch 1, batch 3300, train_loss[loss=3.741, ArTop10Accuracy=0.7543, NarTop10Accuracy=0.3948, over 1331.00 frames. ], tot_loss[loss=3.427, ArTop10Accuracy=0.7365, NarTop10Accuracy=0.5314, over 1199.90 frames. ], batch size: 3, lr: 4.57e-03, grad_scale: 16.0
2023-11-01 01:13:32,934 INFO [train.py:764] Epoch 1, batch 3400, train_loss[loss=3.206, ArTop10Accuracy=0.7596, NarTop10Accuracy=0.6492, over 1223.00 frames. ], tot_loss[loss=3.425, ArTop10Accuracy=0.7375, NarTop10Accuracy=0.5304, over 1204.36 frames. ], batch size: 3, lr: 4.55e-03, grad_scale: 16.0
2023-11-01 01:13:54,893 INFO [train.py:764] Epoch 1, batch 3500, train_loss[loss=3.203, ArTop10Accuracy=0.7498, NarTop10Accuracy=0.6391, over 1315.00 frames. ], tot_loss[loss=3.417, ArTop10Accuracy=0.7378, NarTop10Accuracy=0.5328, over 1203.43 frames. ], batch size: 3, lr: 4.53e-03, grad_scale: 16.0
2023-11-01 01:14:16,952 INFO [train.py:764] Epoch 1, batch 3600, train_loss[loss=3.249, ArTop10Accuracy=0.7305, NarTop10Accuracy=0.596, over 1002.00 frames. ], tot_loss[loss=3.41, ArTop10Accuracy=0.738, NarTop10Accuracy=0.5372, over 1200.89 frames. ], batch size: 2, lr: 4.50e-03, grad_scale: 16.0
2023-11-01 01:14:38,878 INFO [train.py:764] Epoch 1, batch 3700, train_loss[loss=3.047, ArTop10Accuracy=0.7559, NarTop10Accuracy=0.6173, over 1270.00 frames. ], tot_loss[loss=3.404, ArTop10Accuracy=0.7377, NarTop10Accuracy=0.5375, over 1199.71 frames. ], batch size: 3, lr: 4.48e-03, grad_scale: 16.0
2023-11-01 01:15:00,764 INFO [train.py:764] Epoch 1, batch 3800, train_loss[loss=3.431, ArTop10Accuracy=0.7261, NarTop10Accuracy=0.5124, over 953.00 frames. ], tot_loss[loss=3.422, ArTop10Accuracy=0.737, NarTop10Accuracy=0.5299, over 1206.54 frames. ], batch size: 2, lr: 4.46e-03, grad_scale: 16.0
2023-11-01 01:15:22,762 INFO [train.py:764] Epoch 1, batch 3900, train_loss[loss=3.266, ArTop10Accuracy=0.766, NarTop10Accuracy=0.5619, over 1346.00 frames. ], tot_loss[loss=3.409, ArTop10Accuracy=0.738, NarTop10Accuracy=0.5353, over 1214.09 frames. ], batch size: 3, lr: 4.44e-03, grad_scale: 16.0
2023-11-01 01:15:44,640 INFO [train.py:764] Epoch 1, batch 4000, train_loss[loss=3.197, ArTop10Accuracy=0.7829, NarTop10Accuracy=0.5988, over 1336.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.7388, NarTop10Accuracy=0.539, over 1202.95 frames. ], batch size: 3, lr: 4.42e-03, grad_scale: 32.0
2023-11-01 01:15:44,805 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.733e+01 3.740e+01 3.969e+01 4.257e+01 1.029e+02, threshold=7.938e+01, percent-clipped=0.2
2023-11-01 01:16:06,416 INFO [train.py:764] Epoch 1, batch 4100, train_loss[loss=3.244, ArTop10Accuracy=0.7525, NarTop10Accuracy=0.5985, over 1277.00 frames. ], tot_loss[loss=3.4, ArTop10Accuracy=0.7391, NarTop10Accuracy=0.5378, over 1197.35 frames. ], batch size: 3, lr: 4.40e-03, grad_scale: 32.0
2023-11-01 01:16:28,445 INFO [train.py:764] Epoch 1, batch 4200, train_loss[loss=3.324, ArTop10Accuracy=0.7329, NarTop10Accuracy=0.5902, over 1228.00 frames. ], tot_loss[loss=3.395, ArTop10Accuracy=0.7386, NarTop10Accuracy=0.5409, over 1204.03 frames. ], batch size: 3, lr: 4.38e-03, grad_scale: 32.0
2023-11-01 01:16:50,432 INFO [train.py:764] Epoch 1, batch 4300, train_loss[loss=3.314, ArTop10Accuracy=0.7644, NarTop10Accuracy=0.5612, over 1125.00 frames. ], tot_loss[loss=3.401, ArTop10Accuracy=0.7396, NarTop10Accuracy=0.5402, over 1201.24 frames. ], batch size: 1, lr: 4.35e-03, grad_scale: 8.0
2023-11-01 01:17:12,468 INFO [train.py:764] Epoch 1, batch 4400, train_loss[loss=2.998, ArTop10Accuracy=0.754, NarTop10Accuracy=0.7196, over 1297.00 frames. ], tot_loss[loss=3.392, ArTop10Accuracy=0.7426, NarTop10Accuracy=0.54, over 1202.84 frames. ], batch size: 3, lr: 4.33e-03, grad_scale: 8.0
2023-11-01 01:17:34,486 INFO [train.py:764] Epoch 1, batch 4500, train_loss[loss=3.158, ArTop10Accuracy=0.7451, NarTop10Accuracy=0.6344, over 1271.00 frames. ], tot_loss[loss=3.377, ArTop10Accuracy=0.7434, NarTop10Accuracy=0.5462, over 1206.43 frames. ], batch size: 3, lr: 4.31e-03, grad_scale: 8.0
2023-11-01 01:17:56,444 INFO [train.py:764] Epoch 1, batch 4600, train_loss[loss=3.468, ArTop10Accuracy=0.7347, NarTop10Accuracy=0.4962, over 980.00 frames. ], tot_loss[loss=3.379, ArTop10Accuracy=0.7433, NarTop10Accuracy=0.5449, over 1199.69 frames. ], batch size: 2, lr: 4.29e-03, grad_scale: 8.0
2023-11-01 01:18:18,361 INFO [train.py:764] Epoch 1, batch 4700, train_loss[loss=3.19, ArTop10Accuracy=0.7266, NarTop10Accuracy=0.6189, over 1280.00 frames. ], tot_loss[loss=3.379, ArTop10Accuracy=0.7423, NarTop10Accuracy=0.545, over 1200.61 frames. ], batch size: 3, lr: 4.27e-03, grad_scale: 8.0
2023-11-01 01:18:40,342 INFO [train.py:764] Epoch 1, batch 4800, train_loss[loss=3.732, ArTop10Accuracy=0.7355, NarTop10Accuracy=0.4267, over 1176.00 frames. ], tot_loss[loss=3.384, ArTop10Accuracy=0.7417, NarTop10Accuracy=0.5435, over 1204.83 frames. ], batch size: 2, lr: 4.25e-03, grad_scale: 8.0
2023-11-01 01:19:02,286 INFO [train.py:764] Epoch 1, batch 4900, train_loss[loss=3.203, ArTop10Accuracy=0.7443, NarTop10Accuracy=0.6233, over 962.00 frames. ], tot_loss[loss=3.396, ArTop10Accuracy=0.7404, NarTop10Accuracy=0.5386, over 1196.82 frames. ], batch size: 2, lr: 4.23e-03, grad_scale: 8.0
2023-11-01 01:19:24,468 INFO [train.py:764] Epoch 1, batch 5000, train_loss[loss=3.96, ArTop10Accuracy=0.6766, NarTop10Accuracy=0.3849, over 1002.00 frames. ], tot_loss[loss=3.4, ArTop10Accuracy=0.74, NarTop10Accuracy=0.5378, over 1198.57 frames. ], batch size: 2, lr: 4.20e-03, grad_scale: 8.0
2023-11-01 01:19:25,064 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.820e+01 3.661e+01 3.895e+01 4.176e+01 5.538e+02, threshold=7.791e+01, percent-clipped=0.3
2023-11-01 01:19:46,481 INFO [train.py:764] Epoch 1, batch 5100, train_loss[loss=3.419, ArTop10Accuracy=0.7219, NarTop10Accuracy=0.5399, over 1050.00 frames. ], tot_loss[loss=3.389, ArTop10Accuracy=0.7419, NarTop10Accuracy=0.5407, over 1202.12 frames. ], batch size: 2, lr: 4.18e-03, grad_scale: 8.0
2023-11-01 01:20:08,364 INFO [train.py:764] Epoch 1, batch 5200, train_loss[loss=3.099, ArTop10Accuracy=0.7485, NarTop10Accuracy=0.6468, over 1352.00 frames. ], tot_loss[loss=3.381, ArTop10Accuracy=0.7422, NarTop10Accuracy=0.5441, over 1199.90 frames. ], batch size: 3, lr: 4.16e-03, grad_scale: 8.0
2023-11-01 01:20:30,120 INFO [train.py:764] Epoch 1, batch 5300, train_loss[loss=3.2, ArTop10Accuracy=0.7527, NarTop10Accuracy=0.573, over 1217.00 frames. ], tot_loss[loss=3.398, ArTop10Accuracy=0.7423, NarTop10Accuracy=0.5382, over 1195.55 frames. ], batch size: 3, lr: 4.14e-03, grad_scale: 8.0
2023-11-01 01:20:52,015 INFO [train.py:764] Epoch 1, batch 5400, train_loss[loss=3.275, ArTop10Accuracy=0.7423, NarTop10Accuracy=0.5832, over 1300.00 frames. ], tot_loss[loss=3.384, ArTop10Accuracy=0.7429, NarTop10Accuracy=0.5434, over 1202.33 frames. ], batch size: 3, lr: 4.12e-03, grad_scale: 8.0
2023-11-01 01:21:13,989 INFO [train.py:764] Epoch 1, batch 5500, train_loss[loss=3.955, ArTop10Accuracy=0.6973, NarTop10Accuracy=0.361, over 1318.00 frames. ], tot_loss[loss=3.391, ArTop10Accuracy=0.7416, NarTop10Accuracy=0.5408, over 1204.76 frames. ], batch size: 3, lr: 4.10e-03, grad_scale: 8.0
2023-11-01 01:21:36,015 INFO [train.py:764] Epoch 1, batch 5600, train_loss[loss=3.215, ArTop10Accuracy=0.7528, NarTop10Accuracy=0.6052, over 1064.00 frames. ], tot_loss[loss=3.389, ArTop10Accuracy=0.7419, NarTop10Accuracy=0.5422, over 1209.03 frames. ], batch size: 2, lr: 4.08e-03, grad_scale: 8.0
2023-11-01 01:21:58,246 INFO [train.py:764] Epoch 1, batch 5700, train_loss[loss=3.757, ArTop10Accuracy=0.7044, NarTop10Accuracy=0.4133, over 1309.00 frames. ], tot_loss[loss=3.403, ArTop10Accuracy=0.7408, NarTop10Accuracy=0.5376, over 1206.83 frames. ], batch size: 3, lr: 4.06e-03, grad_scale: 8.0
2023-11-01 01:22:20,084 INFO [train.py:764] Epoch 1, batch 5800, train_loss[loss=3.222, ArTop10Accuracy=0.7902, NarTop10Accuracy=0.5901, over 1077.00 frames. ], tot_loss[loss=3.393, ArTop10Accuracy=0.7415, NarTop10Accuracy=0.5391, over 1198.69 frames. ], batch size: 2, lr: 4.04e-03, grad_scale: 8.0
2023-11-01 01:22:41,986 INFO [train.py:764] Epoch 1, batch 5900, train_loss[loss=3.256, ArTop10Accuracy=0.7596, NarTop10Accuracy=0.5688, over 1410.00 frames. ], tot_loss[loss=3.379, ArTop10Accuracy=0.7435, NarTop10Accuracy=0.5425, over 1195.35 frames. ], batch size: 2, lr: 4.02e-03, grad_scale: 8.0
2023-11-01 01:23:03,855 INFO [train.py:764] Epoch 1, batch 6000, train_loss[loss=3.04, ArTop10Accuracy=0.7555, NarTop10Accuracy=0.6494, over 1264.00 frames. ], tot_loss[loss=3.367, ArTop10Accuracy=0.7443, NarTop10Accuracy=0.5488, over 1191.87 frames. ], batch size: 3, lr: 4.00e-03, grad_scale: 8.0
2023-11-01 01:23:04,464 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.955e+01 3.660e+01 3.898e+01 4.175e+01 1.106e+02, threshold=7.796e+01, percent-clipped=0.2
2023-11-01 01:23:25,947 INFO [train.py:764] Epoch 1, batch 6100, train_loss[loss=3.244, ArTop10Accuracy=0.7569, NarTop10Accuracy=0.5706, over 1275.00 frames. ], tot_loss[loss=3.366, ArTop10Accuracy=0.7445, NarTop10Accuracy=0.5487, over 1200.31 frames. ], batch size: 3, lr: 3.98e-03, grad_scale: 8.0
2023-11-01 01:23:47,870 INFO [train.py:764] Epoch 1, batch 6200, train_loss[loss=3.64, ArTop10Accuracy=0.7432, NarTop10Accuracy=0.4895, over 1067.00 frames. ], tot_loss[loss=3.368, ArTop10Accuracy=0.7455, NarTop10Accuracy=0.5473, over 1197.29 frames. ], batch size: 2, lr: 3.96e-03, grad_scale: 8.0
2023-11-01 01:24:09,740 INFO [train.py:764] Epoch 1, batch 6300, train_loss[loss=2.884, ArTop10Accuracy=0.7787, NarTop10Accuracy=0.656, over 1229.00 frames. ], tot_loss[loss=3.358, ArTop10Accuracy=0.7462, NarTop10Accuracy=0.5488, over 1195.66 frames. ], batch size: 3, lr: 3.94e-03, grad_scale: 16.0
2023-11-01 01:24:31,832 INFO [train.py:764] Epoch 1, batch 6400, train_loss[loss=3.337, ArTop10Accuracy=0.7198, NarTop10Accuracy=0.6022, over 1342.00 frames. ], tot_loss[loss=3.363, ArTop10Accuracy=0.7452, NarTop10Accuracy=0.5489, over 1200.90 frames. ], batch size: 3, lr: 3.92e-03, grad_scale: 16.0
2023-11-01 01:24:53,712 INFO [train.py:764] Epoch 1, batch 6500, train_loss[loss=3.571, ArTop10Accuracy=0.7508, NarTop10Accuracy=0.4587, over 1324.00 frames. ], tot_loss[loss=3.364, ArTop10Accuracy=0.7447, NarTop10Accuracy=0.5472, over 1203.64 frames. ], batch size: 3, lr: 3.90e-03, grad_scale: 16.0
2023-11-01 01:25:15,706 INFO [train.py:764] Epoch 1, batch 6600, train_loss[loss=3.618, ArTop10Accuracy=0.7208, NarTop10Accuracy=0.4864, over 1157.00 frames. ], tot_loss[loss=3.375, ArTop10Accuracy=0.7441, NarTop10Accuracy=0.544, over 1199.80 frames. ], batch size: 2, lr: 3.89e-03, grad_scale: 16.0
2023-11-01 01:25:37,824 INFO [train.py:764] Epoch 1, batch 6700, train_loss[loss=3.916, ArTop10Accuracy=0.7235, NarTop10Accuracy=0.3773, over 1512.00 frames. ], tot_loss[loss=3.373, ArTop10Accuracy=0.7464, NarTop10Accuracy=0.5436, over 1206.18 frames. ], batch size: 2, lr: 3.87e-03, grad_scale: 16.0
2023-11-01 01:25:59,844 INFO [train.py:764] Epoch 1, batch 6800, train_loss[loss=3.743, ArTop10Accuracy=0.6712, NarTop10Accuracy=0.473, over 1250.00 frames. ], tot_loss[loss=3.378, ArTop10Accuracy=0.747, NarTop10Accuracy=0.5415, over 1211.75 frames. ], batch size: 3, lr: 3.85e-03, grad_scale: 16.0
2023-11-01 01:26:21,673 INFO [train.py:764] Epoch 1, batch 6900, train_loss[loss=3.288, ArTop10Accuracy=0.794, NarTop10Accuracy=0.5091, over 1199.00 frames. ], tot_loss[loss=3.368, ArTop10Accuracy=0.748, NarTop10Accuracy=0.5445, over 1203.83 frames. ], batch size: 3, lr: 3.83e-03, grad_scale: 16.0
2023-11-01 01:26:43,753 INFO [train.py:764] Epoch 1, batch 7000, train_loss[loss=3.611, ArTop10Accuracy=0.6679, NarTop10Accuracy=0.559, over 1054.00 frames. ], tot_loss[loss=3.379, ArTop10Accuracy=0.7463, NarTop10Accuracy=0.5433, over 1205.70 frames. ], batch size: 2, lr: 3.81e-03, grad_scale: 16.0
2023-11-01 01:26:44,377 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.938e+01 3.640e+01 3.899e+01 4.197e+01 1.539e+02, threshold=7.798e+01, percent-clipped=0.2
2023-11-01 01:27:05,623 INFO [train.py:764] Epoch 1, batch 7100, train_loss[loss=3.34, ArTop10Accuracy=0.7641, NarTop10Accuracy=0.5495, over 1221.00 frames. ], tot_loss[loss=3.364, ArTop10Accuracy=0.7483, NarTop10Accuracy=0.5462, over 1200.17 frames. ], batch size: 3, lr: 3.79e-03, grad_scale: 16.0
2023-11-01 01:27:27,790 INFO [train.py:764] Epoch 1, batch 7200, train_loss[loss=3.402, ArTop10Accuracy=0.7728, NarTop10Accuracy=0.522, over 1109.00 frames. ], tot_loss[loss=3.349, ArTop10Accuracy=0.75, NarTop10Accuracy=0.551, over 1200.81 frames. ], batch size: 2, lr: 3.78e-03, grad_scale: 16.0
2023-11-01 01:27:49,946 INFO [train.py:764] Epoch 1, batch 7300, train_loss[loss=3.102, ArTop10Accuracy=0.7641, NarTop10Accuracy=0.645, over 1047.00 frames. ], tot_loss[loss=3.358, ArTop10Accuracy=0.7507, NarTop10Accuracy=0.5483, over 1199.39 frames. ], batch size: 2, lr: 3.76e-03, grad_scale: 16.0
2023-11-01 01:28:12,023 INFO [train.py:764] Epoch 1, batch 7400, train_loss[loss=3.492, ArTop10Accuracy=0.7566, NarTop10Accuracy=0.49, over 1142.00 frames. ], tot_loss[loss=3.35, ArTop10Accuracy=0.7503, NarTop10Accuracy=0.552, over 1204.30 frames. ], batch size: 2, lr: 3.74e-03, grad_scale: 16.0
2023-11-01 01:28:34,245 INFO [train.py:764] Epoch 1, batch 7500, train_loss[loss=3.834, ArTop10Accuracy=0.6968, NarTop10Accuracy=0.4323, over 1019.00 frames. ], tot_loss[loss=3.359, ArTop10Accuracy=0.7492, NarTop10Accuracy=0.5488, over 1211.85 frames. ], batch size: 2, lr: 3.72e-03, grad_scale: 16.0
2023-11-01 01:28:56,204 INFO [train.py:764] Epoch 1, batch 7600, train_loss[loss=3.051, ArTop10Accuracy=0.7471, NarTop10Accuracy=0.6682, over 1538.00 frames. ], tot_loss[loss=3.346, ArTop10Accuracy=0.7511, NarTop10Accuracy=0.5516, over 1203.46 frames. ], batch size: 3, lr: 3.71e-03, grad_scale: 16.0
2023-11-01 01:29:18,414 INFO [train.py:764] Epoch 1, batch 7700, train_loss[loss=3.394, ArTop10Accuracy=0.7467, NarTop10Accuracy=0.508, over 1500.00 frames. ], tot_loss[loss=3.342, ArTop10Accuracy=0.7508, NarTop10Accuracy=0.5528, over 1210.10 frames. ], batch size: 3, lr: 3.69e-03, grad_scale: 16.0
2023-11-01 01:29:40,478 INFO [train.py:764] Epoch 1, batch 7800, train_loss[loss=3.565, ArTop10Accuracy=0.7138, NarTop10Accuracy=0.5097, over 1300.00 frames. ], tot_loss[loss=3.345, ArTop10Accuracy=0.7497, NarTop10Accuracy=0.553, over 1207.39 frames. ], batch size: 3, lr: 3.67e-03, grad_scale: 16.0
2023-11-01 01:30:02,574 INFO [train.py:764] Epoch 1, batch 7900, train_loss[loss=3.114, ArTop10Accuracy=0.7528, NarTop10Accuracy=0.6468, over 1323.00 frames. ], tot_loss[loss=3.348, ArTop10Accuracy=0.7485, NarTop10Accuracy=0.5526, over 1205.69 frames. ], batch size: 3, lr: 3.66e-03, grad_scale: 16.0
2023-11-01 01:30:24,766 INFO [train.py:764] Epoch 1, batch 8000, train_loss[loss=3.416, ArTop10Accuracy=0.7549, NarTop10Accuracy=0.5289, over 1314.00 frames. ], tot_loss[loss=3.344, ArTop10Accuracy=0.7481, NarTop10Accuracy=0.553, over 1204.55 frames. ], batch size: 2, lr: 3.64e-03, grad_scale: 16.0
2023-11-01 01:30:25,379 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.581e+01 3.622e+01 3.870e+01 4.187e+01 1.276e+02, threshold=7.739e+01, percent-clipped=0.3
2023-11-01 01:30:47,002 INFO [train.py:764] Epoch 1, batch 8100, train_loss[loss=3.579, ArTop10Accuracy=0.7254, NarTop10Accuracy=0.493, over 1453.00 frames. ], tot_loss[loss=3.368, ArTop10Accuracy=0.7466, NarTop10Accuracy=0.5466, over 1201.98 frames. ], batch size: 3, lr: 3.62e-03, grad_scale: 16.0
2023-11-01 01:31:09,160 INFO [train.py:764] Epoch 1, batch 8200, train_loss[loss=3.123, ArTop10Accuracy=0.7755, NarTop10Accuracy=0.6341, over 1198.00 frames. ], tot_loss[loss=3.377, ArTop10Accuracy=0.7455, NarTop10Accuracy=0.5452, over 1201.66 frames. ], batch size: 3, lr: 3.61e-03, grad_scale: 16.0
2023-11-01 01:31:31,231 INFO [train.py:764] Epoch 1, batch 8300, train_loss[loss=3.518, ArTop10Accuracy=0.7555, NarTop10Accuracy=0.4488, over 1354.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.7486, NarTop10Accuracy=0.5461, over 1200.11 frames. ], batch size: 3, lr: 3.59e-03, grad_scale: 32.0
2023-11-01 01:31:53,404 INFO [train.py:764] Epoch 1, batch 8400, train_loss[loss=3.782, ArTop10Accuracy=0.7124, NarTop10Accuracy=0.4265, over 1064.00 frames. ], tot_loss[loss=3.363, ArTop10Accuracy=0.7499, NarTop10Accuracy=0.5457, over 1201.74 frames. ], batch size: 2, lr: 3.58e-03, grad_scale: 32.0
2023-11-01 01:32:15,398 INFO [train.py:764] Epoch 1, batch 8500, train_loss[loss=3.664, ArTop10Accuracy=0.7246, NarTop10Accuracy=0.4731, over 1082.00 frames. ], tot_loss[loss=3.357, ArTop10Accuracy=0.7498, NarTop10Accuracy=0.5488, over 1189.28 frames. ], batch size: 2, lr: 3.56e-03, grad_scale: 32.0
2023-11-01 01:32:37,449 INFO [train.py:764] Epoch 1, batch 8600, train_loss[loss=3.343, ArTop10Accuracy=0.755, NarTop10Accuracy=0.5416, over 1208.00 frames. ], tot_loss[loss=3.362, ArTop10Accuracy=0.7486, NarTop10Accuracy=0.5467, over 1195.17 frames. ], batch size: 3, lr: 3.54e-03, grad_scale: 32.0
2023-11-01 01:32:59,631 INFO [train.py:764] Epoch 1, batch 8700, train_loss[loss=3.506, ArTop10Accuracy=0.7528, NarTop10Accuracy=0.4847, over 1076.00 frames. ], tot_loss[loss=3.341, ArTop10Accuracy=0.7514, NarTop10Accuracy=0.5544, over 1199.46 frames. ], batch size: 2, lr: 3.53e-03, grad_scale: 32.0
2023-11-01 01:33:21,552 INFO [train.py:764] Epoch 1, batch 8800, train_loss[loss=3.168, ArTop10Accuracy=0.7199, NarTop10Accuracy=0.6646, over 1178.00 frames. ], tot_loss[loss=3.338, ArTop10Accuracy=0.751, NarTop10Accuracy=0.5547, over 1196.85 frames. ], batch size: 3, lr: 3.51e-03, grad_scale: 32.0
2023-11-01 01:33:43,552 INFO [train.py:764] Epoch 1, batch 8900, train_loss[loss=3.424, ArTop10Accuracy=0.7301, NarTop10Accuracy=0.5693, over 1330.00 frames. ], tot_loss[loss=3.332, ArTop10Accuracy=0.7518, NarTop10Accuracy=0.5568, over 1197.63 frames. ], batch size: 3, lr: 3.50e-03, grad_scale: 32.0
2023-11-01 01:34:05,651 INFO [train.py:764] Epoch 1, batch 9000, train_loss[loss=3.032, ArTop10Accuracy=0.7749, NarTop10Accuracy=0.6663, over 1355.00 frames. ], tot_loss[loss=3.344, ArTop10Accuracy=0.7508, NarTop10Accuracy=0.5512, over 1206.31 frames. ], batch size: 3, lr: 3.48e-03, grad_scale: 32.0
2023-11-01 01:34:06,281 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.623e+01 3.625e+01 3.891e+01 4.198e+01 1.204e+02, threshold=7.783e+01, percent-clipped=0.2
2023-11-01 01:34:27,643 INFO [train.py:764] Epoch 1, batch 9100, train_loss[loss=3.115, ArTop10Accuracy=0.7458, NarTop10Accuracy=0.6156, over 1255.00 frames. ], tot_loss[loss=3.325, ArTop10Accuracy=0.7508, NarTop10Accuracy=0.5588, over 1204.55 frames. ], batch size: 3, lr: 3.47e-03, grad_scale: 32.0
2023-11-01 01:34:49,396 INFO [train.py:764] Epoch 1, batch 9200, train_loss[loss=3.593, ArTop10Accuracy=0.7426, NarTop10Accuracy=0.4606, over 1243.00 frames. ], tot_loss[loss=3.323, ArTop10Accuracy=0.7518, NarTop10Accuracy=0.5592, over 1199.74 frames. ], batch size: 3, lr: 3.46e-03, grad_scale: 32.0
2023-11-01 01:35:11,250 INFO [train.py:764] Epoch 1, batch 9300, train_loss[loss=3.347, ArTop10Accuracy=0.786, NarTop10Accuracy=0.5172, over 1243.00 frames. ], tot_loss[loss=3.328, ArTop10Accuracy=0.7537, NarTop10Accuracy=0.5565, over 1196.75 frames. ], batch size: 3, lr: 3.44e-03, grad_scale: 32.0
2023-11-01 01:35:33,210 INFO [train.py:764] Epoch 1, batch 9400, train_loss[loss=3.267, ArTop10Accuracy=0.7515, NarTop10Accuracy=0.5882, over 1501.00 frames. ], tot_loss[loss=3.318, ArTop10Accuracy=0.7535, NarTop10Accuracy=0.5602, over 1201.10 frames. ], batch size: 3, lr: 3.43e-03, grad_scale: 32.0
2023-11-01 01:35:55,287 INFO [train.py:764] Epoch 1, batch 9500, train_loss[loss=3.33, ArTop10Accuracy=0.741, NarTop10Accuracy=0.6289, over 1023.00 frames. ], tot_loss[loss=3.318, ArTop10Accuracy=0.7546, NarTop10Accuracy=0.5603, over 1206.75 frames. ], batch size: 2, lr: 3.41e-03, grad_scale: 32.0
2023-11-01 01:36:17,423 INFO [train.py:764] Epoch 1, batch 9600, train_loss[loss=3.327, ArTop10Accuracy=0.7469, NarTop10Accuracy=0.6008, over 1225.00 frames. ], tot_loss[loss=3.326, ArTop10Accuracy=0.7527, NarTop10Accuracy=0.5576, over 1210.68 frames. ], batch size: 3, lr: 3.40e-03, grad_scale: 32.0
2023-11-01 01:36:39,461 INFO [train.py:764] Epoch 1, batch 9700, train_loss[loss=3.695, ArTop10Accuracy=0.7334, NarTop10Accuracy=0.4212, over 1358.00 frames. ], tot_loss[loss=3.33, ArTop10Accuracy=0.7523, NarTop10Accuracy=0.5554, over 1204.10 frames. ], batch size: 3, lr: 3.38e-03, grad_scale: 32.0
2023-11-01 01:37:01,299 INFO [train.py:764] Epoch 1, batch 9800, train_loss[loss=3.36, ArTop10Accuracy=0.7327, NarTop10Accuracy=0.6126, over 1025.00 frames. ], tot_loss[loss=3.328, ArTop10Accuracy=0.7528, NarTop10Accuracy=0.5558, over 1201.65 frames. ], batch size: 2, lr: 3.37e-03, grad_scale: 32.0
2023-11-01 01:37:23,421 INFO [train.py:764] Epoch 1, batch 9900, train_loss[loss=3.719, ArTop10Accuracy=0.7257, NarTop10Accuracy=0.4459, over 1130.00 frames. ], tot_loss[loss=3.331, ArTop10Accuracy=0.7523, NarTop10Accuracy=0.5558, over 1206.02 frames. ], batch size: 2, lr: 3.36e-03, grad_scale: 32.0
2023-11-01 01:37:45,281 INFO [utils.py:237] Saving checkpoint to exp/valle_dev/checkpoint-10000.pt
2023-11-01 01:37:54,135 INFO [train.py:764] Epoch 1, batch 10000, train_loss[loss=3.626, ArTop10Accuracy=0.7504, NarTop10Accuracy=0.4714, over 1278.00 frames. ], tot_loss[loss=3.326, ArTop10Accuracy=0.7525, NarTop10Accuracy=0.5572, over 1196.17 frames. ], batch size: 3, lr: 3.34e-03, grad_scale: 32.0
2023-11-01 01:37:54,138 INFO [train.py:802] Computing validation loss
2023-11-01 01:41:43,550 INFO [train.py:810] Epoch 1, validation: loss=3.193, ArTop10Accuracy=0.7614, NarTop10Accuracy=0.5796, over 1739106.00 frames.
2023-11-01 01:41:43,550 INFO [train.py:813] Maximum memory allocated so far is 17387MB
2023-11-01 01:41:44,169 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.795e+01 3.605e+01 3.882e+01 4.179e+01 1.156e+02, threshold=7.765e+01, percent-clipped=0.2
2023-11-01 01:42:05,744 INFO [train.py:764] Epoch 1, batch 10100, train_loss[loss=3.277, ArTop10Accuracy=0.7554, NarTop10Accuracy=0.5287, over 1165.00 frames. ], tot_loss[loss=3.317, ArTop10Accuracy=0.754, NarTop10Accuracy=0.5596, over 1204.24 frames. ], batch size: 3, lr: 3.33e-03, grad_scale: 32.0 | |
2023-11-01 01:42:27,939 INFO [train.py:764] Epoch 1, batch 10200, train_loss[loss=3.551, ArTop10Accuracy=0.7434, NarTop10Accuracy=0.4833, over 1243.00 frames. ], tot_loss[loss=3.316, ArTop10Accuracy=0.7536, NarTop10Accuracy=0.5606, over 1202.58 frames. ], batch size: 3, lr: 3.32e-03, grad_scale: 32.0 | |
2023-11-01 01:42:50,826 INFO [train.py:764] Epoch 1, batch 10300, train_loss[loss=3.3, ArTop10Accuracy=0.7664, NarTop10Accuracy=0.508, over 1049.00 frames. ], tot_loss[loss=3.326, ArTop10Accuracy=0.7522, NarTop10Accuracy=0.5587, over 1209.79 frames. ], batch size: 2, lr: 3.30e-03, grad_scale: 64.0 | |
2023-11-01 01:43:12,842 INFO [train.py:764] Epoch 1, batch 10400, train_loss[loss=3.616, ArTop10Accuracy=0.7065, NarTop10Accuracy=0.5428, over 1237.00 frames. ], tot_loss[loss=3.314, ArTop10Accuracy=0.7537, NarTop10Accuracy=0.5616, over 1195.83 frames. ], batch size: 3, lr: 3.29e-03, grad_scale: 64.0 | |
2023-11-01 01:43:34,961 INFO [train.py:764] Epoch 1, batch 10500, train_loss[loss=3.53, ArTop10Accuracy=0.7334, NarTop10Accuracy=0.5249, over 1343.00 frames. ], tot_loss[loss=3.328, ArTop10Accuracy=0.7528, NarTop10Accuracy=0.5573, over 1201.99 frames. ], batch size: 3, lr: 3.28e-03, grad_scale: 64.0 | |
2023-11-01 01:43:57,011 INFO [train.py:764] Epoch 1, batch 10600, train_loss[loss=3.209, ArTop10Accuracy=0.7266, NarTop10Accuracy=0.6411, over 1262.00 frames. ], tot_loss[loss=3.323, ArTop10Accuracy=0.7559, NarTop10Accuracy=0.5564, over 1204.76 frames. ], batch size: 3, lr: 3.27e-03, grad_scale: 16.0 | |
2023-11-01 01:44:19,318 INFO [train.py:764] Epoch 1, batch 10700, train_loss[loss=3.45, ArTop10Accuracy=0.716, NarTop10Accuracy=0.5662, over 1067.00 frames. ], tot_loss[loss=3.324, ArTop10Accuracy=0.7551, NarTop10Accuracy=0.5563, over 1203.94 frames. ], batch size: 2, lr: 3.25e-03, grad_scale: 16.0 | |
2023-11-01 01:44:42,212 INFO [train.py:764] Epoch 1, batch 10800, train_loss[loss=3.276, ArTop10Accuracy=0.7893, NarTop10Accuracy=0.5161, over 992.00 frames. ], tot_loss[loss=3.311, ArTop10Accuracy=0.7567, NarTop10Accuracy=0.5599, over 1206.94 frames. ], batch size: 2, lr: 3.24e-03, grad_scale: 16.0 | |
2023-11-01 01:45:04,261 INFO [train.py:764] Epoch 1, batch 10900, train_loss[loss=3.293, ArTop10Accuracy=0.7395, NarTop10Accuracy=0.5362, over 833.00 frames. ], tot_loss[loss=3.312, ArTop10Accuracy=0.7552, NarTop10Accuracy=0.5621, over 1197.04 frames. ], batch size: 1, lr: 3.23e-03, grad_scale: 16.0 | |
2023-11-01 01:45:26,606 INFO [train.py:764] Epoch 1, batch 11000, train_loss[loss=3.684, ArTop10Accuracy=0.7243, NarTop10Accuracy=0.4497, over 1023.00 frames. ], tot_loss[loss=3.319, ArTop10Accuracy=0.7556, NarTop10Accuracy=0.5566, over 1192.92 frames. ], batch size: 2, lr: 3.22e-03, grad_scale: 16.0 | |
2023-11-01 01:45:27,631 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.656e+01 3.581e+01 3.876e+01 4.218e+01 2.198e+02, threshold=7.753e+01, percent-clipped=0.5 | |
2023-11-01 01:45:48,570 INFO [train.py:764] Epoch 1, batch 11100, train_loss[loss=3.345, ArTop10Accuracy=0.7677, NarTop10Accuracy=0.5429, over 1218.00 frames. ], tot_loss[loss=3.328, ArTop10Accuracy=0.7565, NarTop10Accuracy=0.5523, over 1188.64 frames. ], batch size: 3, lr: 3.20e-03, grad_scale: 16.0 | |
2023-11-01 01:46:11,057 INFO [train.py:764] Epoch 1, batch 11200, train_loss[loss=3.674, ArTop10Accuracy=0.7002, NarTop10Accuracy=0.4926, over 1451.00 frames. ], tot_loss[loss=3.328, ArTop10Accuracy=0.7558, NarTop10Accuracy=0.5533, over 1199.65 frames. ], batch size: 3, lr: 3.19e-03, grad_scale: 16.0 | |
2023-11-01 01:46:33,210 INFO [train.py:764] Epoch 1, batch 11300, train_loss[loss=3.116, ArTop10Accuracy=0.7731, NarTop10Accuracy=0.6661, over 1516.00 frames. ], tot_loss[loss=3.333, ArTop10Accuracy=0.7557, NarTop10Accuracy=0.5515, over 1198.36 frames. ], batch size: 3, lr: 3.18e-03, grad_scale: 16.0 | |
2023-11-01 01:46:55,584 INFO [train.py:764] Epoch 1, batch 11400, train_loss[loss=3.208, ArTop10Accuracy=0.7723, NarTop10Accuracy=0.6035, over 1010.00 frames. ], tot_loss[loss=3.317, ArTop10Accuracy=0.7566, NarTop10Accuracy=0.5581, over 1207.38 frames. ], batch size: 2, lr: 3.17e-03, grad_scale: 16.0 | |
2023-11-01 01:47:18,045 INFO [train.py:764] Epoch 1, batch 11500, train_loss[loss=2.939, ArTop10Accuracy=0.7846, NarTop10Accuracy=0.6675, over 1114.00 frames. ], tot_loss[loss=3.314, ArTop10Accuracy=0.7566, NarTop10Accuracy=0.5598, over 1217.46 frames. ], batch size: 2, lr: 3.16e-03, grad_scale: 16.0 | |
2023-11-01 01:47:39,987 INFO [train.py:764] Epoch 1, batch 11600, train_loss[loss=3.135, ArTop10Accuracy=0.7496, NarTop10Accuracy=0.6593, over 1178.00 frames. ], tot_loss[loss=3.321, ArTop10Accuracy=0.7559, NarTop10Accuracy=0.558, over 1200.17 frames. ], batch size: 3, lr: 3.15e-03, grad_scale: 16.0 | |
2023-11-01 01:47:56,239 INFO [train.py:648] Reaches end of dataloader. | |
2023-11-01 01:47:56,242 INFO [utils.py:237] Saving checkpoint to exp/valle_dev/epoch-1.pt | |
2023-11-01 01:48:36,133 INFO [train.py:764] Epoch 2, batch 100, train_loss[loss=3.517, ArTop10Accuracy=0.7405, NarTop10Accuracy=0.5475, over 1291.00 frames. ], tot_loss[loss=3.278, ArTop10Accuracy=0.7682, NarTop10Accuracy=0.5618, over 475.14 frames. ], batch size: 3, lr: 3.08e-03, grad_scale: 16.0 | |
2023-11-01 01:48:58,354 INFO [train.py:764] Epoch 2, batch 200, train_loss[loss=3.173, ArTop10Accuracy=0.7737, NarTop10Accuracy=0.5945, over 1264.00 frames. ], tot_loss[loss=3.284, ArTop10Accuracy=0.7679, NarTop10Accuracy=0.5589, over 764.18 frames. ], batch size: 3, lr: 3.07e-03, grad_scale: 16.0 | |
2023-11-01 01:49:20,372 INFO [train.py:764] Epoch 2, batch 300, train_loss[loss=3.15, ArTop10Accuracy=0.7964, NarTop10Accuracy=0.5779, over 1238.00 frames. ], tot_loss[loss=3.279, ArTop10Accuracy=0.7677, NarTop10Accuracy=0.5628, over 933.46 frames. ], batch size: 3, lr: 3.06e-03, grad_scale: 16.0 | |
2023-11-01 01:49:27,316 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.615e+01 3.644e+01 3.906e+01 4.267e+01 1.348e+02, threshold=7.812e+01, percent-clipped=0.3 | |
2023-11-01 01:49:42,335 INFO [train.py:764] Epoch 2, batch 400, train_loss[loss=3.183, ArTop10Accuracy=0.7688, NarTop10Accuracy=0.6139, over 1289.00 frames. ], tot_loss[loss=3.263, ArTop10Accuracy=0.7705, NarTop10Accuracy=0.5659, over 1033.80 frames. ], batch size: 3, lr: 3.05e-03, grad_scale: 16.0 | |
2023-11-01 01:50:04,654 INFO [train.py:764] Epoch 2, batch 500, train_loss[loss=3.224, ArTop10Accuracy=0.7834, NarTop10Accuracy=0.5896, over 974.00 frames. ], tot_loss[loss=3.266, ArTop10Accuracy=0.7696, NarTop10Accuracy=0.5658, over 1103.83 frames. ], batch size: 2, lr: 3.04e-03, grad_scale: 16.0 | |
2023-11-01 01:50:26,894 INFO [train.py:764] Epoch 2, batch 600, train_loss[loss=3.192, ArTop10Accuracy=0.7686, NarTop10Accuracy=0.5814, over 1059.00 frames. ], tot_loss[loss=3.252, ArTop10Accuracy=0.7713, NarTop10Accuracy=0.5709, over 1142.77 frames. ], batch size: 2, lr: 3.02e-03, grad_scale: 16.0 | |
2023-11-01 01:50:49,069 INFO [train.py:764] Epoch 2, batch 700, train_loss[loss=3.272, ArTop10Accuracy=0.7538, NarTop10Accuracy=0.6008, over 1174.00 frames. ], tot_loss[loss=3.253, ArTop10Accuracy=0.7714, NarTop10Accuracy=0.5695, over 1168.33 frames. ], batch size: 3, lr: 3.01e-03, grad_scale: 16.0 | |
2023-11-01 01:51:11,287 INFO [train.py:764] Epoch 2, batch 800, train_loss[loss=3.539, ArTop10Accuracy=0.7563, NarTop10Accuracy=0.4397, over 1227.00 frames. ], tot_loss[loss=3.273, ArTop10Accuracy=0.7692, NarTop10Accuracy=0.5629, over 1178.72 frames. ], batch size: 3, lr: 3.00e-03, grad_scale: 16.0 | |
2023-11-01 01:51:33,655 INFO [train.py:764] Epoch 2, batch 900, train_loss[loss=3.88, ArTop10Accuracy=0.6952, NarTop10Accuracy=0.4376, over 1004.00 frames. ], tot_loss[loss=3.277, ArTop10Accuracy=0.7692, NarTop10Accuracy=0.56, over 1183.23 frames. ], batch size: 2, lr: 2.99e-03, grad_scale: 32.0 | |
2023-11-01 01:51:55,740 INFO [train.py:764] Epoch 2, batch 1000, train_loss[loss=3.467, ArTop10Accuracy=0.7846, NarTop10Accuracy=0.4436, over 1235.00 frames. ], tot_loss[loss=3.28, ArTop10Accuracy=0.768, NarTop10Accuracy=0.5604, over 1186.34 frames. ], batch size: 3, lr: 2.98e-03, grad_scale: 32.0 | |
2023-11-01 01:52:17,749 INFO [train.py:764] Epoch 2, batch 1100, train_loss[loss=3.059, ArTop10Accuracy=0.8348, NarTop10Accuracy=0.57, over 1277.00 frames. ], tot_loss[loss=3.262, ArTop10Accuracy=0.7692, NarTop10Accuracy=0.5665, over 1186.39 frames. ], batch size: 3, lr: 2.97e-03, grad_scale: 32.0 | |
2023-11-01 01:52:40,095 INFO [train.py:764] Epoch 2, batch 1200, train_loss[loss=3.164, ArTop10Accuracy=0.7892, NarTop10Accuracy=0.5847, over 1115.00 frames. ], tot_loss[loss=3.271, ArTop10Accuracy=0.7688, NarTop10Accuracy=0.5645, over 1196.82 frames. ], batch size: 2, lr: 2.96e-03, grad_scale: 32.0 | |
2023-11-01 01:53:02,469 INFO [train.py:764] Epoch 2, batch 1300, train_loss[loss=3.172, ArTop10Accuracy=0.8073, NarTop10Accuracy=0.5677, over 1204.00 frames. ], tot_loss[loss=3.275, ArTop10Accuracy=0.768, NarTop10Accuracy=0.5649, over 1197.53 frames. ], batch size: 3, lr: 2.95e-03, grad_scale: 32.0 | |
2023-11-01 01:53:09,537 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.571e+01 3.646e+01 3.940e+01 4.297e+01 1.030e+02, threshold=7.880e+01, percent-clipped=0.1 | |
2023-11-01 01:53:24,675 INFO [train.py:764] Epoch 2, batch 1400, train_loss[loss=3.376, ArTop10Accuracy=0.7127, NarTop10Accuracy=0.5969, over 1065.00 frames. ], tot_loss[loss=3.259, ArTop10Accuracy=0.7695, NarTop10Accuracy=0.5666, over 1207.86 frames. ], batch size: 2, lr: 2.94e-03, grad_scale: 32.0 | |
2023-11-01 01:53:46,768 INFO [train.py:764] Epoch 2, batch 1500, train_loss[loss=3.237, ArTop10Accuracy=0.7795, NarTop10Accuracy=0.5848, over 1256.00 frames. ], tot_loss[loss=3.255, ArTop10Accuracy=0.7699, NarTop10Accuracy=0.5686, over 1208.58 frames. ], batch size: 3, lr: 2.93e-03, grad_scale: 32.0 | |
2023-11-01 01:54:08,665 INFO [train.py:764] Epoch 2, batch 1600, train_loss[loss=3.242, ArTop10Accuracy=0.726, NarTop10Accuracy=0.6307, over 989.00 frames. ], tot_loss[loss=3.244, ArTop10Accuracy=0.7713, NarTop10Accuracy=0.5715, over 1199.33 frames. ], batch size: 2, lr: 2.92e-03, grad_scale: 32.0 | |
2023-11-01 01:54:30,744 INFO [train.py:764] Epoch 2, batch 1700, train_loss[loss=2.888, ArTop10Accuracy=0.7923, NarTop10Accuracy=0.6888, over 1305.00 frames. ], tot_loss[loss=3.25, ArTop10Accuracy=0.7712, NarTop10Accuracy=0.5707, over 1203.12 frames. ], batch size: 2, lr: 2.91e-03, grad_scale: 32.0 | |
2023-11-01 01:54:53,004 INFO [train.py:764] Epoch 2, batch 1800, train_loss[loss=2.743, ArTop10Accuracy=0.7932, NarTop10Accuracy=0.6948, over 967.00 frames. ], tot_loss[loss=3.27, ArTop10Accuracy=0.7691, NarTop10Accuracy=0.5652, over 1210.34 frames. ], batch size: 2, lr: 2.90e-03, grad_scale: 32.0 | |
2023-11-01 01:55:14,944 INFO [train.py:764] Epoch 2, batch 1900, train_loss[loss=3.528, ArTop10Accuracy=0.7713, NarTop10Accuracy=0.4586, over 1277.00 frames. ], tot_loss[loss=3.274, ArTop10Accuracy=0.7706, NarTop10Accuracy=0.5617, over 1207.27 frames. ], batch size: 3, lr: 2.90e-03, grad_scale: 32.0 | |
2023-11-01 01:55:37,022 INFO [train.py:764] Epoch 2, batch 2000, train_loss[loss=3.354, ArTop10Accuracy=0.7612, NarTop10Accuracy=0.509, over 1030.00 frames. ], tot_loss[loss=3.267, ArTop10Accuracy=0.7721, NarTop10Accuracy=0.5622, over 1205.23 frames. ], batch size: 2, lr: 2.89e-03, grad_scale: 32.0 | |
2023-11-01 01:55:59,142 INFO [train.py:764] Epoch 2, batch 2100, train_loss[loss=3.008, ArTop10Accuracy=0.7989, NarTop10Accuracy=0.6327, over 1253.00 frames. ], tot_loss[loss=3.278, ArTop10Accuracy=0.7713, NarTop10Accuracy=0.5594, over 1207.06 frames. ], batch size: 3, lr: 2.88e-03, grad_scale: 32.0 | |
2023-11-01 01:56:21,029 INFO [train.py:764] Epoch 2, batch 2200, train_loss[loss=3.097, ArTop10Accuracy=0.7775, NarTop10Accuracy=0.6281, over 1272.00 frames. ], tot_loss[loss=3.266, ArTop10Accuracy=0.7719, NarTop10Accuracy=0.563, over 1199.67 frames. ], batch size: 3, lr: 2.87e-03, grad_scale: 32.0 | |
2023-11-01 01:56:43,276 INFO [train.py:764] Epoch 2, batch 2300, train_loss[loss=3.314, ArTop10Accuracy=0.7717, NarTop10Accuracy=0.5371, over 1270.00 frames. ], tot_loss[loss=3.244, ArTop10Accuracy=0.7723, NarTop10Accuracy=0.5711, over 1213.70 frames. ], batch size: 3, lr: 2.86e-03, grad_scale: 32.0 | |
2023-11-01 01:56:50,205 INFO [utils.py:877] Clipping_scale=2.0, grad-norm quartiles 2.446e+01 3.674e+01 3.960e+01 4.425e+01 1.163e+02, threshold=7.920e+01, percent-clipped=0.2 | |
2023-11-01 01:57:05,137 INFO [train.py:764] Epoch 2, batch 2400, train_loss[loss=2.835, ArTop10Accuracy=0.8084, NarTop10Accuracy=0.6297, over 950.00 frames. ], tot_loss[loss=3.257, ArTop10Accuracy=0.7717, NarTop10Accuracy=0.5661, over 1204.28 frames. ], batch size: 2, lr: 2.85e-03, grad_scale: 32.0 | |
2023-11-01 01:57:27,371 INFO [train.py:764] Epoch 2, batch 2500, train_loss[loss=3.096, ArTop10Accuracy=0.7811, NarTop10Accuracy=0.6451, over 1421.00 frames. ], tot_loss[loss=3.266, ArTop10Accuracy=0.771, NarTop10Accuracy=0.5629, over 1206.71 frames. ], batch size: 2, lr: 2.84e-03, grad_scale: 16.0 | |
2023-11-01 01:57:49,515 INFO [train.py:764] Epoch 2, batch 2600, train_loss[loss=3.526, ArTop10Accuracy=0.7525, NarTop10Accuracy=0.499, over 1204.00 frames. ], tot_loss[loss=3.269, ArTop10Accuracy=0.7704, NarTop10Accuracy=0.5633, over 1207.73 frames. ], batch size: 3, lr: 2.83e-03, grad_scale: 16.0 | |
2023-11-01 01:58:11,257 INFO [train.py:764] Epoch 2, batch 2700, train_loss[loss=2.971, ArTop10Accuracy=0.7834, NarTop10Accuracy=0.6313, over 1279.00 frames. ], tot_loss[loss=3.267, ArTop10Accuracy=0.7712, NarTop10Accuracy=0.5641, over 1196.88 frames. ], batch size: 3, lr: 2.82e-03, grad_scale: 16.0 | |