The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_processes` was set to a value of `2`
More than one GPU was found, enabling multi-GPU training.
If this was unintended please pass in `--num_processes=1`.
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Using RTX 3090 or 4000 series which doesn't support faster communication speedups. Ensuring P2P and IB communications are disabled.
01/18/2024 18:29:34 - WARNING - llmtuner.model.parser - We recommend enable `upcast_layernorm` in quantized training.
01/18/2024 18:29:34 - WARNING - llmtuner.model.parser - We recommend enable mixed precision training.
01/18/2024 18:29:34 - WARNING - llmtuner.model.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
[INFO|training_args.py:1838] 2024-01-18 18:29:34,925 >> PyTorch: setting up devices
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/transformers/training_args.py:1751: FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.
warnings.warn(
01/18/2024 18:29:34 - INFO - llmtuner.model.parser - Process rank: 0, device: cuda:0, n_gpu: 1
distributed training: True, compute dtype: None
01/18/2024 18:29:34 - INFO - llmtuner.model.parser - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=False,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.EPOCH,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=4,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_num_input_tokens_seen=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/runs/Jan18_18-29-34_yhyu13fuwuqi,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=10,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_kwargs={},
lr_scheduler_type=SchedulerType.COSINE,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=1.0,
optim=OptimizerNames.ADAMW_TORCH,
optim_args=None,
output_dir=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=1,
per_device_train_batch_size=1,
predict_with_generate=False,
prediction_loss_only=True,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora,
save_on_each_node=False,
save_only_model=False,
save_safetensors=True,
save_steps=1000,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
01/18/2024 18:29:34 - INFO - llmtuner.data.loader - Loading dataset ./glaive-function-calling-v2-llama-factory-convert/simple-function-calling-v2_converted_2000.json...
01/18/2024 18:29:34 - WARNING - llmtuner.data.utils - Checksum failed: missing SHA-1 hash value in dataset_info.json.
01/18/2024 18:29:35 - WARNING - llmtuner.model.parser - We recommend enable `upcast_layernorm` in quantized training.
01/18/2024 18:29:35 - WARNING - llmtuner.model.parser - We recommend enable mixed precision training.
01/18/2024 18:29:35 - WARNING - llmtuner.model.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/transformers/training_args.py:1751: FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.
warnings.warn(
01/18/2024 18:29:35 - INFO - llmtuner.model.parser - Process rank: 1, device: cuda:1, n_gpu: 1
distributed training: True, compute dtype: None
01/18/2024 18:29:35 - INFO - llmtuner.model.parser - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_persistent_workers=False,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=False,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.EPOCH,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=4,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_num_input_tokens_seen=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=1,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/runs/Jan18_18-29-34_yhyu13fuwuqi,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=10,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_kwargs={},
lr_scheduler_type=SchedulerType.COSINE,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=1.0,
optim=OptimizerNames.ADAMW_TORCH,
optim_args=None,
output_dir=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=1,
per_device_train_batch_size=1,
predict_with_generate=False,
prediction_loss_only=True,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora,
save_on_each_node=False,
save_only_model=False,
save_safetensors=True,
save_steps=1000,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
sortish_sampler=False,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
01/18/2024 18:29:35 - INFO - llmtuner.data.loader - Loading dataset ./glaive-function-calling-v2-llama-factory-convert/simple-function-calling-v2_converted_2000.json...
01/18/2024 18:29:35 - WARNING - llmtuner.data.utils - Checksum failed: missing SHA-1 hash value in dataset_info.json.
Using custom data configuration default-cb85ddec01d455d4
Loading Dataset Infos from /home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/datasets/packaged_modules/json
Generating dataset json (/home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96)
Downloading and preparing dataset json/default to /home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96...
Downloading took 0.0 min
Checksum Computation took 0.0 min
Generating train split
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 6640 examples [00:00, 69564.06 examples/s]
Unable to verify splits sizes.
Dataset json downloaded and prepared to /home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96. Subsequent calls will reuse this data.
[INFO|tokenization_utils_base.py:2024] 2024-01-18 18:29:36,121 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2024] 2024-01-18 18:29:36,121 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2024] 2024-01-18 18:29:36,121 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2024] 2024-01-18 18:29:36,121 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2024] 2024-01-18 18:29:36,121 >> loading file tokenizer.json
[INFO|configuration_utils.py] 2024-01-18 18:29:36,160 >> loading configuration file ./models/LMCocktail-10.7B-v1/config.json
[INFO|configuration_utils.py:802] 2024-01-18 18:29:36,161 >> Model config LlamaConfig {
"_name_or_path": "./models/LMCocktail-10.7B-v1",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 48,
"num_key_value_heads": 8,
"pad_token_id": 2,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.36.2",
"use_cache": true,
"vocab_size": 32000
}
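As a rough cross-check of the config above, the base model's parameter count can be derived from its dimensions. This is only a sketch: it assumes the standard Llama layout (no bias terms, consistent with `attention_bias: false`; separate input and output embeddings, consistent with `tie_word_embeddings: false`; head size equal to `hidden_size / num_attention_heads`). The function name `llama_param_count` is made up for illustration.

```python
def llama_param_count(hidden, inter, layers, heads, kv_heads, vocab):
    """Count weights of a Llama-style decoder under the assumptions stated above."""
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim                        # grouped-query K/V width
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim    # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * hidden * inter                            # gate_proj, up_proj, down_proj
    norms = 2 * hidden                                  # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    # embeddings + lm_head + final RMSNorm
    return layers * per_layer + 2 * vocab * hidden + hidden


base_params = llama_param_count(
    hidden=4096, inter=14336, layers=48, heads=32, kv_heads=8, vocab=32000
)
print(f"{base_params:,}")  # 10,731,524,096
```

Adding the 5,111,808 trainable LoRA parameters reported by `llmtuner.model.loader` gives 10,736,635,904, which is the `all params` value in this log.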
01/18/2024 18:29:36 - INFO - llmtuner.model.patcher - Quantizing model to 4 bit.
[INFO|modeling_utils.py:3341] 2024-01-18 18:29:36,179 >> loading weights file ./models/LMCocktail-10.7B-v1/model.safetensors.index.json
[INFO|modeling_utils.py:1341] 2024-01-18 18:29:36,179 >> Instantiating LlamaForCausalLM model under default dtype torch.float16.
[INFO|configuration_utils.py:826] 2024-01-18 18:29:36,179 >> Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 2
}
01/18/2024 18:29:36 - INFO - llmtuner.model.patcher - Quantizing model to 4 bit.
[INFO|modeling_utils.py:3483] 2024-01-18 18:29:37,052 >> Detected 4-bit loading: activating 4-bit loading for this model
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 20%|██ | 1/5 [00:00<00:03, 1.07it/s]
Loading checkpoint shards: 20%|██ | 1/5 [00:00<00:03, 1.02it/s]
Loading checkpoint shards: 40%|████ | 2/5 [00:01<00:02, 1.05it/s]
Loading checkpoint shards: 40%|████ | 2/5 [00:01<00:02, 1.02it/s]
Loading checkpoint shards: 60%|██████ | 3/5 [00:02<00:01, 1.13it/s]
Loading checkpoint shards: 60%|██████ | 3/5 [00:02<00:01, 1.12it/s]
Loading checkpoint shards: 80%|████████ | 4/5 [00:03<00:00, 1.18it/s]
Loading checkpoint shards: 80%|████████ | 4/5 [00:03<00:00, 1.18it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:03<00:00, 1.46it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:03<00:00, 1.29it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:03<00:00, 1.49it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:03<00:00, 1.28it/s]
[INFO|modeling_utils.py:4185] 2024-01-18 18:29:41,340 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
[INFO|modeling_utils.py:4193] 2024-01-18 18:29:41,340 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at ./models/LMCocktail-10.7B-v1.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
[INFO|configuration_utils.py:779] 2024-01-18 18:29:41,344 >> loading configuration file ./models/LMCocktail-10.7B-v1/generation_config.json
[INFO|configuration_utils.py:826] 2024-01-18 18:29:41,344 >> Generate config GenerationConfig {
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 2,
"use_cache": false
}
01/18/2024 18:29:41 - INFO - llmtuner.model.patcher - Gradient checkpointing enabled.
01/18/2024 18:29:41 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
01/18/2024 18:29:41 - INFO - llmtuner.model.loader - trainable params: 5111808 || all params: 10736635904 || trainable%: 0.0476
01/18/2024 18:29:41 - INFO - llmtuner.model.patcher - Gradient checkpointing enabled.
01/18/2024 18:29:41 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
01/18/2024 18:29:41 - INFO - llmtuner.model.loader - trainable params: 5111808 || all params: 10736635904 || trainable%: 0.0476
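The trainable-parameter figure above is consistent with rank-8 LoRA adapters on `q_proj` and `v_proj` in each of the 48 layers. The rank and the target modules are not printed anywhere in this log, so both are assumptions here; they are used only because they reproduce the logged count. The helper name `lora_params` is hypothetical.

```python
def lora_params(in_features, out_features, rank):
    """A LoRA adapter adds A (rank x in) and B (out x rank) next to a frozen weight."""
    return rank * (in_features + out_features)


hidden, layers = 4096, 48          # from the model config in this log
kv_dim = 8 * (4096 // 32)          # num_key_value_heads * head_dim = 1024
rank = 8                           # ASSUMPTION: not stated in the log

per_layer = (
    lora_params(hidden, hidden, rank)    # q_proj: 4096 -> 4096 (assumed target)
    + lora_params(hidden, kv_dim, rank)  # v_proj: 4096 -> 1024 (assumed target)
)
trainable = layers * per_layer
all_params = 10_736_635_904              # "all params" from the log line above

print(trainable)                                  # 5111808
print(round(100 * trainable / all_params, 4))     # 0.0476
```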
Running tokenizer on dataset: 0%| | 0/6640 [00:00<?, ? examples/s]
Caching processed dataset at /home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96/cache-29a204c92e15128c.arrow
Running tokenizer on dataset: 15%|█▌ | 1000/6640 [00:02<00:13, 433.18 examples/s]
Running tokenizer on dataset: 30%|███ | 2000/6640 [00:04<00:10, 437.86 examples/s]
Running tokenizer on dataset: 45%|████▌ | 3000/6640 [00:06<00:08, 446.46 examples/s]
Running tokenizer on dataset: 60%|██████ | 4000/6640 [00:09<00:05, 442.18 examples/s]
Running tokenizer on dataset: 75%|███████▌ | 5000/6640 [00:11<00:03, 460.33 examples/s]
Running tokenizer on dataset: 90%|█████████ | 6000/6640 [00:13<00:01, 454.61 examples/s]
Running tokenizer on dataset: 100%|██████████| 6640/6640 [00:14<00:00, 449.05 examples/s]
Running tokenizer on dataset: 100%|██████████| 6640/6640 [00:14<00:00, 447.75 examples/s]
input_ids:
[1, 774, 1247, 28747, 13, 27842, 28747, 995, 460, 264, 10865, 13892, 395, 2735, 298, 272, 2296, 5572, 28723, 5938, 706, 513, 3030, 387, 13, 28751, 13, 2287, 345, 861, 1264, 345, 527, 28730, 720, 4078, 28730, 6036, 548, 13, 2287, 345, 6518, 1264, 345, 1458, 272, 8877, 4338, 1444, 989, 1191, 951, 20023, 548, 13, 2287, 345, 11438, 1264, 371, 13, 5390, 345, 1123, 1264, 345, 2814, 548, 13, 5390, 345, 10723, 1264, 371, 13, 17422, 345, 2893, 28730, 16714, 1264, 371, 13, 1417, 28705, 345, 1123, 1264, 345, 1427, 548, 13, 1417, 28705, 345, 6518, 1264, 345, 1014, 15547, 298, 6603, 477, 28739, 13, 17422, 1630, 13, 17422, 345, 3731, 28730, 16714, 1264, 371, 13, 1417, 28705, 345, 1123, 1264, 345, 1427, 548, 13, 1417, 28705, 345, 6518, 1264, 345, 1014, 15547, 298, 6603, 298, 28739, 13, 17422, 443, 13, 5390, 1630, 13, 5390, 345, 10893, 1264, 733, 13, 17422, 345, 2893, 28730, 16714, 548, 13, 17422, 345, 3731, 28730, 16714, 28739, 13, 5390, 4709, 13, 2287, 443, 13, 28752, 13, 13, 6325, 368, 1820, 264, 9314, 354, 528, 477, 1450, 2726, 298, 4222, 28804, 13, 13, 27332, 21631, 28747, 13, 315, 28742, 28719, 7371, 28725, 562, 315, 949, 28742, 28707, 506, 272, 21368, 298, 1820, 22447, 28723, 1984, 1868, 908, 5976, 528, 298, 625, 272, 8877, 4338, 1444, 989, 1191, 951, 20023, 28723, 1047, 368, 927, 1316, 395, 369, 28725, 1601, 1933, 298, 1460, 28808, 2]
inputs:
<s> ### User:
SYSTEM: You are a helpful assistant with access to the following functions. Use them if required -
{
"name": "get_exchange_rate",
"description": "Get the exchange rate between two currencies",
"parameters": {
"type": "object",
"properties": {
"base_currency": {
"type": "string",
"description": "The currency to convert from"
},
"target_currency": {
"type": "string",
"description": "The currency to convert to"
}
},
"required": [
"base_currency",
"target_currency"
]
}
}
Can you book a flight for me from New York to London?
### Assistant:
I'm sorry, but I don't have the capability to book flights. My current function allows me to get the exchange rate between two currencies. If you need help with that, feel free to ask!</s>
label_ids:
[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 315, 28742, 28719, 7371, 28725, 562, 315, 949, 28742, 28707, 506, 272, 21368, 298, 1820, 22447, 28723, 1984, 1868, 908, 5976, 528, 298, 625, 272, 8877, 4338, 1444, 989, 1191, 951, 20023, 28723, 1047, 368, 927, 1316, 395, 369, 28725, 1601, 1933, 298, 1460, 28808, 2]
labels:
I'm sorry, but I don't have the capability to book flights. My current function allows me to get the exchange rate between two currencies. If you need help with that, feel free to ask!</s>
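The `label_ids` dump above shows the usual supervised fine-tuning masking: prompt positions carry `-100` and only the assistant reply keeps its token ids. A minimal sketch of that construction follows. It is not LLaMA-Factory's actual preprocessing code; `build_labels` is a hypothetical helper, and the short id lists are just fragments copied from the dump. `-100` is the default `ignore_index` of PyTorch's cross-entropy loss, so masked positions contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask the prompt part of the labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


prompt = [1, 774, 1247, 28747]                 # first ids of the prompt in the dump
response = [1601, 1933, 298, 1460, 28808, 2]   # last ids of the reply, ending in EOS (2)

input_ids, labels = build_labels(prompt, response)
print(labels)  # [-100, -100, -100, -100, 1601, 1933, 298, 1460, 28808, 2]
```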
[INFO|training_args.py:1838] 2024-01-18 18:29:57,465 >> PyTorch: setting up devices
Running tokenizer on dataset: 0%| | 0/6640 [00:00<?, ? examples/s]
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/transformers/training_args.py:1751: FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.
warnings.warn(
Caching indices mapping at /home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96/cache-0202cc6fbd77ae6d.arrow
Caching indices mapping at /home/hangyu5/.cache/huggingface/datasets/json/default-cb85ddec01d455d4/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96/cache-b938353c59a96ce3.arrow
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
Running tokenizer on dataset: 15%|█▌ | 1000/6640 [00:02<00:13, 430.03 examples/s]
Running tokenizer on dataset: 30%|███ | 2000/6640 [00:04<00:10, 437.41 examples/s]
Running tokenizer on dataset: 45%|████▌ | 3000/6640 [00:06<00:08, 445.02 examples/s]
Running tokenizer on dataset: 60%|██████ | 4000/6640 [00:09<00:06, 439.88 examples/s]
Running tokenizer on dataset: 75%|███████▌ | 5000/6640 [00:11<00:03, 456.98 examples/s]
Running tokenizer on dataset: 90%|█████████ | 6000/6640 [00:13<00:01, 450.87 examples/s]
Running tokenizer on dataset: 100%|██████████| 6640/6640 [00:14<00:00, 445.30 examples/s]
Running tokenizer on dataset: 100%|██████████| 6640/6640 [00:14<00:00, 444.75 examples/s]
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/transformers/training_args.py:1751: FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.
warnings.warn(
[INFO|trainer.py:1706] 2024-01-18 18:30:12,809 >> ***** Running training *****
[INFO|trainer.py:1707] 2024-01-18 18:30:12,809 >> Num examples = 5,975
[INFO|trainer.py:1708] 2024-01-18 18:30:12,809 >> Num Epochs = 1
[INFO|trainer.py:1709] 2024-01-18 18:30:12,809 >> Instantaneous batch size per device = 1
[INFO|trainer.py:1712] 2024-01-18 18:30:12,809 >> Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:1713] 2024-01-18 18:30:12,809 >> Gradient Accumulation steps = 4
[INFO|trainer.py:1714] 2024-01-18 18:30:12,809 >> Total optimization steps = 747
[INFO|trainer.py:1715] 2024-01-18 18:30:12,812 >> Number of trainable parameters = 5,111,808
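The batch-size and step figures in the summary above follow from the arguments printed earlier (`per_device_train_batch_size=1`, two processes, `gradient_accumulation_steps=4`, `num_train_epochs=1.0`). The sketch below reproduces them; the formula is my understanding of how the Trainer derives the step count under distributed sampling, not a copy of its code, and `total_optimization_steps` is a made-up name.

```python
import math


def total_optimization_steps(num_examples, per_device_batch, world_size,
                             grad_accum, epochs):
    """Optimizer updates for a run, assuming each rank gets ceil(n / world_size) samples."""
    samples_per_rank = math.ceil(num_examples / world_size)
    batches_per_rank = math.ceil(samples_per_rank / per_device_batch)
    updates_per_epoch = max(batches_per_rank // grad_accum, 1)
    return math.ceil(epochs * updates_per_epoch)


effective_batch = 1 * 2 * 4   # per-device batch x processes x accumulation steps
steps = total_optimization_steps(5975, per_device_batch=1, world_size=2,
                                 grad_accum=4, epochs=1.0)
print(effective_batch, steps)  # 8 747
```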
01/18/2024 18:30:14 - WARNING - llmtuner.extras.callbacks - Previous log file in this folder will be deleted.
0%| | 0/747 [00:00<?, ?it/s]
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
/home/hangyu5/anaconda3/envs/llama_factory/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
0%| | 1/747 [00:04<54:33, 4.39s/it]
0%| | 2/747 [00:08<49:50, 4.01s/it]
0%| | 3/747 [00:11<45:28, 3.67s/it]
1%| | 4/747 [00:15<46:47, 3.78s/it]
1%| | 5/747 [00:18<45:55, 3.71s/it]
1%| | 6/747 [00:22<44:59, 3.64s/it]
1%| | 7/747 [00:25<42:55, 3.48s/it]
1%| | 8/747 [00:29<43:37, 3.54s/it]
1%| | 9/747 [00:32<42:28, 3.45s/it]
1%|▏ | 10/747 [00:36<43:20, 3.53s/it]
{'loss': 1.6077, 'learning_rate': 4.997789428625975e-05, 'epoch': 0.01}
1%|▏ | 11/747 [00:39<44:03, 3.59s/it]
2%|▏ | 12/747 [00:44<46:28, 3.79s/it]
2%|▏ | 13/747 [00:47<44:55, 3.67s/it]
2%|▏ | 14/747 [00:51<44:30, 3.64s/it]
2%|▏ | 15/747 [00:54<43:19, 3.55s/it]
2%|▏ | 16/747 [00:58<44:17, 3.63s/it]
2%|▏ | 17/747 [01:02<46:13, 3.80s/it]
2%|▏ | 18/747 [01:05<41:44, 3.44s/it]
3%|▎ | 19/747 [01:09<46:30, 3.83s/it]
3%|▎ | 20/747 [01:13<45:20, 3.74s/it]
{'loss': 0.9415, 'learning_rate': 4.99116162380454e-05, 'epoch': 0.03}
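The `learning_rate` values logged every 10 steps match a plain cosine decay from `learning_rate=5e-05` over the 747 optimization steps with no warmup (`lr_scheduler_type=SchedulerType.COSINE`, `warmup_steps=0`). A small sketch of that schedule, written from the standard half-cosine formula rather than taken from the library source; `cosine_lr` is a hypothetical name:

```python
import math


def cosine_lr(step, base_lr=5e-5, total_steps=747, warmup_steps=0):
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Compare against the values logged at steps 10 and 20 of this run.
for step, logged in [(10, 4.997789428625975e-05), (20, 4.99116162380454e-05)]:
    print(step, cosine_lr(step), abs(cosine_lr(step) - logged) < 1e-12)
```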
3%|▎ | 21/747 [01:17<46:03, 3.81s/it]
3%|▎ | 22/747 [01:20<44:11, 3.66s/it]
3%|▎ | 23/747 [01:24<43:54, 3.64s/it]
3%|▎ | 24/747 [01:28<45:13, 3.75s/it]
3%|▎ | 25/747 [01:32<45:37, 3.79s/it]
3%|▎ | 26/747 [01:35<44:49, 3.73s/it]
4%|▎ | 27/747 [01:38<41:58, 3.50s/it]
4%|▎ | 28/747 [01:42<41:48, 3.49s/it]
4%|▍ | 29/747 [01:45<41:09, 3.44s/it]
4%|▍ | 30/747 [01:48<39:31, 3.31s/it]
{'loss': 0.6271, 'learning_rate': 4.980128306524183e-05, 'epoch': 0.04}
4%|▍ | 31/747 [01:53<44:09, 3.70s/it]
4%|▍ | 32/747 [01:57<45:09, 3.79s/it]
4%|▍ | 33/747 [02:01<46:56, 3.95s/it]
5%|▍ | 34/747 [02:05<47:15, 3.98s/it]
5%|▍ | 35/747 [02:08<43:38, 3.68s/it]
5%|▍ | 36/747 [02:12<44:24, 3.75s/it]
5%|▍ | 37/747 [02:15<42:45, 3.61s/it]
5%|▌ | 38/747 [02:19<43:40, 3.70s/it]
5%|▌ | 39/747 [02:23<45:40, 3.87s/it]
5%|▌ | 40/747 [02:27<43:28, 3.69s/it]
{'loss': 0.5335, 'learning_rate': 4.964708988733178e-05, 'epoch': 0.05}
5%|▌ | 41/747 [02:30<42:05, 3.58s/it]
6%|▌ | 42/747 [02:33<41:36, 3.54s/it]
6%|▌ | 43/747 [02:37<40:13, 3.43s/it]
6%|▌ | 44/747 [02:40<40:06, 3.42s/it]
6%|▌ | 45/747 [02:44<43:06, 3.68s/it]
6%|▌ | 46/747 [02:48<41:41, 3.57s/it]
6%|▋ | 47/747 [02:51<40:40, 3.49s/it]
6%|▋ | 48/747 [02:54<39:07, 3.36s/it]
7%|▋ | 49/747 [02:57<39:12, 3.37s/it]
7%|▋ | 50/747 [03:01<41:17, 3.55s/it]
{'loss': 0.3693, 'learning_rate': 4.944930938833535e-05, 'epoch': 0.07}
7%|▋ | 51/747 [03:05<41:19, 3.56s/it]
7%|▋ | 52/747 [03:09<42:08, 3.64s/it]
7%|▋ | 53/747 [03:13<45:41, 3.95s/it]
7%|▋ | 54/747 [03:18<46:40, 4.04s/it]
7%|▋ | 55/747 [03:21<45:02, 3.91s/it]
7%|▋ | 56/747 [03:25<45:59, 3.99s/it]
8%|▊ | 57/747 [03:29<45:45, 3.98s/it]
8%|▊ | 58/747 [03:32<41:27, 3.61s/it]
8%|▊ | 59/747 [03:37<44:19, 3.87s/it]
8%|▊ | 60/747 [03:39<40:38, 3.55s/it]
{'loss': 0.47, 'learning_rate': 4.9208291334580104e-05, 'epoch': 0.08}
8%|▊ | 61/747 [03:42<38:14, 3.35s/it]
8%|▊ | 62/747 [03:46<40:18, 3.53s/it]
8%|▊ | 63/747 [03:50<42:07, 3.69s/it]
9%|▊ | 64/747 [03:55<44:29, 3.91s/it]
9%|▊ | 65/747 [03:59<44:23, 3.91s/it]
9%|▉ | 66/747 [04:01<40:05, 3.53s/it]
9%|▉ | 67/747 [04:05<39:24, 3.48s/it]
9%|▉ | 68/747 [04:08<40:36, 3.59s/it]
9%|▉ | 69/747 [04:12<41:55, 3.71s/it]
9%|▉ | 70/747 [04:16<40:59, 3.63s/it]
{'loss': 0.4268, 'learning_rate': 4.892446195615423e-05, 'epoch': 0.09}
10%|▉ | 71/747 [04:19<40:31, 3.60s/it]
10%|▉ | 72/747 [04:23<41:52, 3.72s/it]
10%|▉ | 73/747 [04:26<39:10, 3.49s/it]
10%|▉ | 74/747 [04:30<40:53, 3.65s/it]
10%|█ | 75/747 [04:35<43:08, 3.85s/it]
10%|█ | 76/747 [04:39<44:06, 3.94s/it]
10%|█ | 77/747 [04:43<43:46, 3.92s/it]
10%|█ | 78/747 [04:47<44:10, 3.96s/it]
11%|█ | 79/747 [04:50<41:48, 3.76s/it]
11%|█ | 80/747 [04:53<40:12, 3.62s/it]
{'loss': 0.3653, 'learning_rate': 4.859832319313697e-05, 'epoch': 0.11}
11%|█ | 81/747 [04:56<38:02, 3.43s/it]
11%|█ | 82/747 [05:01<42:36, 3.84s/it]
11%|█ | 83/747 [05:06<44:42, 4.04s/it]
11%|█ | 84/747 [05:10<44:48, 4.06s/it]
11%|█▏ | 85/747 [05:13<43:07, 3.91s/it]
12%|█▏ | 86/747 [05:17<43:07, 3.91s/it]
12%|█▏ | 87/747 [05:21<42:03, 3.82s/it]
12%|█▏ | 88/747 [05:24<39:53, 3.63s/it]
12%|█▏ | 89/747 [05:28<40:23, 3.68s/it]
12%|█▏ | 90/747 [05:31<39:12, 3.58s/it]
{'loss': 0.3466, 'learning_rate': 4.8230451807939135e-05, 'epoch': 0.12}
12%|█▏ | 91/747 [05:35<39:19, 3.60s/it]
12%|█▏ | 92/747 [05:38<39:28, 3.62s/it]
12%|█▏ | 93/747 [05:41<36:37, 3.36s/it]
13%|█▎ | 94/747 [05:45<37:12, 3.42s/it]
13%|█▎ | 95/747 [05:48<37:15, 3.43s/it]
13%|█▎ | 96/747 [05:52<38:25, 3.54s/it]
13%|█▎ | 97/747 [05:55<37:57, 3.50s/it]
13%|█▎ | 98/747 [05:59<36:53, 3.41s/it]
13%|█▎ | 99/747 [06:03<40:22, 3.74s/it]
13%|█▎ | 100/747 [06:07<41:25, 3.84s/it]
{'loss': 0.3241, 'learning_rate': 4.782149836532345e-05, 'epoch': 0.13}
14%|█▎ | 101/747 [06:12<44:22, 4.12s/it]
14%|█▎ | 102/747 [06:15<41:30, 3.86s/it]
14%|█▍ | 103/747 [06:19<41:29, 3.87s/it]
14%|█▍ | 104/747 [06:22<38:59, 3.64s/it]
14%|█▍ | 105/747 [06:26<40:53, 3.82s/it]
14%|█▍ | 106/747 [06:31<42:15, 3.96s/it]
14%|█▍ | 107/747 [06:37<48:17, 4.53s/it]
14%|█▍ | 108/747 [06:40<44:45, 4.20s/it]
15%|█▍ | 109/747 [06:44<43:11, 4.06s/it]
15%|█▍ | 110/747 [06:47<41:15, 3.89s/it]
{'loss': 0.3763, 'learning_rate': 4.737218608190878e-05, 'epoch': 0.15}
15%|█▍ | 111/747 [06:52<43:19, 4.09s/it]
15%|█▍ | 112/747 [06:54<38:03, 3.60s/it]
15%|█▌ | 113/747 [06:58<36:59, 3.50s/it]
15%|█▌ | 114/747 [07:02<40:07, 3.80s/it]
15%|█▌ | 115/747 [07:06<40:10, 3.81s/it]
16%|█▌ | 116/747 [07:09<38:19, 3.64s/it]
16%|█▌ | 117/747 [07:14<42:37, 4.06s/it]
16%|█▌ | 118/747 [07:19<43:59, 4.20s/it]
16%|█▌ | 119/747 [07:22<40:29, 3.87s/it]
16%|█▌ | 120/747 [07:25<39:00, 3.73s/it]
{'loss': 0.3333, 'learning_rate': 4.6883309547192476e-05, 'epoch': 0.16}
16%|█▌ | 121/747 [07:28<36:38, 3.51s/it]
16%|█▋ | 122/747 [07:32<35:52, 3.44s/it]
16%|█▋ | 123/747 [07:35<37:16, 3.58s/it]
17%|█▋ | 124/747 [07:38<34:36, 3.33s/it]
17%|█▋ | 125/747 [07:41<33:52, 3.27s/it]
17%|█▋ | 126/747 [07:45<35:12, 3.40s/it]
17%|█▋ | 127/747 [07:49<37:30, 3.63s/it]
17%|█▋ | 128/747 [07:53<36:49, 3.57s/it]
17%|█▋ | 129/747 [07:58<40:54, 3.97s/it]
17%|█▋ | 130/747 [08:01<39:22, 3.83s/it]
{'loss': 0.3106, 'learning_rate': 4.635573331835302e-05, 'epoch': 0.17}
18%|█▊ | 131/747 [08:05<38:38, 3.76s/it]
18%|█▊ | 132/747 [08:08<37:52, 3.69s/it]
18%|█▊ | 133/747 [08:13<41:27, 4.05s/it]
18%|█▊ | 134/747 [08:17<41:13, 4.04s/it]
18%|█▊ | 135/747 [08:21<39:56, 3.92s/it]
18%|█▊ | 136/747 [08:24<37:38, 3.70s/it]
18%|█▊ | 137/747 [08:28<37:42, 3.71s/it]
18%|█▊ | 138/747 [08:32<39:26, 3.89s/it]
19%|█▊ | 139/747 [08:36<38:49, 3.83s/it]
19%|█▊ | 140/747 [08:39<37:55, 3.75s/it]
{'loss': 0.3241, 'learning_rate': 4.5790390391317675e-05, 'epoch': 0.19}
19%|█▉ | 141/747 [08:44<42:15, 4.18s/it]
19%|█▉ | 142/747 [08:47<37:44, 3.74s/it]
19%|█▉ | 143/747 [08:50<34:41, 3.45s/it]
19%|█▉ | 144/747 [08:53<32:34, 3.24s/it]
19%|█▉ | 145/747 [08:56<33:42, 3.36s/it]
20%|█▉ | 146/747 [09:00<34:12, 3.41s/it]
20%|█▉ | 147/747 [09:04<35:27, 3.55s/it]
20%|█▉ | 148/747 [09:08<36:34, 3.66s/it]
20%|█▉ | 149/747 [09:11<35:01, 3.51s/it]
20%|██ | 150/747 [09:15<36:44, 3.69s/it]
{'loss': 0.2781, 'learning_rate': 4.518828055079925e-05, 'epoch': 0.2}
20%|██ | 151/747 [09:18<36:25, 3.67s/it]
20%|██ | 152/747 [09:23<38:47, 3.91s/it]
20%|██ | 153/747 [09:26<36:17, 3.67s/it]
21%|██ | 154/747 [09:30<37:02, 3.75s/it]
21%|██ | 155/747 [09:33<36:17, 3.68s/it]
21%|██ | 156/747 [09:38<38:21, 3.89s/it]
21%|██ | 157/747 [09:42<39:22, 4.00s/it]
21%|██ | 158/747 [09:46<38:14, 3.90s/it]
21%|██▏ | 159/747 [09:49<36:44, 3.75s/it]
21%|██▏ | 160/747 [09:54<38:24, 3.93s/it]
{'loss': 0.3, 'learning_rate': 4.4550468602219716e-05, 'epoch': 0.21}
22%|β–ˆβ–ˆβ– | 161/747 [09:57<38:16, 3.92s/it]
22%|β–ˆβ–ˆβ– | 162/747 [10:01<37:12, 3.82s/it]
22%|β–ˆβ–ˆβ– | 163/747 [10:05<37:25, 3.84s/it]
22%|β–ˆβ–ˆβ– | 164/747 [10:09<37:19, 3.84s/it]
22%|β–ˆβ–ˆβ– | 165/747 [10:12<36:11, 3.73s/it]
22%|β–ˆβ–ˆβ– | 166/747 [10:15<33:07, 3.42s/it]
22%|β–ˆβ–ˆβ– | 167/747 [10:19<36:10, 3.74s/it]
22%|β–ˆβ–ˆβ– | 168/747 [10:23<34:50, 3.61s/it]
23%|β–ˆβ–ˆβ–Ž | 169/747 [10:25<32:16, 3.35s/it]
23%|β–ˆβ–ˆβ–Ž | 170/747 [10:29<33:14, 3.46s/it]
{'loss': 0.3229, 'learning_rate': 4.387808248864751e-05, 'epoch': 0.23}
23%|β–ˆβ–ˆβ–Ž | 171/747 [10:32<32:23, 3.37s/it]
23%|β–ˆβ–ˆβ–Ž | 172/747 [10:36<32:28, 3.39s/it]
23%|β–ˆβ–ˆβ–Ž | 173/747 [10:40<34:55, 3.65s/it]
23%|β–ˆβ–ˆβ–Ž | 174/747 [10:44<36:24, 3.81s/it]
23%|β–ˆβ–ˆβ–Ž | 175/747 [10:48<35:55, 3.77s/it]
24%|β–ˆβ–ˆβ–Ž | 176/747 [10:52<36:41, 3.86s/it]
24%|β–ˆβ–ˆβ–Ž | 177/747 [10:55<33:11, 3.49s/it]
24%|β–ˆβ–ˆβ– | 178/747 [10:58<31:57, 3.37s/it]
24%|β–ˆβ–ˆβ– | 179/747 [11:02<33:24, 3.53s/it]
24%|β–ˆβ–ˆβ– | 180/747 [11:05<33:06, 3.50s/it]
{'loss': 0.3347, 'learning_rate': 4.3172311296078595e-05, 'epoch': 0.24}
24%|β–ˆβ–ˆβ– | 181/747 [11:08<31:05, 3.30s/it]
24%|β–ˆβ–ˆβ– | 182/747 [11:12<33:42, 3.58s/it]
24%|β–ˆβ–ˆβ– | 183/747 [11:17<36:09, 3.85s/it]
25%|β–ˆβ–ˆβ– | 184/747 [11:19<33:27, 3.57s/it]
25%|β–ˆβ–ˆβ– | 185/747 [11:23<34:03, 3.64s/it]
25%|β–ˆβ–ˆβ– | 186/747 [11:27<34:45, 3.72s/it]
25%|β–ˆβ–ˆβ–Œ | 187/747 [11:30<33:20, 3.57s/it]
25%|β–ˆβ–ˆβ–Œ | 188/747 [11:35<36:27, 3.91s/it]
25%|β–ˆβ–ˆβ–Œ | 189/747 [11:38<34:23, 3.70s/it]
25%|β–ˆβ–ˆβ–Œ | 190/747 [11:42<34:15, 3.69s/it]
{'loss': 0.2897, 'learning_rate': 4.2434403150588895e-05, 'epoch': 0.25}
26%|β–ˆβ–ˆβ–Œ | 191/747 [11:45<33:34, 3.62s/it]
26%|β–ˆβ–ˆβ–Œ | 192/747 [11:48<31:47, 3.44s/it]
26%|β–ˆβ–ˆβ–Œ | 193/747 [11:51<30:28, 3.30s/it]
26%|β–ˆβ–ˆβ–Œ | 194/747 [11:55<31:25, 3.41s/it]
26%|β–ˆβ–ˆβ–Œ | 195/747 [11:58<29:39, 3.22s/it]
26%|β–ˆβ–ˆβ–Œ | 196/747 [12:02<30:50, 3.36s/it]
26%|β–ˆβ–ˆβ–‹ | 197/747 [12:05<31:53, 3.48s/it]
27%|β–ˆβ–ˆβ–‹ | 198/747 [12:09<33:28, 3.66s/it]
27%|β–ˆβ–ˆβ–‹ | 199/747 [12:15<38:03, 4.17s/it]
27%|β–ˆβ–ˆβ–‹ | 200/747 [12:18<36:06, 3.96s/it]
{'loss': 0.2951, 'learning_rate': 4.166566301107687e-05, 'epoch': 0.27}
27%|β–ˆβ–ˆβ–‹ | 201/747 [12:22<36:42, 4.03s/it]
27%|β–ˆβ–ˆβ–‹ | 202/747 [12:26<34:31, 3.80s/it]
27%|β–ˆβ–ˆβ–‹ | 203/747 [12:28<31:49, 3.51s/it]
27%|β–ˆβ–ˆβ–‹ | 204/747 [12:32<30:31, 3.37s/it]
27%|β–ˆβ–ˆβ–‹ | 205/747 [12:36<34:19, 3.80s/it]
28%|β–ˆβ–ˆβ–Š | 206/747 [12:40<34:05, 3.78s/it]
28%|β–ˆβ–ˆβ–Š | 207/747 [12:43<32:30, 3.61s/it]
28%|β–ˆβ–ˆβ–Š | 208/747 [12:48<34:48, 3.88s/it]
28%|β–ˆβ–ˆβ–Š | 209/747 [12:52<35:24, 3.95s/it]
28%|β–ˆβ–ˆβ–Š | 210/747 [12:56<34:42, 3.88s/it]
{'loss': 0.3081, 'learning_rate': 4.08674503614997e-05, 'epoch': 0.28}
28%|β–ˆβ–ˆβ–Š | 211/747 [12:58<31:40, 3.55s/it]
28%|β–ˆβ–ˆβ–Š | 212/747 [13:02<32:41, 3.67s/it]
29%|β–ˆβ–ˆβ–Š | 213/747 [13:06<33:28, 3.76s/it]
29%|β–ˆβ–ˆβ–Š | 214/747 [13:09<31:33, 3.55s/it]
29%|β–ˆβ–ˆβ–‰ | 215/747 [13:12<29:31, 3.33s/it]
29%|β–ˆβ–ˆβ–‰ | 216/747 [13:15<27:28, 3.11s/it]
29%|β–ˆβ–ˆβ–‰ | 217/747 [13:18<27:45, 3.14s/it]
29%|β–ˆβ–ˆβ–‰ | 218/747 [13:22<29:51, 3.39s/it]
29%|β–ˆβ–ˆβ–‰ | 219/747 [13:25<29:47, 3.39s/it]
29%|β–ˆβ–ˆβ–‰ | 220/747 [13:29<30:09, 3.43s/it]
{'loss': 0.2889, 'learning_rate': 4.004117680668422e-05, 'epoch': 0.29}
30%|β–ˆβ–ˆβ–‰ | 221/747 [13:33<32:02, 3.66s/it]
30%|β–ˆβ–ˆβ–‰ | 222/747 [13:37<32:55, 3.76s/it]
30%|β–ˆβ–ˆβ–‰ | 223/747 [13:40<30:43, 3.52s/it]
30%|β–ˆβ–ˆβ–‰ | 224/747 [13:44<31:34, 3.62s/it]
30%|β–ˆβ–ˆβ–ˆ | 225/747 [13:47<30:05, 3.46s/it]
30%|β–ˆβ–ˆβ–ˆ | 226/747 [13:51<31:34, 3.64s/it]
30%|β–ˆβ–ˆβ–ˆ | 227/747 [13:55<31:12, 3.60s/it]
31%|β–ˆβ–ˆβ–ˆ | 228/747 [13:58<30:46, 3.56s/it]
31%|β–ˆβ–ˆβ–ˆ | 229/747 [14:02<32:10, 3.73s/it]
31%|β–ˆβ–ˆβ–ˆ | 230/747 [14:05<30:22, 3.53s/it]
{'loss': 0.3207, 'learning_rate': 3.918830357596434e-05, 'epoch': 0.31}
31%|β–ˆβ–ˆβ–ˆ | 231/747 [14:09<29:49, 3.47s/it]
31%|β–ˆβ–ˆβ–ˆ | 232/747 [14:11<27:52, 3.25s/it]
31%|β–ˆβ–ˆβ–ˆ | 233/747 [14:15<28:27, 3.32s/it]
31%|β–ˆβ–ˆβ–ˆβ– | 234/747 [14:19<31:13, 3.65s/it]
31%|β–ˆβ–ˆβ–ˆβ– | 235/747 [14:23<30:54, 3.62s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 236/747 [14:26<30:58, 3.64s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 237/747 [14:30<30:51, 3.63s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 238/747 [14:33<29:34, 3.49s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 239/747 [14:37<30:50, 3.64s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 240/747 [14:41<31:15, 3.70s/it]
{'loss': 0.3238, 'learning_rate': 3.8310338939059644e-05, 'epoch': 0.32}
32%|β–ˆβ–ˆβ–ˆβ– | 241/747 [14:44<29:48, 3.54s/it]
32%|β–ˆβ–ˆβ–ˆβ– | 242/747 [14:48<30:35, 3.64s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 243/747 [14:52<30:21, 3.61s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 244/747 [14:55<28:37, 3.41s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 245/747 [14:59<31:20, 3.75s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 246/747 [15:02<29:54, 3.58s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 247/747 [15:07<32:33, 3.91s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 248/747 [15:11<31:53, 3.83s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 249/747 [15:14<31:42, 3.82s/it]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 250/747 [15:19<32:55, 3.97s/it]
{'loss': 0.3146, 'learning_rate': 3.740883553876515e-05, 'epoch': 0.33}
34%|β–ˆβ–ˆβ–ˆβ–Ž | 251/747 [15:21<29:56, 3.62s/it]
34%|β–ˆβ–ˆβ–ˆβ–Ž | 252/747 [15:25<29:02, 3.52s/it]
34%|β–ˆβ–ˆβ–ˆβ– | 253/747 [15:29<30:12, 3.67s/it]
34%|β–ˆβ–ˆβ–ˆβ– | 254/747 [15:33<31:18, 3.81s/it]
34%|β–ˆβ–ˆβ–ˆβ– | 255/747 [15:36<29:19, 3.58s/it]
34%|β–ˆβ–ˆβ–ˆβ– | 256/747 [15:40<29:38, 3.62s/it]
34%|β–ˆβ–ˆβ–ˆβ– | 257/747 [15:43<28:42, 3.52s/it]
35%|β–ˆβ–ˆβ–ˆβ– | 258/747 [15:46<28:25, 3.49s/it]
35%|β–ˆβ–ˆβ–ˆβ– | 259/747 [15:51<30:08, 3.71s/it]
35%|β–ˆβ–ˆβ–ˆβ– | 260/747 [15:53<27:24, 3.38s/it]
{'loss': 0.2973, 'learning_rate': 3.6485387645169064e-05, 'epoch': 0.35}
35%|β–ˆβ–ˆβ–ˆβ– | 261/747 [15:56<26:44, 3.30s/it]
35%|β–ˆβ–ˆβ–ˆβ–Œ | 262/747 [16:01<30:54, 3.82s/it]
35%|β–ˆβ–ˆβ–ˆβ–Œ | 263/747 [16:06<31:42, 3.93s/it]
35%|β–ˆβ–ˆβ–ˆβ–Œ | 264/747 [16:10<31:44, 3.94s/it]
35%|β–ˆβ–ˆβ–ˆβ–Œ | 265/747 [16:13<30:03, 3.74s/it]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 266/747 [16:16<29:29, 3.68s/it]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 267/747 [16:19<27:48, 3.48s/it]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 268/747 [16:22<26:56, 3.38s/it]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 269/747 [16:28<31:05, 3.90s/it]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 270/747 [16:31<30:10, 3.80s/it]
{'loss': 0.2631, 'learning_rate': 3.55416283362546e-05, 'epoch': 0.36}
36%|β–ˆβ–ˆβ–ˆβ–‹ | 271/747 [16:34<28:19, 3.57s/it]
36%|β–ˆβ–ˆβ–ˆβ–‹ | 272/747 [16:38<29:35, 3.74s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 273/747 [16:43<30:40, 3.88s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 274/747 [16:47<31:42, 4.02s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 275/747 [16:50<30:05, 3.82s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 276/747 [16:53<27:42, 3.53s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 277/747 [16:57<28:02, 3.58s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 278/747 [17:01<29:24, 3.76s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 279/747 [17:06<32:38, 4.19s/it]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 280/747 [17:10<31:54, 4.10s/it]
{'loss': 0.3323, 'learning_rate': 3.457922660987155e-05, 'epoch': 0.37}
38%|β–ˆβ–ˆβ–ˆβ–Š | 281/747 [17:14<30:23, 3.91s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 282/747 [17:16<28:03, 3.62s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 283/747 [17:19<26:24, 3.41s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 284/747 [17:23<27:14, 3.53s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 285/747 [17:28<30:37, 3.98s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 286/747 [17:31<28:22, 3.69s/it]
38%|β–ˆβ–ˆβ–ˆβ–Š | 287/747 [17:35<28:54, 3.77s/it]
39%|β–ˆβ–ˆβ–ˆβ–Š | 288/747 [17:39<29:30, 3.86s/it]
39%|β–ˆβ–ˆβ–ˆβ–Š | 289/747 [17:44<31:03, 4.07s/it]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 290/747 [17:47<29:35, 3.89s/it]
{'loss': 0.2631, 'learning_rate': 3.3599884432185225e-05, 'epoch': 0.39}
39%|β–ˆβ–ˆβ–ˆβ–‰ | 291/747 [17:50<27:37, 3.64s/it]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 292/747 [17:54<27:43, 3.66s/it]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 293/747 [17:58<27:38, 3.65s/it]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 294/747 [18:02<28:30, 3.78s/it]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 295/747 [18:05<28:08, 3.74s/it]
40%|β–ˆβ–ˆβ–ˆβ–‰ | 296/747 [18:08<25:59, 3.46s/it]
40%|β–ˆβ–ˆβ–ˆβ–‰ | 297/747 [18:11<25:10, 3.36s/it]
40%|β–ˆβ–ˆβ–ˆβ–‰ | 298/747 [18:15<25:36, 3.42s/it]
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 299/747 [18:18<25:07, 3.37s/it]
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 300/747 [18:22<26:27, 3.55s/it]
{'loss': 0.3066, 'learning_rate': 3.260533372782234e-05, 'epoch': 0.4}
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 301/747 [18:26<26:38, 3.58s/it]
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 302/747 [18:29<25:58, 3.50s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 303/747 [18:33<26:09, 3.53s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 304/747 [18:37<27:24, 3.71s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 305/747 [18:41<27:28, 3.73s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 306/747 [18:44<25:59, 3.54s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 307/747 [18:47<25:11, 3.44s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 308/747 [18:50<23:23, 3.20s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 309/747 [18:53<24:38, 3.38s/it]
41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 310/747 [18:58<27:17, 3.75s/it]
{'loss': 0.3131, 'learning_rate': 3.1597333317036545e-05, 'epoch': 0.41}
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 311/747 [19:02<26:54, 3.70s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 312/747 [19:06<27:53, 3.85s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 313/747 [19:09<27:12, 3.76s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 314/747 [19:14<28:19, 3.92s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 315/747 [19:18<28:29, 3.96s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 316/747 [19:22<28:38, 3.99s/it]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 317/747 [19:25<28:05, 3.92s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 318/747 [19:29<27:32, 3.85s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 319/747 [19:33<27:06, 3.80s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 320/747 [19:36<25:10, 3.54s/it]
{'loss': 0.3101, 'learning_rate': 3.057766580531031e-05, 'epoch': 0.43}
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 321/747 [19:40<26:27, 3.73s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 322/747 [19:44<26:45, 3.78s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 323/747 [19:48<26:43, 3.78s/it]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 324/747 [19:51<26:41, 3.79s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 325/747 [19:56<28:04, 3.99s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 326/747 [19:59<26:53, 3.83s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 327/747 [20:03<26:20, 3.76s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 328/747 [20:07<26:03, 3.73s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 329/747 [20:10<25:26, 3.65s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 330/747 [20:14<25:58, 3.74s/it]
{'loss': 0.3022, 'learning_rate': 2.9548134430893604e-05, 'epoch': 0.44}
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 331/747 [20:16<23:05, 3.33s/it]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 332/747 [20:20<23:44, 3.43s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 333/747 [20:24<25:23, 3.68s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 334/747 [20:28<26:10, 3.80s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 335/747 [20:32<24:53, 3.63s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 336/747 [20:35<24:15, 3.54s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 337/747 [20:39<24:43, 3.62s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 338/747 [20:42<23:51, 3.50s/it]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 339/747 [20:46<24:14, 3.56s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 340/747 [20:49<24:44, 3.65s/it]
{'loss': 0.2496, 'learning_rate': 2.8510559875854377e-05, 'epoch': 0.46}
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 341/747 [20:53<23:45, 3.51s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 342/747 [20:57<25:42, 3.81s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 343/747 [21:01<25:21, 3.77s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 344/747 [21:04<23:59, 3.57s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 345/747 [21:08<24:33, 3.67s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 346/747 [21:12<24:37, 3.68s/it]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 347/747 [21:15<23:11, 3.48s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 348/747 [21:18<22:02, 3.32s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 349/747 [21:21<21:53, 3.30s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 350/747 [21:25<22:54, 3.46s/it]
{'loss': 0.3218, 'learning_rate': 2.7466777046280457e-05, 'epoch': 0.47}
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 351/747 [21:28<22:22, 3.39s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 352/747 [21:32<23:56, 3.64s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 353/747 [21:37<26:17, 4.00s/it]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 354/747 [21:40<23:24, 3.57s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 355/747 [21:43<22:47, 3.49s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 356/747 [21:47<23:38, 3.63s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 357/747 [21:50<22:17, 3.43s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 358/747 [21:53<21:12, 3.27s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 359/747 [21:56<20:35, 3.18s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 360/747 [21:59<21:31, 3.34s/it]
{'loss': 0.3064, 'learning_rate': 2.6418631827326857e-05, 'epoch': 0.48}
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 361/747 [22:04<23:59, 3.73s/it]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 362/747 [22:07<22:55, 3.57s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 363/747 [22:10<21:20, 3.34s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 364/747 [22:14<22:11, 3.48s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 365/747 [22:19<25:17, 3.97s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 366/747 [22:22<23:48, 3.75s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 367/747 [22:26<23:32, 3.72s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 368/747 [22:29<23:10, 3.67s/it]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 369/747 [22:34<24:19, 3.86s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 370/747 [22:38<25:14, 4.02s/it]
{'loss': 0.2937, 'learning_rate': 2.5367977818847034e-05, 'epoch': 0.5}
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 371/747 [22:42<25:24, 4.06s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 372/747 [22:46<24:14, 3.88s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 373/747 [22:49<22:58, 3.69s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 374/747 [22:54<24:47, 3.99s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 375/747 [22:58<25:13, 4.07s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 376/747 [23:03<27:05, 4.38s/it]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 377/747 [23:07<27:04, 4.39s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 378/747 [23:11<24:52, 4.05s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 379/747 [23:14<24:01, 3.92s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 380/747 [23:19<26:24, 4.32s/it]
{'loss': 0.2708, 'learning_rate': 2.431667305738112e-05, 'epoch': 0.51}
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 381/747 [23:24<25:58, 4.26s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 382/747 [23:27<25:18, 4.16s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 383/747 [23:32<26:20, 4.34s/it]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 384/747 [23:36<24:32, 4.06s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 385/747 [23:40<24:17, 4.03s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 386/747 [23:43<23:01, 3.83s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 387/747 [23:46<21:20, 3.56s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 388/747 [23:49<19:52, 3.32s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 389/747 [23:52<20:09, 3.38s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 390/747 [23:56<20:52, 3.51s/it]
{'loss': 0.2838, 'learning_rate': 2.3266576730297956e-05, 'epoch': 0.52}
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 391/747 [24:00<21:32, 3.63s/it]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 392/747 [24:03<21:05, 3.57s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 393/747 [24:08<22:32, 3.82s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 394/747 [24:13<24:35, 4.18s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 395/747 [24:17<24:04, 4.10s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 396/747 [24:20<22:05, 3.78s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 397/747 [24:23<21:01, 3.61s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 398/747 [24:27<21:52, 3.76s/it]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/747 [24:31<22:11, 3.83s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/747 [24:34<20:53, 3.61s/it]
{'loss': 0.2987, 'learning_rate': 2.221954588790206e-05, 'epoch': 0.54}
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/747 [24:38<20:59, 3.64s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 402/747 [24:42<21:26, 3.73s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 403/747 [24:46<22:52, 3.99s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 404/747 [24:50<22:13, 3.89s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 405/747 [24:54<22:05, 3.88s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 406/747 [24:57<20:58, 3.69s/it]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 407/747 [25:00<20:13, 3.57s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 408/747 [25:04<19:41, 3.49s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 409/747 [25:08<21:09, 3.75s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 410/747 [25:11<20:36, 3.67s/it]
{'loss': 0.3069, 'learning_rate': 2.1177432159319754e-05, 'epoch': 0.55}
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 411/747 [25:14<19:19, 3.45s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 412/747 [25:17<17:44, 3.18s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 413/747 [25:21<19:29, 3.50s/it]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 414/747 [25:25<20:06, 3.62s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 415/747 [25:29<19:38, 3.55s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 416/747 [25:32<19:05, 3.46s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 417/747 [25:35<19:06, 3.48s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 418/747 [25:39<19:46, 3.61s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 419/747 [25:43<20:13, 3.70s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 420/747 [25:47<21:08, 3.88s/it]
{'loss': 0.2709, 'learning_rate': 2.014207847797256e-05, 'epoch': 0.56}
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 421/747 [25:51<20:23, 3.75s/it]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 422/747 [25:55<21:10, 3.91s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 423/747 [25:59<21:08, 3.91s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 424/747 [26:03<20:37, 3.83s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 425/747 [26:06<19:13, 3.58s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 426/747 [26:10<20:02, 3.75s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 427/747 [26:13<19:42, 3.69s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 428/747 [26:17<18:55, 3.56s/it]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 429/747 [26:22<20:54, 3.94s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 430/747 [26:26<21:09, 4.00s/it]
{'loss': 0.3159, 'learning_rate': 1.9115315822428437e-05, 'epoch': 0.58}
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 431/747 [26:28<19:09, 3.64s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 432/747 [26:32<18:38, 3.55s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 433/747 [26:36<19:33, 3.74s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 434/747 [26:40<19:43, 3.78s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 435/747 [26:44<19:32, 3.76s/it]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 436/747 [26:47<19:08, 3.69s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 437/747 [26:51<18:55, 3.66s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 438/747 [26:54<17:42, 3.44s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 439/747 [26:58<18:24, 3.59s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 440/747 [27:01<18:23, 3.60s/it]
{'loss': 0.2997, 'learning_rate': 1.809895997839482e-05, 'epoch': 0.59}
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 441/747 [27:04<17:57, 3.52s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 442/747 [27:09<19:10, 3.77s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 443/747 [27:12<17:48, 3.51s/it]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 444/747 [27:16<18:22, 3.64s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 445/747 [27:20<19:12, 3.81s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 446/747 [27:24<19:19, 3.85s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 447/747 [27:27<18:54, 3.78s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 448/747 [27:31<18:21, 3.68s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 449/747 [27:33<16:29, 3.32s/it]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 450/747 [27:37<17:05, 3.45s/it]
{'loss': 0.2347, 'learning_rate': 1.70948083275794e-05, 'epoch': 0.6}
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 451/747 [27:40<16:04, 3.26s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 452/747 [27:44<16:34, 3.37s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 453/747 [27:48<17:38, 3.60s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 454/747 [27:52<18:35, 3.81s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 455/747 [27:56<18:44, 3.85s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 456/747 [27:59<17:37, 3.63s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 457/747 [28:02<16:38, 3.44s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 458/747 [28:06<17:10, 3.56s/it]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 459/747 [28:08<15:28, 3.23s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 460/747 [28:12<15:59, 3.34s/it]
{'loss': 0.2634, 'learning_rate': 1.6104636669097776e-05, 'epoch': 0.62}
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 461/747 [28:16<17:06, 3.59s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 462/747 [28:20<17:17, 3.64s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 463/747 [28:24<17:45, 3.75s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 464/747 [28:28<17:36, 3.73s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 465/747 [28:31<17:28, 3.72s/it]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 466/747 [28:34<16:12, 3.46s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 467/747 [28:38<16:39, 3.57s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 468/747 [28:42<17:16, 3.72s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 469/747 [28:45<16:16, 3.51s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 470/747 [28:49<16:39, 3.61s/it]
{'loss': 0.2438, 'learning_rate': 1.513019607904882e-05, 'epoch': 0.63}
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 471/747 [28:54<18:01, 3.92s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 472/747 [28:57<17:11, 3.75s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 473/747 [29:02<18:23, 4.03s/it]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 474/747 [29:05<17:39, 3.88s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 475/747 [29:08<16:34, 3.66s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 476/747 [29:12<17:06, 3.79s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 477/747 [29:16<16:40, 3.71s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 478/747 [29:19<15:45, 3.52s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 479/747 [29:23<16:19, 3.65s/it]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 480/747 [29:26<16:00, 3.60s/it]
{'loss': 0.2289, 'learning_rate': 1.4173209813811788e-05, 'epoch': 0.64}
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 481/747 [29:31<17:03, 3.85s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 482/747 [29:35<17:08, 3.88s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 483/747 [29:39<17:46, 4.04s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 484/747 [29:42<16:22, 3.74s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 485/747 [29:46<16:31, 3.79s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 486/747 [29:50<16:11, 3.72s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 487/747 [29:54<16:30, 3.81s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 488/747 [29:58<17:24, 4.03s/it]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 489/747 [30:01<15:37, 3.63s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 490/747 [30:05<16:41, 3.90s/it]
{'loss': 0.2609, 'learning_rate': 1.3235370262541272e-05, 'epoch': 0.66}
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 491/747 [30:10<16:57, 3.97s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 492/747 [30:13<16:43, 3.94s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 493/747 [30:18<17:20, 4.09s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 494/747 [30:21<16:19, 3.87s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 495/747 [30:24<15:20, 3.65s/it]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 496/747 [30:27<14:29, 3.46s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 497/747 [30:31<14:20, 3.44s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 498/747 [30:36<16:10, 3.90s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 499/747 [30:40<16:43, 4.05s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 500/747 [30:43<15:43, 3.82s/it]
{'loss': 0.2724, 'learning_rate': 1.2318335954249669e-05, 'epoch': 0.67}
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 501/747 [30:47<15:14, 3.72s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 502/747 [30:51<15:25, 3.78s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 503/747 [30:54<15:08, 3.72s/it]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 504/747 [30:59<15:54, 3.93s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 505/747 [31:02<14:40, 3.64s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 506/747 [31:05<14:33, 3.62s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 507/747 [31:09<13:58, 3.49s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 508/747 [31:12<13:25, 3.37s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 509/747 [31:15<13:47, 3.48s/it]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 510/747 [31:18<13:04, 3.31s/it]
{'loss': 0.272, 'learning_rate': 1.1423728624769695e-05, 'epoch': 0.68}
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 511/747 [31:23<14:16, 3.63s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 512/747 [31:27<15:13, 3.89s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 513/747 [31:31<14:49, 3.80s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 514/747 [31:34<14:31, 3.74s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 515/747 [31:37<13:33, 3.51s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 516/747 [31:40<12:37, 3.28s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 517/747 [31:44<13:11, 3.44s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 518/747 [31:47<13:08, 3.45s/it]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 519/747 [31:53<15:06, 3.98s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 520/747 [31:57<15:12, 4.02s/it]
{'loss': 0.303, 'learning_rate': 1.0553130348784182e-05, 'epoch': 0.7}
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 521/747 [32:01<15:50, 4.20s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 522/747 [32:05<15:40, 4.18s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 523/747 [32:08<13:53, 3.72s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 524/747 [32:12<13:45, 3.70s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 525/747 [32:15<13:20, 3.60s/it]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 526/747 [32:19<13:54, 3.77s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 527/747 [32:23<13:45, 3.75s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 528/747 [32:27<14:14, 3.90s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 529/747 [32:32<14:50, 4.09s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 530/747 [32:35<14:14, 3.94s/it]
{'loss': 0.2766, 'learning_rate': 9.708080741994868e-06, 'epoch': 0.71}
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 531/747 [32:39<14:07, 3.92s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 532/747 [32:44<14:33, 4.06s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 533/747 [32:47<13:52, 3.89s/it]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 534/747 [32:51<13:35, 3.83s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 535/747 [32:55<14:13, 4.02s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 536/747 [32:59<14:05, 4.01s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 537/747 [33:04<14:41, 4.20s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 538/747 [33:07<13:24, 3.85s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 539/747 [33:09<11:57, 3.45s/it]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 540/747 [33:13<12:15, 3.55s/it]
{'loss': 0.2333, 'learning_rate': 8.890074238378074e-06, 'epoch': 0.72}
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 541/747 [33:18<13:04, 3.81s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 542/747 [33:21<12:28, 3.65s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 543/747 [33:25<12:43, 3.74s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 544/747 [33:29<12:42, 3.76s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 545/747 [33:32<12:10, 3.61s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 546/747 [33:36<12:35, 3.76s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 547/747 [33:41<13:18, 3.99s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 548/747 [33:44<12:55, 3.90s/it]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 549/747 [33:48<12:34, 3.81s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 550/747 [33:51<11:58, 3.65s/it]
{'loss': 0.2326, 'learning_rate': 8.100557447342327e-06, 'epoch': 0.74}
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 551/747 [33:54<11:24, 3.49s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 552/747 [33:58<11:03, 3.40s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 553/747 [34:01<11:06, 3.44s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 554/747 [34:05<11:28, 3.56s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 555/747 [34:08<11:03, 3.45s/it]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 556/747 [34:12<11:45, 3.70s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 557/747 [34:16<12:07, 3.83s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 558/747 [34:20<12:05, 3.84s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 559/747 [34:23<11:20, 3.62s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 560/747 [34:28<12:01, 3.86s/it]
{'loss': 0.2274, 'learning_rate': 7.340926595461687e-06, 'epoch': 0.75}
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 561/747 [34:31<10:50, 3.50s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 562/747 [34:34<11:00, 3.57s/it]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 563/747 [34:39<12:11, 3.98s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 564/747 [34:42<11:03, 3.63s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 565/747 [34:46<11:07, 3.67s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 566/747 [34:50<11:08, 3.69s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 567/747 [34:53<10:40, 3.56s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 568/747 [34:57<10:53, 3.65s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 569/747 [35:01<11:30, 3.88s/it]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 570/747 [35:05<11:30, 3.90s/it]
{'loss': 0.3255, 'learning_rate': 6.612525057308949e-06, 'epoch': 0.76}
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 571/747 [35:08<10:33, 3.60s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 572/747 [35:12<10:33, 3.62s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 573/747 [35:14<09:37, 3.32s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 574/747 [35:18<10:11, 3.53s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 575/747 [35:22<10:42, 3.73s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 576/747 [35:27<10:57, 3.85s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 577/747 [35:30<10:23, 3.67s/it]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 578/747 [35:33<10:17, 3.65s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 579/747 [35:37<10:18, 3.68s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 580/747 [35:41<10:07, 3.64s/it]
{'loss': 0.2631, 'learning_rate': 5.9166409797553415e-06, 'epoch': 0.78}
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 581/747 [35:44<09:34, 3.46s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 582/747 [35:48<10:04, 3.66s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 583/747 [35:52<10:08, 3.71s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 584/747 [35:55<09:59, 3.68s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 585/747 [35:59<09:50, 3.65s/it]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 586/747 [36:03<09:53, 3.69s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 587/747 [36:06<09:25, 3.53s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 588/747 [36:10<09:38, 3.64s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 589/747 [36:13<09:41, 3.68s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 590/747 [36:17<09:35, 3.66s/it]
{'loss': 0.2938, 'learning_rate': 5.254505003938043e-06, 'epoch': 0.79}
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 591/747 [36:20<08:35, 3.31s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 592/747 [36:23<08:41, 3.36s/it]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 593/747 [36:26<08:41, 3.38s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 594/747 [36:30<08:25, 3.30s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 595/747 [36:33<08:27, 3.34s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 596/747 [36:37<08:51, 3.52s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 597/747 [36:41<08:51, 3.55s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 598/747 [36:45<09:22, 3.78s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 599/747 [36:49<09:52, 4.01s/it]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 600/747 [36:53<09:16, 3.79s/it]
{'loss': 0.2807, 'learning_rate': 4.627288088924156e-06, 'epoch': 0.8}
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 601/747 [36:57<09:51, 4.05s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 602/747 [37:02<09:54, 4.10s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 603/747 [37:06<09:43, 4.05s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 604/747 [37:10<10:14, 4.30s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 605/747 [37:14<09:38, 4.07s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 606/747 [37:19<10:18, 4.39s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 607/747 [37:23<09:34, 4.11s/it]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 608/747 [37:26<09:03, 3.91s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 609/747 [37:29<08:36, 3.74s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 610/747 [37:34<09:09, 4.01s/it]
{'loss': 0.2909, 'learning_rate': 4.036099440919763e-06, 'epoch': 0.82}
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 611/747 [37:38<09:06, 4.02s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 612/747 [37:42<08:48, 3.92s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 613/747 [37:45<08:14, 3.69s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 614/747 [37:50<08:54, 4.02s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 615/747 [37:53<08:34, 3.90s/it]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 616/747 [37:57<08:19, 3.81s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 617/747 [38:01<08:29, 3.92s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 618/747 [38:05<08:18, 3.86s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 619/747 [38:08<07:49, 3.66s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 620/747 [38:11<07:25, 3.51s/it]
{'loss': 0.2997, 'learning_rate': 3.481984551686429e-06, 'epoch': 0.83}
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 621/747 [38:15<07:33, 3.60s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 622/747 [38:18<07:20, 3.53s/it]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 623/747 [38:22<07:25, 3.59s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 624/747 [38:25<07:13, 3.53s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 625/747 [38:30<07:46, 3.82s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 626/747 [38:34<07:41, 3.82s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 627/747 [38:38<07:38, 3.82s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 628/747 [38:41<07:38, 3.86s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 629/747 [38:45<07:34, 3.85s/it]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 630/747 [38:48<07:01, 3.60s/it]
{'loss': 0.2424, 'learning_rate': 2.9659233496337786e-06, 'epoch': 0.84}
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 631/747 [38:53<07:33, 3.91s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 632/747 [38:56<06:55, 3.62s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 633/747 [38:59<06:31, 3.44s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 634/747 [39:02<06:30, 3.45s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 635/747 [39:06<06:24, 3.43s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 636/747 [39:11<07:07, 3.85s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 637/747 [39:15<07:30, 4.10s/it]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 638/747 [39:19<07:29, 4.12s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 639/747 [39:23<07:16, 4.05s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 640/747 [39:27<07:10, 4.03s/it]
{'loss': 0.2862, 'learning_rate': 2.4888284668582285e-06, 'epoch': 0.86}
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 641/747 [39:31<06:52, 3.89s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 642/747 [39:35<06:43, 3.84s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 643/747 [39:39<06:58, 4.03s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 644/747 [39:42<06:32, 3.81s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 645/747 [39:45<06:02, 3.56s/it]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 646/747 [39:49<06:01, 3.58s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 647/747 [39:54<06:29, 3.89s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 648/747 [39:57<06:16, 3.80s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 649/747 [40:01<06:18, 3.87s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 650/747 [40:04<05:47, 3.58s/it]
{'loss': 0.3124, 'learning_rate': 2.051543625192226e-06, 'epoch': 0.87}
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 651/747 [40:08<05:40, 3.55s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 652/747 [40:11<05:32, 3.50s/it]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 653/747 [40:15<05:41, 3.63s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 654/747 [40:19<05:45, 3.71s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 655/747 [40:23<05:53, 3.84s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 656/747 [40:26<05:39, 3.73s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 657/747 [40:30<05:35, 3.73s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 658/747 [40:35<05:59, 4.04s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 659/747 [40:39<05:57, 4.06s/it]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 660/747 [40:43<05:52, 4.05s/it]
{'loss': 0.2644, 'learning_rate': 1.6548421441183875e-06, 'epoch': 0.88}
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 661/747 [40:46<05:19, 3.72s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 662/747 [40:50<05:34, 3.94s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 663/747 [40:53<05:06, 3.64s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 664/747 [40:57<04:52, 3.53s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 665/747 [41:01<04:57, 3.63s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 666/747 [41:03<04:36, 3.41s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 667/747 [41:08<04:57, 3.72s/it]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 668/747 [41:12<04:56, 3.76s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 669/747 [41:15<04:43, 3.64s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 670/747 [41:18<04:31, 3.52s/it]
{'loss': 0.2622, 'learning_rate': 1.2994255731871963e-06, 'epoch': 0.9}
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 671/747 [41:21<04:15, 3.36s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 672/747 [41:24<04:02, 3.23s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 673/747 [41:28<04:02, 3.27s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 674/747 [41:31<04:06, 3.38s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 675/747 [41:35<04:15, 3.55s/it]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 676/747 [41:39<04:20, 3.67s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 677/747 [41:43<04:15, 3.65s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 678/747 [41:46<03:57, 3.44s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 679/747 [41:49<03:57, 3.49s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 680/747 [41:52<03:46, 3.38s/it]
{'loss': 0.2234, 'learning_rate': 9.85922451356694e-07, 'epoch': 0.91}
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 681/747 [41:56<03:50, 3.50s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 682/747 [42:01<04:13, 3.90s/it]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 683/747 [42:04<03:55, 3.67s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 684/747 [42:08<03:56, 3.75s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 685/747 [42:11<03:32, 3.43s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 686/747 [42:15<03:51, 3.79s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 687/747 [42:20<03:55, 3.92s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 688/747 [42:23<03:43, 3.78s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 689/747 [42:27<03:33, 3.67s/it]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 690/747 [42:29<03:15, 3.43s/it]
{'loss': 0.334, 'learning_rate': 7.148871954483105e-07, 'epoch': 0.92}
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 691/747 [42:33<03:17, 3.53s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 692/747 [42:37<03:21, 3.67s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 693/747 [42:41<03:21, 3.73s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 694/747 [42:45<03:23, 3.83s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 695/747 [42:49<03:19, 3.85s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 696/747 [42:52<03:05, 3.64s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 697/747 [42:56<03:09, 3.79s/it]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 698/747 [43:00<03:02, 3.72s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 699/747 [43:04<02:58, 3.72s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 700/747 [43:07<02:50, 3.64s/it]
{'loss': 0.2915, 'learning_rate': 4.867991196844918e-07, 'epoch': 0.94}
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 701/747 [43:11<02:45, 3.60s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 702/747 [43:15<02:47, 3.72s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 703/747 [43:18<02:39, 3.63s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 704/747 [43:22<02:35, 3.61s/it]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 705/747 [43:24<02:22, 3.39s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 706/747 [43:27<02:11, 3.21s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 707/747 [43:30<02:08, 3.21s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 708/747 [43:34<02:15, 3.47s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 709/747 [43:38<02:14, 3.53s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 710/747 [43:42<02:11, 3.56s/it]
{'loss': 0.2802, 'learning_rate': 3.020615880420713e-07, 'epoch': 0.95}
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 711/747 [43:46<02:17, 3.83s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 712/747 [43:49<02:02, 3.49s/it]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 713/747 [43:53<02:04, 3.65s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 714/747 [43:56<01:53, 3.43s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 715/747 [43:59<01:44, 3.25s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 716/747 [44:02<01:44, 3.36s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 717/747 [44:06<01:42, 3.43s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 718/747 [44:10<01:44, 3.60s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 719/747 [44:14<01:43, 3.69s/it]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 720/747 [44:17<01:33, 3.47s/it]
{'loss': 0.2975, 'learning_rate': 1.6100130092037703e-07, 'epoch': 0.96}
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 721/747 [44:20<01:27, 3.37s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 722/747 [44:23<01:23, 3.34s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 723/747 [44:26<01:18, 3.27s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 724/747 [44:30<01:19, 3.46s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 725/747 [44:34<01:20, 3.66s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 726/747 [44:39<01:23, 3.99s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 727/747 [44:42<01:14, 3.72s/it]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 728/747 [44:46<01:09, 3.65s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 729/747 [44:49<01:05, 3.62s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 730/747 [44:53<01:01, 3.59s/it]
{'loss': 0.2531, 'learning_rate': 6.386771738558506e-08, 'epoch': 0.98}
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 731/747 [44:56<00:58, 3.63s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 732/747 [45:01<00:57, 3.86s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 733/747 [45:05<00:56, 4.06s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 734/747 [45:09<00:50, 3.87s/it]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 735/747 [45:12<00:43, 3.62s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 736/747 [45:16<00:40, 3.65s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 737/747 [45:20<00:38, 3.83s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 738/747 [45:24<00:34, 3.80s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 739/747 [45:26<00:27, 3.47s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 740/747 [45:30<00:25, 3.68s/it]
{'loss': 0.2696, 'learning_rate': 1.0832614013073228e-08, 'epoch': 0.99}
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 741/747 [45:34<00:21, 3.62s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 742/747 [45:37<00:17, 3.46s/it]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 743/747 [45:41<00:14, 3.51s/it]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 744/747 [45:45<00:11, 3.70s/it]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 745/747 [45:50<00:08, 4.17s/it]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 746/747 [45:53<00:03, 3.86s/it]
100%|██████████| 747/747 [45:56<00:00, 3.69s/it]
[INFO|trainer.py:3166] 2024-01-18 19:16:11,445 >> ***** Running Evaluation *****
[INFO|trainer.py:3168] 2024-01-18 19:16:11,445 >> Num examples = 664
[INFO|trainer.py:3171] 2024-01-18 19:16:11,445 >> Batch size = 1
0%| | 0/332 [00:00<?, ?it/s]
1%| | 2/332 [00:00<00:42, 7.82it/s]
1%| | 3/332 [00:00<00:58, 5.62it/s]
1%| | 4/332 [00:00<01:07, 4.85it/s]
2%|▏ | 5/332 [00:01<01:29, 3.67it/s]
2%|▏ | 6/332 [00:01<01:46, 3.05it/s]
2%|▏ | 7/332 [00:01<01:41, 3.19it/s]
2%|▏ | 8/332 [00:02<01:54, 2.83it/s]
3%|β–Ž | 9/332 [00:02<01:38, 3.27it/s]
3%|β–Ž | 10/332 [00:02<01:51, 2.88it/s]
3%|β–Ž | 11/332 [00:03<01:56, 2.76it/s]
4%|β–Ž | 12/332 [00:03<02:05, 2.55it/s]
4%|▍ | 13/332 [00:04<02:09, 2.47it/s]
4%|▍ | 14/332 [00:04<01:54, 2.78it/s]
5%|▍ | 15/332 [00:04<01:54, 2.77it/s]
5%|▍ | 16/332 [00:05<01:43, 3.06it/s]
5%|β–Œ | 17/332 [00:05<01:32, 3.42it/s]
5%|β–Œ | 18/332 [00:05<01:27, 3.58it/s]
6%|β–Œ | 19/332 [00:05<01:20, 3.88it/s]
6%|β–Œ | 20/332 [00:06<01:23, 3.76it/s]
6%|β–‹ | 21/332 [00:06<01:29, 3.46it/s]
7%|β–‹ | 22/332 [00:06<01:28, 3.49it/s]
7%|β–‹ | 23/332 [00:06<01:24, 3.64it/s]
7%|β–‹ | 24/332 [00:07<01:19, 3.90it/s]
8%|β–Š | 25/332 [00:07<01:17, 3.94it/s]
8%|β–Š | 26/332 [00:07<01:16, 3.99it/s]
8%|β–Š | 27/332 [00:07<01:12, 4.22it/s]
8%|β–Š | 28/332 [00:08<01:26, 3.51it/s]
9%|β–Š | 29/332 [00:08<01:22, 3.65it/s]
9%|β–‰ | 30/332 [00:08<01:15, 4.00it/s]
9%|β–‰ | 31/332 [00:08<01:11, 4.18it/s]
10%|β–‰ | 32/332 [00:09<01:12, 4.15it/s]
10%|β–‰ | 33/332 [00:09<01:16, 3.90it/s]
10%|β–ˆ | 34/332 [00:09<01:29, 3.32it/s]
11%|β–ˆ | 35/332 [00:10<01:39, 2.97it/s]
11%|β–ˆ | 36/332 [00:10<01:31, 3.23it/s]
11%|β–ˆ | 37/332 [00:10<01:35, 3.08it/s]
11%|β–ˆβ– | 38/332 [00:11<01:24, 3.46it/s]
12%|β–ˆβ– | 39/332 [00:11<01:39, 2.94it/s]
12%|β–ˆβ– | 40/332 [00:11<01:39, 2.95it/s]
12%|β–ˆβ– | 41/332 [00:12<01:38, 2.96it/s]
13%|β–ˆβ–Ž | 42/332 [00:12<01:33, 3.10it/s]
13%|β–ˆβ–Ž | 43/332 [00:12<01:32, 3.11it/s]
13%|β–ˆβ–Ž | 44/332 [00:12<01:19, 3.61it/s]
14%|β–ˆβ–Ž | 45/332 [00:13<01:36, 2.97it/s]
14%|β–ˆβ– | 46/332 [00:13<01:33, 3.07it/s]
14%|β–ˆβ– | 47/332 [00:14<01:30, 3.16it/s]
14%|β–ˆβ– | 48/332 [00:14<01:21, 3.48it/s]
15%|β–ˆβ– | 49/332 [00:14<01:18, 3.61it/s]
15%|β–ˆβ–Œ | 50/332 [00:14<01:15, 3.74it/s]
15%|β–ˆβ–Œ | 51/332 [00:15<01:13, 3.84it/s]
16%|β–ˆβ–Œ | 52/332 [00:15<01:04, 4.34it/s]
16%|β–ˆβ–Œ | 53/332 [00:15<01:01, 4.52it/s]
16%|β–ˆβ–‹ | 54/332 [00:15<01:00, 4.56it/s]
17%|β–ˆβ–‹ | 55/332 [00:15<01:10, 3.95it/s]
17%|β–ˆβ–‹ | 56/332 [00:16<01:12, 3.78it/s]
17%|β–ˆβ–‹ | 57/332 [00:16<01:08, 4.04it/s]
17%|β–ˆβ–‹ | 58/332 [00:16<01:00, 4.52it/s]
18%|β–ˆβ–Š | 59/332 [00:16<01:02, 4.36it/s]
18%|β–ˆβ–Š | 60/332 [00:17<01:06, 4.08it/s]
18%|β–ˆβ–Š | 61/332 [00:17<01:06, 4.06it/s]
19%|β–ˆβ–Š | 62/332 [00:17<01:09, 3.87it/s]
19%|β–ˆβ–‰ | 63/332 [00:18<01:24, 3.20it/s]
19%|β–ˆβ–‰ | 64/332 [00:18<01:34, 2.84it/s]
20%|β–ˆβ–‰ | 65/332 [00:18<01:33, 2.87it/s]
20%|β–ˆβ–‰ | 66/332 [00:19<01:39, 2.66it/s]
20%|β–ˆβ–ˆ | 67/332 [00:19<01:30, 2.94it/s]
20%|β–ˆβ–ˆ | 68/332 [00:19<01:21, 3.23it/s]
21%|β–ˆβ–ˆ | 69/332 [00:20<01:31, 2.87it/s]
21%|β–ˆβ–ˆ | 70/332 [00:20<01:23, 3.14it/s]
21%|β–ˆβ–ˆβ– | 71/332 [00:20<01:20, 3.23it/s]
22%|β–ˆβ–ˆβ– | 72/332 [00:21<01:15, 3.45it/s]
22%|β–ˆβ–ˆβ– | 73/332 [00:21<01:11, 3.60it/s]
22%|β–ˆβ–ˆβ– | 74/332 [00:21<01:24, 3.07it/s]
23%|β–ˆβ–ˆβ–Ž | 75/332 [00:22<01:32, 2.77it/s]
23%|β–ˆβ–ˆβ–Ž | 76/332 [00:22<01:23, 3.07it/s]
23%|β–ˆβ–ˆβ–Ž | 77/332 [00:22<01:16, 3.32it/s]
23%|β–ˆβ–ˆβ–Ž | 78/332 [00:22<01:12, 3.48it/s]
24%|β–ˆβ–ˆβ– | 79/332 [00:23<01:09, 3.62it/s]
24%|β–ˆβ–ˆβ– | 80/332 [00:23<01:22, 3.06it/s]
24%|β–ˆβ–ˆβ– | 81/332 [00:24<01:30, 2.78it/s]
25%|β–ˆβ–ˆβ– | 82/332 [00:24<01:21, 3.06it/s]
25%|β–ˆβ–ˆβ–Œ | 83/332 [00:24<01:30, 2.76it/s]
25%|β–ˆβ–ˆβ–Œ | 84/332 [00:25<01:25, 2.90it/s]
26%|β–ˆβ–ˆβ–Œ | 85/332 [00:25<01:26, 2.85it/s]
26%|β–ˆβ–ˆβ–Œ | 86/332 [00:25<01:18, 3.13it/s]
26%|β–ˆβ–ˆβ–Œ | 87/332 [00:25<01:12, 3.37it/s]
27%|β–ˆβ–ˆβ–‹ | 88/332 [00:26<01:12, 3.38it/s]
27%|β–ˆβ–ˆβ–‹ | 89/332 [00:26<01:08, 3.56it/s]
27%|β–ˆβ–ˆβ–‹ | 90/332 [00:26<01:03, 3.81it/s]
27%|β–ˆβ–ˆβ–‹ | 91/332 [00:26<00:59, 4.07it/s]
28%|β–ˆβ–ˆβ–Š | 92/332 [00:27<00:59, 4.06it/s]
28%|β–ˆβ–ˆβ–Š | 93/332 [00:27<01:07, 3.52it/s]
28%|β–ˆβ–ˆβ–Š | 94/332 [00:27<01:05, 3.64it/s]
29%|β–ˆβ–ˆβ–Š | 95/332 [00:28<01:12, 3.27it/s]
29%|β–ˆβ–ˆβ–‰ | 96/332 [00:28<01:20, 2.92it/s]
29%|β–ˆβ–ˆβ–‰ | 97/332 [00:28<01:11, 3.27it/s]
30%|β–ˆβ–ˆβ–‰ | 98/332 [00:28<01:05, 3.57it/s]
30%|β–ˆβ–ˆβ–‰ | 99/332 [00:29<01:00, 3.86it/s]
30%|β–ˆβ–ˆβ–ˆ | 100/332 [00:29<01:13, 3.17it/s]
30%|β–ˆβ–ˆβ–ˆ | 101/332 [00:29<01:14, 3.09it/s]
31%|β–ˆβ–ˆβ–ˆ | 102/332 [00:30<01:22, 2.79it/s]
31%|β–ˆβ–ˆβ–ˆ | 103/332 [00:30<01:27, 2.61it/s]
31%|β–ˆβ–ˆβ–ˆβ– | 104/332 [00:31<01:15, 3.01it/s]
32%|β–ˆβ–ˆβ–ˆβ– | 105/332 [00:31<01:03, 3.56it/s]
32%|β–ˆβ–ˆβ–ˆβ– | 106/332 [00:31<01:11, 3.16it/s]
32%|β–ˆβ–ˆβ–ˆβ– | 107/332 [00:31<01:06, 3.37it/s]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 108/332 [00:32<01:09, 3.21it/s]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 109/332 [00:32<01:15, 2.95it/s]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 110/332 [00:33<01:22, 2.70it/s]
33%|β–ˆβ–ˆβ–ˆβ–Ž | 111/332 [00:33<01:19, 2.76it/s]
34%|β–ˆβ–ˆβ–ˆβ–Ž | 112/332 [00:33<01:14, 2.94it/s]
34%|β–ˆβ–ˆβ–ˆβ– | 113/332 [00:34<01:22, 2.64it/s]
34%|β–ˆβ–ˆβ–ˆβ– | 114/332 [00:34<01:17, 2.83it/s]
35%|β–ˆβ–ˆβ–ˆβ– | 115/332 [00:34<01:09, 3.10it/s]
35%|β–ˆβ–ˆβ–ˆβ– | 116/332 [00:34<01:04, 3.33it/s]
35%|β–ˆβ–ˆβ–ˆβ–Œ | 117/332 [00:35<00:55, 3.86it/s]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 118/332 [00:35<00:57, 3.73it/s]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 119/332 [00:35<00:59, 3.61it/s]
36%|β–ˆβ–ˆβ–ˆβ–Œ | 120/332 [00:36<01:08, 3.11it/s]
36%|β–ˆβ–ˆβ–ˆβ–‹ | 121/332 [00:36<01:05, 3.21it/s]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 122/332 [00:36<01:04, 3.28it/s]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 123/332 [00:37<01:02, 3.33it/s]
37%|β–ˆβ–ˆβ–ˆβ–‹ | 124/332 [00:37<00:57, 3.62it/s]
38%|β–ˆβ–ˆβ–ˆβ–Š | 125/332 [00:37<00:52, 3.96it/s]
38%|β–ˆβ–ˆβ–ˆβ–Š | 126/332 [00:37<01:03, 3.23it/s]
38%|β–ˆβ–ˆβ–ˆβ–Š | 127/332 [00:38<01:12, 2.84it/s]
39%|β–ˆβ–ˆβ–ˆβ–Š | 128/332 [00:38<01:17, 2.64it/s]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 129/332 [00:39<01:14, 2.73it/s]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 130/332 [00:39<01:03, 3.16it/s]
39%|β–ˆβ–ˆβ–ˆβ–‰ | 131/332 [00:39<00:59, 3.38it/s]
40%|β–ˆβ–ˆβ–ˆβ–‰ | 132/332 [00:39<00:54, 3.66it/s]
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 133/332 [00:40<01:06, 3.02it/s]
40%|β–ˆβ–ˆβ–ˆβ–ˆ | 134/332 [00:40<01:00, 3.28it/s]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 135/332 [00:40<00:56, 3.49it/s]
41%|β–ˆβ–ˆβ–ˆβ–ˆ | 136/332 [00:41<01:06, 2.96it/s]
41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 137/332 [00:41<01:06, 2.94it/s]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 138/332 [00:41<01:00, 3.19it/s]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 139/332 [00:42<00:59, 3.26it/s]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 140/332 [00:42<01:06, 2.88it/s]
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 141/332 [00:42<01:12, 2.62it/s]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 142/332 [00:43<01:06, 2.84it/s]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 143/332 [00:43<01:02, 3.00it/s]
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 144/332 [00:43<00:57, 3.27it/s]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 145/332 [00:44<01:04, 2.89it/s]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 146/332 [00:44<01:09, 2.67it/s]
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 147/332 [00:44<01:00, 3.05it/s]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 148/332 [00:45<01:01, 3.01it/s]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 149/332 [00:45<00:54, 3.37it/s]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 150/332 [00:45<00:59, 3.06it/s]
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 151/332 [00:46<00:57, 3.17it/s]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 152/332 [00:46<00:57, 3.10it/s]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 153/332 [00:46<01:04, 2.78it/s]
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 154/332 [00:47<01:08, 2.61it/s]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 155/332 [00:47<01:00, 2.91it/s]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 156/332 [00:47<00:53, 3.26it/s]
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 157/332 [00:48<00:52, 3.31it/s]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 158/332 [00:48<00:51, 3.37it/s]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 159/332 [00:48<00:49, 3.53it/s]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 160/332 [00:49<00:55, 3.09it/s]
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 161/332 [00:49<00:58, 2.91it/s]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 162/332 [00:49<01:03, 2.68it/s]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 163/332 [00:50<00:56, 2.98it/s]
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 164/332 [00:50<00:47, 3.54it/s]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 165/332 [00:50<00:55, 3.03it/s]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 166/332 [00:50<00:50, 3.28it/s]
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 167/332 [00:51<00:46, 3.58it/s]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 168/332 [00:51<00:53, 3.04it/s]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 169/332 [00:52<00:59, 2.75it/s]
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 170/332 [00:52<00:53, 3.04it/s]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 171/332 [00:52<00:47, 3.42it/s]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 172/332 [00:52<00:42, 3.74it/s]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 173/332 [00:53<00:41, 3.83it/s]
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 174/332 [00:53<00:44, 3.54it/s]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 175/332 [00:53<00:46, 3.37it/s]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 176/332 [00:53<00:43, 3.57it/s]
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 177/332 [00:54<00:49, 3.11it/s]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 178/332 [00:54<00:46, 3.32it/s]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 179/332 [00:54<00:45, 3.38it/s]
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 180/332 [00:55<00:42, 3.56it/s]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 181/332 [00:55<00:42, 3.53it/s]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 182/332 [00:55<00:36, 4.08it/s]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 183/332 [00:55<00:34, 4.30it/s]
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 184/332 [00:56<00:37, 3.99it/s]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 185/332 [00:56<00:40, 3.61it/s]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 186/332 [00:56<00:48, 2.99it/s]
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 187/332 [00:57<00:46, 3.11it/s]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 188/332 [00:57<00:41, 3.50it/s]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 189/332 [00:57<00:43, 3.28it/s]
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 190/332 [00:57<00:39, 3.58it/s]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 191/332 [00:58<00:45, 3.09it/s]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 192/332 [00:58<00:46, 3.02it/s]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 193/332 [00:59<00:44, 3.12it/s]
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 194/332 [00:59<00:43, 3.20it/s]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 195/332 [00:59<00:39, 3.43it/s]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 196/332 [00:59<00:37, 3.61it/s]
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 197/332 [00:59<00:34, 3.86it/s]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 198/332 [01:00<00:42, 3.18it/s]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 199/332 [01:00<00:40, 3.26it/s]
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 200/332 [01:00<00:36, 3.57it/s]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 201/332 [01:01<00:39, 3.28it/s]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 202/332 [01:01<00:44, 2.90it/s]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 203/332 [01:01<00:39, 3.29it/s]
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 204/332 [01:02<00:42, 3.00it/s]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 205/332 [01:02<00:40, 3.13it/s]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 206/332 [01:02<00:38, 3.25it/s]
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 207/332 [01:03<00:37, 3.30it/s]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 208/332 [01:03<00:35, 3.50it/s]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 209/332 [01:03<00:33, 3.64it/s]
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 210/332 [01:03<00:32, 3.76it/s]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 211/332 [01:04<00:37, 3.23it/s]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 212/332 [01:04<00:41, 2.86it/s]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 213/332 [01:05<00:36, 3.22it/s]
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 214/332 [01:05<00:37, 3.11it/s]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 215/332 [01:05<00:36, 3.19it/s]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 216/332 [01:05<00:30, 3.76it/s]
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 217/332 [01:06<00:29, 3.84it/s]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 218/332 [01:06<00:32, 3.51it/s]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 219/332 [01:06<00:31, 3.63it/s]
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 220/332 [01:06<00:28, 3.89it/s]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 221/332 [01:07<00:28, 3.94it/s]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 222/332 [01:07<00:33, 3.33it/s]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 223/332 [01:07<00:31, 3.48it/s]
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 224/332 [01:08<00:36, 2.99it/s]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 225/332 [01:08<00:34, 3.14it/s]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 226/332 [01:08<00:32, 3.22it/s]
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 227/332 [01:09<00:31, 3.29it/s]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 228/332 [01:09<00:27, 3.83it/s]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 229/332 [01:09<00:32, 3.18it/s]
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 230/332 [01:10<00:33, 3.09it/s]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 231/332 [01:10<00:29, 3.46it/s]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 232/332 [01:10<00:26, 3.78it/s]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 233/332 [01:10<00:24, 3.99it/s]
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 234/332 [01:11<00:30, 3.18it/s]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 235/332 [01:11<00:29, 3.28it/s]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 236/332 [01:11<00:26, 3.63it/s]
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 237/332 [01:11<00:24, 3.88it/s]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 238/332 [01:12<00:26, 3.55it/s]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 239/332 [01:12<00:26, 3.50it/s]
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 240/332 [01:12<00:27, 3.31it/s]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 241/332 [01:13<00:24, 3.69it/s]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 242/332 [01:13<00:29, 3.09it/s]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 243/332 [01:13<00:24, 3.64it/s]
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 244/332 [01:13<00:24, 3.60it/s]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 245/332 [01:14<00:24, 3.55it/s]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 246/332 [01:14<00:28, 3.04it/s]
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 247/332 [01:14<00:25, 3.27it/s]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 248/332 [01:15<00:29, 2.89it/s]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 249/332 [01:15<00:26, 3.17it/s]
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 250/332 [01:15<00:23, 3.55it/s]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 251/332 [01:16<00:23, 3.52it/s]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 252/332 [01:16<00:22, 3.52it/s]
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 253/332 [01:16<00:24, 3.25it/s]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 254/332 [01:17<00:27, 2.82it/s]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 255/332 [01:17<00:28, 2.66it/s]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 256/332 [01:17<00:24, 3.05it/s]
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 257/332 [01:18<00:27, 2.75it/s]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 258/332 [01:18<00:25, 2.94it/s]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 259/332 [01:18<00:23, 3.08it/s]
78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 260/332 [01:19<00:21, 3.34it/s]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 261/332 [01:19<00:22, 3.19it/s]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 262/332 [01:19<00:20, 3.38it/s]
79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 263/332 [01:20<00:21, 3.25it/s]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 264/332 [01:20<00:21, 3.15it/s]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 265/332 [01:20<00:18, 3.55it/s]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 266/332 [01:21<00:21, 3.02it/s]
80%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 267/332 [01:21<00:23, 2.78it/s]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 268/332 [01:21<00:21, 2.97it/s]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 269/332 [01:21<00:19, 3.22it/s]
81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 270/332 [01:22<00:18, 3.43it/s]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 271/332 [01:22<00:20, 3.00it/s]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 272/332 [01:22<00:17, 3.33it/s]
82%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 273/332 [01:23<00:19, 3.04it/s]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 274/332 [01:23<00:18, 3.15it/s]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 275/332 [01:23<00:16, 3.50it/s]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 276/332 [01:24<00:15, 3.61it/s]
83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 277/332 [01:24<00:16, 3.38it/s]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 278/332 [01:24<00:15, 3.55it/s]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 279/332 [01:24<00:13, 3.86it/s]
84%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 280/332 [01:25<00:13, 3.93it/s]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 281/332 [01:25<00:15, 3.23it/s]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 282/332 [01:25<00:16, 3.12it/s]
85%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 283/332 [01:26<00:15, 3.19it/s]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 284/332 [01:26<00:15, 3.11it/s]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 285/332 [01:26<00:14, 3.33it/s]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 286/332 [01:26<00:12, 3.62it/s]
86%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 287/332 [01:27<00:14, 3.11it/s]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 288/332 [01:27<00:12, 3.51it/s]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 289/332 [01:27<00:11, 3.63it/s]
87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 290/332 [01:28<00:10, 3.91it/s]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 291/332 [01:28<00:10, 3.74it/s]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 292/332 [01:28<00:12, 3.14it/s]
88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 293/332 [01:29<00:11, 3.38it/s]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 294/332 [01:29<00:12, 3.06it/s]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 295/332 [01:29<00:11, 3.16it/s]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 296/332 [01:30<00:11, 3.08it/s]
89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 297/332 [01:30<00:10, 3.42it/s]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 298/332 [01:30<00:09, 3.56it/s]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 299/332 [01:30<00:10, 3.05it/s]
90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 300/332 [01:31<00:09, 3.27it/s]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 301/332 [01:31<00:09, 3.44it/s]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 302/332 [01:31<00:08, 3.61it/s]
91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 303/332 [01:31<00:08, 3.57it/s]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 304/332 [01:32<00:07, 3.88it/s]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 305/332 [01:32<00:07, 3.43it/s]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 306/332 [01:32<00:07, 3.26it/s]
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 307/332 [01:33<00:08, 2.88it/s]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 308/332 [01:33<00:09, 2.65it/s]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 309/332 [01:34<00:09, 2.51it/s]
93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 310/332 [01:34<00:08, 2.74it/s]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 311/332 [01:34<00:06, 3.03it/s]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 312/332 [01:35<00:06, 3.26it/s]
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 313/332 [01:35<00:06, 2.89it/s]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 314/332 [01:35<00:05, 3.01it/s]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 315/332 [01:36<00:05, 3.23it/s]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 316/332 [01:36<00:04, 3.57it/s]
95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 317/332 [01:36<00:04, 3.69it/s]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 318/332 [01:36<00:03, 3.76it/s]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 319/332 [01:37<00:03, 3.83it/s]
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 320/332 [01:37<00:02, 4.04it/s]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 321/332 [01:37<00:02, 4.23it/s]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 322/332 [01:37<00:02, 4.19it/s]
97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 323/332 [01:37<00:02, 4.32it/s]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 324/332 [01:38<00:02, 3.74it/s]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 325/332 [01:38<00:01, 3.68it/s]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 326/332 [01:38<00:01, 3.78it/s]
98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 327/332 [01:39<00:01, 3.70it/s]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 328/332 [01:39<00:01, 3.43it/s]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 329/332 [01:39<00:01, 2.97it/s]
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 330/332 [01:40<00:00, 2.97it/s]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 331/332 [01:40<00:00, 3.54it/s]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 332/332 [01:40<00:00, 3.03it/s]
{'eval_loss': 0.27866294980049133, 'eval_runtime': 101.2195, 'eval_samples_per_second': 6.56, 'eval_steps_per_second': 3.28, 'epoch': 1.0}
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 747/747 [47:38<00:00, 3.69s/it]
[INFO|trainer.py:1947] 2024-01-18 19:17:52,666 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 2859.8545, 'train_samples_per_second': 2.089, 'train_steps_per_second': 0.261, 'train_loss': 0.3299662241814446, 'epoch': 1.0}
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 747/747 [47:38<00:00, 3.83s/it]
[INFO|trainer.py:2889] 2024-01-18 19:17:52,669 >> Saving model checkpoint to ./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora
[INFO|tokenization_utils_base.py:2432] 2024-01-18 19:17:52,742 >> tokenizer config file saved in ./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/tokenizer_config.json
[INFO|tokenization_utils_base.py:2441] 2024-01-18 19:17:52,742 >> Special tokens file saved in ./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/special_tokens_map.json
***** train metrics *****
epoch = 1.0
train_loss = 0.33
train_runtime = 0:47:39.85
train_samples_per_second = 2.089
train_steps_per_second = 0.261
Figure saved: ./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/training_loss.png
Figure saved: ./models/sft/LMCocktail-10.7B-v1-sft-glaive-function-calling-v2-ep1-lora/training_eval_loss.png
[INFO|trainer.py:3166] 2024-01-18 19:17:55,818 >> ***** Running Evaluation *****
[INFO|trainer.py:3168] 2024-01-18 19:17:55,818 >> Num examples = 664
[INFO|trainer.py:3171] 2024-01-18 19:17:55,818 >> Batch size = 1
100%|██████████| 332/332 [01:40<00:00, 3.29it/s]
***** eval metrics *****
epoch = 1.0
eval_loss = 0.2787
eval_runtime = 0:01:41.25
eval_samples_per_second = 6.558
eval_steps_per_second = 3.279
[INFO|modelcard.py:452] 2024-01-18 19:19:37,075 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}