---
base_model: upstage/SOLAR-10.7B-v1.0
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: solar_10.7_darulm_unigram_proj_init_8node_darulm_part1_v3_1.0_512_12_02_24
  results: []
---

# solar_10.7_darulm_unigram_proj_init_8node_darulm_part1_v3_1.0_512_12_02_24

This model is a fine-tuned version of [../solar_darulm_unigram_proj_init_17_01_24](https://huggingface.co/../solar_darulm_unigram_proj_init_17_01_24) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.3397
- Accuracy: 0.5164

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: linear
- num_epochs: 1.0
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.6722        | 0.01  | 500   | 2.4811          | 0.4951   |
| 2.6243        | 0.02  | 1000  | 2.4459          | 0.4999   |
| 2.6051        | 0.04  | 1500  | 2.4295          | 0.5025   |
| 2.5901        | 0.05  | 2000  | 2.4194          | 0.5037   |
| 2.5852        | 0.06  | 2500  | 2.4124          | 0.5049   |
| 2.5818        | 0.07  | 3000  | 2.4072          | 0.5054   |
| 2.5801        | 0.09  | 3500  | 2.4024          | 0.5059   |
| 2.5626        | 0.1   | 4000  | 2.3988          | 0.5070   |
| 2.5697        | 0.11  | 4500  | 2.3958          | 0.5073   |
| 2.5532        | 0.12  | 5000  | 2.3928          | 0.5079   |
| 2.5505        | 0.13  | 5500  | 2.3904          | 0.5080   |
| 2.5497        | 0.15  | 6000  | 2.3872          | 0.5086   |
| 2.5636        | 0.16  | 6500  | 2.3857          | 0.5089   |
| 2.5483        | 0.17  | 7000  | 2.3835          | 0.5092   |
| 2.5505        | 0.18  | 7500  | 2.3813          | 0.5097   |
| 2.5419        | 0.2   | 8000  | 2.3796          | 0.5096   |
| 2.5467        | 0.21  | 8500  | 2.3786          | 0.5099   |
| 2.5419        | 0.22  | 9000  | 2.3769          | 0.5102   |
| 2.5269        | 0.23  | 9500  | 2.3754          | 0.5105   |
| 2.5315        | 0.24  | 10000 | 2.3740          | 0.5106   |
| 2.5442        | 0.26  | 10500 | 2.3728          | 0.5108   |
| 2.5318        | 0.27  | 11000 | 2.3713          | 0.5112   |
| 2.5242        | 0.28  | 11500 | 2.3702          | 0.5113   |
| 2.5178        | 0.29  | 12000 | 2.3698          | 0.5112   |
| 2.5345        | 0.31  | 12500 | 2.3687          | 0.5114   |
| 2.531         | 0.32  | 13000 | 2.3675          | 0.5115   |
| 2.5304        | 0.33  | 13500 | 2.3661          | 0.5118   |
| 2.5264        | 0.34  | 14000 | 2.3653          | 0.5121   |
| 2.5281        | 0.35  | 14500 | 2.3647          | 0.5123   |
| 2.5259        | 0.37  | 15000 | 2.3636          | 0.5123   |
| 2.5075        | 0.38  | 15500 | 2.3629          | 0.5122   |
| 2.5147        | 0.39  | 16000 | 2.3621          | 0.5127   |
| 2.5137        | 0.4   | 16500 | 2.3611          | 0.5128   |
| 2.5206        | 0.42  | 17000 | 2.3603          | 0.5129   |
| 2.5153        | 0.43  | 17500 | 2.3597          | 0.5128   |
| 2.5184        | 0.44  | 18000 | 2.3590          | 0.5130   |
| 2.5104        | 0.45  | 18500 | 2.3581          | 0.5132   |
| 2.5085        | 0.46  | 19000 | 2.3577          | 0.5134   |
| 2.509         | 0.48  | 19500 | 2.3572          | 0.5135   |
| 2.5143        | 0.49  | 20000 | 2.3564          | 0.5135   |
| 2.5124        | 0.5   | 20500 | 2.3555          | 0.5137   |
| 2.5107        | 0.51  | 21000 | 2.3546          | 0.5139   |
| 2.5034        | 0.53  | 21500 | 2.3543          | 0.5140   |
| 2.4922        | 0.54  | 22000 | 2.3538          | 0.5139   |
| 2.514         | 0.55  | 22500 | 2.3532          | 0.5140   |
| 2.5199        | 0.56  | 23000 | 2.3527          | 0.5141   |
| 2.4926        | 0.57  | 23500 | 2.3521          | 0.5142   |
| 2.5104        | 0.59  | 24000 | 2.3517          | 0.5142   |
| 2.5067        | 0.6   | 24500 | 2.3511          | 0.5144   |
| 2.5055        | 0.61  | 25000 | 2.3508          | 0.5142   |
| 2.5011        | 0.62  | 25500 | 2.3502          | 0.5146   |
| 2.4931        | 0.64  | 26000 | 2.3496          | 0.5147   |
| 2.4965        | 0.65  | 26500 | 2.3491          | 0.5147   |
| 2.495         | 0.66  | 27000 | 2.3488          | 0.5146   |
| 2.5051        | 0.67  | 27500 | 2.3481          | 0.5150   |
| 2.51          | 0.68  | 28000 | 2.3478          | 0.5150   |
| 2.4883        | 0.7   | 28500 | 2.3474          | 0.5152   |
| 2.4973        | 0.71  | 29000 | 2.3470          | 0.5151   |
| 2.4939        | 0.72  | 29500 | 2.3464          | 0.5153   |
| 2.4952        | 0.73  | 30000 | 2.3461          | 0.5153   |
| 2.5028        | 0.75  | 30500 | 2.3459          | 0.5154   |
| 2.4979        | 0.76  | 31000 | 2.3454          | 0.5154   |
| 2.4928        | 0.77  | 31500 | 2.3450          | 0.5155   |
| 2.501         | 0.78  | 32000 | 2.3446          | 0.5156   |
| 2.5           | 0.79  | 32500 | 2.3443          | 0.5156   |
| 2.4865        | 0.81  | 33000 | 2.3438          | 0.5156   |
| 2.4898        | 0.82  | 33500 | 2.3434          | 0.5157   |
| 2.4977        | 0.83  | 34000 | 2.3430          | 0.5160   |
| 2.4904        | 0.84  | 34500 | 2.3427          | 0.5157   |
| 2.4779        | 0.86  | 35000 | 2.3424          | 0.5159   |
| 2.4792        | 0.87  | 35500 | 2.3420          | 0.5159   |
| 2.4931        | 0.88  | 36000 | 2.3419          | 0.5160   |
| 2.4997        | 0.89  | 36500 | 2.3416          | 0.5160   |
| 2.4986        | 0.9   | 37000 | 2.3414          | 0.5161   |
| 2.4965        | 0.92  | 37500 | 2.3411          | 0.5162   |
| 2.4743        | 0.93  | 38000 | 2.3409          | 0.5162   |
| 2.497         | 0.94  | 38500 | 2.3406          | 0.5163   |
| 2.4942        | 0.95  | 39000 | 2.3404          | 0.5162   |
| 2.4907        | 0.97  | 39500 | 2.3402          | 0.5163   |
| 2.4821        | 0.98  | 40000 | 2.3400          | 0.5163   |
| 2.4857        | 0.99  | 40500 | 2.3398          | 0.5163   |

### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.2
- Datasets 2.16.1
- Tokenizers 0.15.2
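
### Cross-checking the reported numbers

The hyperparameters and evaluation results on this card can be cross-checked with a little arithmetic. The sketch below is not part of the training code: the function names are hypothetical, and the perplexity figure assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

# Values copied from the hyperparameter list and results on this card.
train_batch_size = 1              # per-device batch size
num_devices = 16
gradient_accumulation_steps = 8
eval_loss = 2.3397                # final evaluation loss
last_logged_step = 40500          # last row of the training results table


def effective_batch_size(per_device: int, devices: int, accumulation_steps: int) -> int:
    """Sequences contributing to a single optimizer step (hypothetical helper)."""
    return per_device * devices * accumulation_steps


def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity, ASSUMING the loss is mean per-token cross-entropy in nats."""
    return math.exp(cross_entropy_loss)


total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
sequences_seen = last_logged_step * total_train_batch_size
eval_perplexity = perplexity(eval_loss)

print(f"total_train_batch_size: {total_train_batch_size}")  # 128, matching the card
print(f"sequences seen by step {last_logged_step}: {sequences_seen}")
print(f"approximate eval perplexity: {eval_perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out a little above 10. Perplexity is measured per token, so it is only comparable between models that share the same tokenizer.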