base_model: mistralai/Mistral-7B-v0.3
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: mistral-7b-drug-prots
    results: []

mistral-7b-drug-prots

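This model is a fine-tuned version of mistralai/Mistral-7B-v0.3 on an unknown dataset. It achieves the following results on the evaluation set at the final training step (step 5300). The lowest validation loss in the training log, 1.2161, was recorded earlier, at step 2700.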

  • Loss: 1.5457
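For readers who prefer perplexity, the reported loss can be converted with a minimal sketch like the one below. It assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


# Final evaluation loss reported in this card.
final_eval_loss = 1.5457

# Assumption: the loss is averaged per token and measured in nats.
print(f"perplexity ~ {perplexity(final_eval_loss):.2f}")
```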

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 30
  • training_steps: 5300
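The derived values in the list above follow from the per-device settings, and the arithmetic can be checked with a short sketch. The learning-rate function is an assumption: the card only states "cosine" with 30 warmup steps, so the code uses the common shape of a linear warmup followed by a half-cosine decay to zero. The constant names and the helper `lr_at` are illustrative, not taken from the training script.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 4
PEAK_LR = 5e-5
WARMUP_STEPS = 30
TRAINING_STEPS = 5300

# Effective batch sizes: per-device batch x devices (x accumulation for training).
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES

# Assumption: every optimizer step consumes one full effective batch.
samples_seen = TRAINING_STEPS * total_train_batch


def lr_at(step: int) -> float:
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch, total_eval_batch, samples_seen)
for step in (0, 30, 2665, 5300):
    print(step, lr_at(step))
```

Under these assumptions the 5300 steps correspond to 339,200 training sequences, which is consistent with the log reaching epoch 1.0 at step 5300.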

Training results

Training Loss   Epoch    Step   Validation Loss
1.7818          0.0094   50     1.6715
1.7216          0.0189   100    1.5833
1.6278          0.0283   150    1.5331
1.5849          0.0377   200    1.4866
1.6059          0.0472   250    1.4766
1.6047          0.0566   300    1.4635
1.5167          0.0660   350    1.4515
1.4995          0.0755   400    1.4386
1.5051          0.0849   450    1.4332
1.4858          0.0943   500    1.4210
1.5011          0.1038   550    1.4051
1.497           0.1132   600    1.4005
1.5202          0.1226   650    1.3932
1.5204          0.1321   700    1.3880
1.508           0.1415   750    1.3826
1.4552          0.1509   800    1.3753
1.4866          0.1604   850    1.3706
1.4661          0.1698   900    1.3694
1.4661          0.1792   950    1.3622
1.3875          0.1887   1000   1.3589
1.4471          0.1981   1050   1.3518
1.429           0.2075   1100   1.3390
1.4181          0.2170   1150   1.3365
1.39            0.2264   1200   1.3376
1.4067          0.2358   1250   1.3354
1.4017          0.2453   1300   1.3382
1.3842          0.2547   1350   1.3257
1.4398          0.2642   1400   1.3160
1.3642          0.2736   1450   1.3222
1.3647          0.2830   1500   1.3217
1.4066          0.2925   1550   1.3102
1.4094          0.3019   1600   1.3109
1.3473          0.3113   1650   1.3075
1.3645          0.3208   1700   1.3085
1.3318          0.3302   1750   1.2962
1.3562          0.3396   1800   1.2929
1.3539          0.3491   1850   1.2837
1.3587          0.3585   1900   1.2828
1.3827          0.3679   1950   1.2776
1.3335          0.3774   2000   1.2757
1.3663          0.3868   2050   1.2732
1.2937          0.3962   2100   1.2625
1.3318          0.4057   2150   1.2593
1.2886          0.4151   2200   1.2524
1.3033          0.4245   2250   1.2527
1.2531          0.4340   2300   1.2428
1.2568          0.4434   2350   1.2508
1.2573          0.4528   2400   1.2437
1.2364          0.4623   2450   1.2299
1.2111          0.4717   2500   1.2307
1.2016          0.4811   2550   1.2277
1.236           0.4906   2600   1.2182
1.1858          0.5      2650   1.2237
1.218           0.5094   2700   1.2161
1.1693          0.5189   2750   1.2247
1.1455          0.5283   2800   1.2277
1.1555          0.5377   2850   1.2305
1.162           0.5472   2900   1.2253
1.0834          0.5566   2950   1.2326
1.0964          0.5660   3000   1.2397
1.038           0.5755   3050   1.2370
1.0338          0.5849   3100   1.2477
1.0359          0.5943   3150   1.2390
0.9861          0.6038   3200   1.2547
1.008           0.6132   3250   1.2666
1.0275          0.6226   3300   1.2495
0.9443          0.6321   3350   1.2691
0.8923          0.6415   3400   1.2893
0.9118          0.6509   3450   1.2943
0.8411          0.6604   3500   1.2870
0.8356          0.6698   3550   1.2971
0.8326          0.6792   3600   1.3030
0.8053          0.6887   3650   1.3147
0.7921          0.6981   3700   1.3235
0.7563          0.7075   3750   1.3290
0.7223          0.7170   3800   1.3460
0.7157          0.7264   3850   1.3525
0.7539          0.7358   3900   1.3396
0.6838          0.7453   3950   1.3617
0.7088          0.7547   4000   1.3477
0.6409          0.7642   4050   1.3850
0.6083          0.7736   4100   1.3883
0.594           0.7830   4150   1.4017
0.5721          0.7925   4200   1.4264
0.5144          0.8019   4250   1.4292
0.494           0.8113   4300   1.4427
0.4591          0.8208   4350   1.4588
0.4711          0.8302   4400   1.4627
0.4668          0.8396   4450   1.4641
0.4409          0.8491   4500   1.4778
0.4487          0.8585   4550   1.4821
0.4816          0.8679   4600   1.4711
0.4293          0.8774   4650   1.5048
0.4126          0.8868   4700   1.5079
0.4284          0.8962   4750   1.5040
0.3911          0.9057   4800   1.5293
0.3883          0.9151   4850   1.5293
0.3862          0.9245   4900   1.5243
0.3937          0.9340   4950   1.5440
0.3836          0.9434   5000   1.5389
0.3827          0.9528   5050   1.5437
0.3698          0.9623   5100   1.5545
0.383           0.9717   5150   1.5394
0.401           0.9811   5200   1.5400
0.4024          0.9906   5250   1.5409
0.4305          1.0      5300   1.5457
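
The log above can be scanned for the evaluation step with the lowest validation loss and for the gap between training and validation loss. The sketch below does this on a subset of rows copied from the table. The names `LOG`, `best_checkpoint`, and `generalization_gap` are illustrative. Whether a checkpoint was actually saved at any given step is not stated in this card.

```python
# (step, training_loss, validation_loss) rows copied from the training log.
# Only a subset is listed here to keep the example short.
LOG = [
    (50, 1.7818, 1.6715),
    (500, 1.4858, 1.4210),
    (1000, 1.3875, 1.3589),
    (1500, 1.3647, 1.3217),
    (2000, 1.3335, 1.2757),
    (2500, 1.2111, 1.2307),
    (2700, 1.2180, 1.2161),
    (3000, 1.0964, 1.2397),
    (3500, 0.8411, 1.2870),
    (4000, 0.7088, 1.3477),
    (4500, 0.4409, 1.4778),
    (5000, 0.3836, 1.5389),
    (5300, 0.4305, 1.5457),
]


def best_checkpoint(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


def generalization_gap(row):
    """Validation loss minus training loss for one logged row."""
    _step, train_loss, val_loss = row
    return val_loss - train_loss


best = best_checkpoint(LOG)
final = LOG[-1]
print(f"lowest validation loss: {best[2]:.4f} at step {best[0]}")
print(f"gap at best step:  {generalization_gap(best):+.4f}")
print(f"gap at final step: {generalization_gap(final):+.4f}")
```

A gap that widens while validation loss rises, as in the later rows of the log, is the usual signature of overfitting, so an earlier evaluation step may be a better stopping point than the final one.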

Framework versions

  • Transformers 4.44.0.dev0
  • PyTorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1