---
library_name: transformers
license: mit
base_model: intfloat/multilingual-e5-small
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - accuracy
model-index:
  - name: owm-math-scorer-multilingual-e5-small
    results: []
---

# owm-math-scorer-multilingual-e5-small

This model is a fine-tuned version of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.4478
- Precision: 0.8771
- Recall: 0.8769
- F1 Macro: 0.8770
- Accuracy: 0.8770

## Model description

More information needed

## Intended uses & limitations

More information needed
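
Since the card does not document a usage recipe, the snippet below is only a minimal sketch based on the `transformers` sequence-classification API. The repository id placeholder, the absence of an E5-style `query: ` prefix, and the label semantics are all assumptions to verify against the model's `config.json`.

```python
# Hedged sketch: assumes this checkpoint loads as a sequence-classification model
# and that plain text (without the E5 "query: " prefix) is the expected input.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "<this-repo-id>"  # replace with the Hub path of this model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "The derivative of x^2 is 2x, which follows from the power rule."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# Interpret the scores according to the model's id2label mapping.
probs = torch.softmax(logits, dim=-1)
print(probs)
```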

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 0.001
- train_batch_size: 256
- eval_batch_size: 128
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 1024
- total_eval_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
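
The training script itself is not part of this card, so the following is only a sketch of how the listed hyperparameters might map onto `transformers.TrainingArguments`. Anything not in the list above (output path, evaluation/save cadence, mixed-precision and launcher setup) is an assumption.

```python
# Hedged sketch: reconstructs the listed hyperparameters as TrainingArguments.
# The totals (1024 train / 512 eval) follow from 4 GPUs x 256 and 4 x 128
# per-device batch sizes; unlisted arguments below are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="owm-math-scorer-multilingual-e5-small",  # assumed output path
    learning_rate=1e-3,
    per_device_train_batch_size=256,
    per_device_eval_batch_size=128,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    seed=0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",  # evaluation every 250 steps matches the results table
    eval_steps=250,
    logging_steps=250,
)
```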

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:---------:|:------:|:--------:|:--------:|
| No log        | 0       | 0     | 8.2909          | 0.2546    | 0.5    | 0.3374   | 0.5091   |
| 0.5633        | 0.2844  | 250   | 0.5661          | 0.8608    | 0.8596 | 0.8588   | 0.8589   |
| 0.5443        | 0.5688  | 500   | 0.5192          | 0.8655    | 0.8652 | 0.8653   | 0.8654   |
| 0.5395        | 0.8532  | 750   | 0.5461          | 0.8651    | 0.8636 | 0.8628   | 0.8629   |
| 0.5144        | 1.1377  | 1000  | 0.4992          | 0.8691    | 0.8692 | 0.8691   | 0.8691   |
| 0.5278        | 1.4221  | 1250  | 0.5322          | 0.8675    | 0.8613 | 0.8616   | 0.8624   |
| 0.501         | 1.7065  | 1500  | 0.4942          | 0.8708    | 0.8690 | 0.8692   | 0.8695   |
| 0.4942        | 1.9909  | 1750  | 0.4934          | 0.8697    | 0.8696 | 0.8693   | 0.8693   |
| 0.492         | 2.2753  | 2000  | 0.4873          | 0.8710    | 0.8711 | 0.8711   | 0.8711   |
| 0.4984        | 2.5597  | 2250  | 0.5061          | 0.8701    | 0.8694 | 0.8688   | 0.8688   |
| 0.4809        | 2.8441  | 2500  | 0.4995          | 0.8719    | 0.8673 | 0.8677   | 0.8682   |
| 0.4744        | 3.1286  | 2750  | 0.4783          | 0.8721    | 0.8722 | 0.8721   | 0.8721   |
| 0.4817        | 3.4130  | 3000  | 0.4715          | 0.8737    | 0.8738 | 0.8738   | 0.8738   |
| 0.4748        | 3.6974  | 3250  | 0.4734          | 0.8743    | 0.8725 | 0.8728   | 0.8731   |
| 0.4725        | 3.9818  | 3500  | 0.4703          | 0.8738    | 0.8736 | 0.8737   | 0.8738   |
| 0.4684        | 4.2662  | 3750  | 0.4693          | 0.8739    | 0.8734 | 0.8735   | 0.8737   |
| 0.4796        | 4.5506  | 4000  | 0.4697          | 0.8746    | 0.8727 | 0.8729   | 0.8732   |
| 0.4666        | 4.8350  | 4250  | 0.4715          | 0.8737    | 0.8738 | 0.8735   | 0.8735   |
| 0.4697        | 5.1195  | 4500  | 0.4853          | 0.8736    | 0.8692 | 0.8695   | 0.8700   |
| 0.466         | 5.4039  | 4750  | 0.4782          | 0.8734    | 0.8713 | 0.8716   | 0.8719   |
| 0.4663        | 5.6883  | 5000  | 0.4653          | 0.8746    | 0.8747 | 0.8746   | 0.8746   |
| 0.4677        | 5.9727  | 5250  | 0.4656          | 0.8749    | 0.8734 | 0.8737   | 0.8739   |
| 0.4615        | 6.2571  | 5500  | 0.4631          | 0.8753    | 0.8739 | 0.8741   | 0.8743   |
| 0.4689        | 6.5415  | 5750  | 0.4610          | 0.8759    | 0.8754 | 0.8756   | 0.8757   |
| 0.4643        | 6.8259  | 6000  | 0.4601          | 0.8753    | 0.8747 | 0.8749   | 0.8750   |
| 0.4591        | 7.1104  | 6250  | 0.4598          | 0.8748    | 0.8745 | 0.8746   | 0.8747   |
| 0.4628        | 7.3948  | 6500  | 0.4592          | 0.8759    | 0.8749 | 0.8751   | 0.8753   |
| 0.4589        | 7.6792  | 6750  | 0.4613          | 0.8759    | 0.8744 | 0.8747   | 0.8749   |
| 0.4626        | 7.9636  | 7000  | 0.4566          | 0.8758    | 0.8753 | 0.8754   | 0.8756   |
| 0.4632        | 8.2480  | 7250  | 0.4623          | 0.8746    | 0.8727 | 0.8730   | 0.8732   |
| 0.4545        | 8.5324  | 7500  | 0.4554          | 0.8766    | 0.8759 | 0.8760   | 0.8762   |
| 0.4596        | 8.8168  | 7750  | 0.4581          | 0.8755    | 0.8755 | 0.8755   | 0.8755   |
| 0.4571        | 9.1013  | 8000  | 0.4595          | 0.8759    | 0.8737 | 0.8740   | 0.8743   |
| 0.4585        | 9.3857  | 8250  | 0.4561          | 0.8760    | 0.8750 | 0.8752   | 0.8754   |
| 0.4541        | 9.6701  | 8500  | 0.4548          | 0.8756    | 0.8750 | 0.8751   | 0.8752   |
| 0.4576        | 9.9545  | 8750  | 0.4541          | 0.8757    | 0.8754 | 0.8755   | 0.8756   |
| 0.449         | 10.2389 | 9000  | 0.4554          | 0.8754    | 0.8752 | 0.8752   | 0.8753   |
| 0.4507        | 10.5233 | 9250  | 0.4535          | 0.8763    | 0.8763 | 0.8763   | 0.8763   |
| 0.4545        | 10.8077 | 9500  | 0.4543          | 0.8759    | 0.8758 | 0.8758   | 0.8759   |
| 0.4462        | 11.0922 | 9750  | 0.4529          | 0.8764    | 0.8756 | 0.8758   | 0.8759   |
| 0.4505        | 11.3766 | 10000 | 0.4538          | 0.8762    | 0.8751 | 0.8753   | 0.8755   |
| 0.4576        | 11.6610 | 10250 | 0.4714          | 0.8751    | 0.8714 | 0.8717   | 0.8722   |
| 0.4509        | 11.9454 | 10500 | 0.4613          | 0.8759    | 0.8760 | 0.8758   | 0.8758   |
| 0.4557        | 12.2298 | 10750 | 0.4538          | 0.8764    | 0.8753 | 0.8755   | 0.8757   |
| 0.4539        | 12.5142 | 11000 | 0.4523          | 0.8765    | 0.8758 | 0.8760   | 0.8761   |
| 0.4534        | 12.7986 | 11250 | 0.4515          | 0.8766    | 0.8767 | 0.8766   | 0.8767   |
| 0.4532        | 13.0830 | 11500 | 0.4509          | 0.8768    | 0.8763 | 0.8765   | 0.8766   |
| 0.4501        | 13.3675 | 11750 | 0.4517          | 0.8765    | 0.8762 | 0.8763   | 0.8763   |
| 0.4493        | 13.6519 | 12000 | 0.4527          | 0.8767    | 0.8768 | 0.8768   | 0.8768   |
| 0.4528        | 13.9363 | 12250 | 0.4499          | 0.8766    | 0.8765 | 0.8765   | 0.8766   |
| 0.4491        | 14.2207 | 12500 | 0.4519          | 0.8766    | 0.8755 | 0.8757   | 0.8759   |
| 0.4495        | 14.5051 | 12750 | 0.4594          | 0.8768    | 0.8769 | 0.8767   | 0.8767   |
| 0.4443        | 14.7895 | 13000 | 0.4519          | 0.8766    | 0.8764 | 0.8765   | 0.8766   |
| 0.4476        | 15.0739 | 13250 | 0.4509          | 0.8769    | 0.8766 | 0.8767   | 0.8768   |
| 0.4466        | 15.3584 | 13500 | 0.4494          | 0.8773    | 0.8769 | 0.8770   | 0.8771   |
| 0.4456        | 15.6428 | 13750 | 0.4489          | 0.8768    | 0.8765 | 0.8766   | 0.8767   |
| 0.4447        | 15.9272 | 14000 | 0.4552          | 0.8765    | 0.8751 | 0.8754   | 0.8756   |
| 0.4471        | 16.2116 | 14250 | 0.4520          | 0.8763    | 0.8763 | 0.8763   | 0.8763   |
| 0.4521        | 16.4960 | 14500 | 0.4509          | 0.8770    | 0.8756 | 0.8758   | 0.8760   |
| 0.4419        | 16.7804 | 14750 | 0.4533          | 0.8767    | 0.8768 | 0.8767   | 0.8768   |
| 0.4485        | 17.0648 | 15000 | 0.4483          | 0.8770    | 0.8768 | 0.8769   | 0.8769   |
| 0.4424        | 17.3493 | 15250 | 0.4490          | 0.8770    | 0.8769 | 0.8769   | 0.8770   |
| 0.4441        | 17.6337 | 15500 | 0.4502          | 0.8770    | 0.8769 | 0.8770   | 0.8770   |
| 0.4487        | 17.9181 | 15750 | 0.4480          | 0.8769    | 0.8763 | 0.8765   | 0.8766   |
| 0.4487        | 18.2025 | 16000 | 0.4500          | 0.8771    | 0.8772 | 0.8772   | 0.8772   |
| 0.4375        | 18.4869 | 16250 | 0.4483          | 0.8769    | 0.8766 | 0.8767   | 0.8768   |
| 0.4491        | 18.7713 | 16500 | 0.4515          | 0.8768    | 0.8769 | 0.8768   | 0.8768   |
| 0.4433        | 19.0557 | 16750 | 0.4477          | 0.8773    | 0.8769 | 0.8770   | 0.8771   |
| 0.4432        | 19.3402 | 17000 | 0.4480          | 0.8771    | 0.8769 | 0.8770   | 0.8771   |
| 0.442         | 19.6246 | 17250 | 0.4480          | 0.8770    | 0.8768 | 0.8769   | 0.8770   |
| 0.4407        | 19.9090 | 17500 | 0.4478          | 0.8771    | 0.8769 | 0.8770   | 0.8770   |

### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1