---
library_name: transformers
language:
- de
base_model:
- FacebookAI/xlm-roberta-large
pipeline_tag: token-classification
---

# Model Card for Model ID

We fine-tuned our base model for 71 epochs on the Ca dataset. Epoch 61 showed the best macro-average F1 score on the evaluation dataset.

## Metrics

| Metric | `eval_*` | `test_*` |
|---|---|---|
| AVGf1 | 0.8073040334161414 | 0.7654950023468019 |
| DIAGNOSIS.f1 | 0.8044417026526834 | 0.7317784256559767 |
| DIAGNOSIS.precision | 0.7774244833068362 | 0.7442550037064493 |
| DIAGNOSIS.recall | 0.8334043459735833 | 0.7197132616487455 |
| DIAGNOSTIC.f1 | 0.8154647655607348 | 0.7815242494226328 |
| DIAGNOSTIC.precision | 0.7876059322033898 | 0.7779310344827586 |
| DIAGNOSTIC.recall | 0.8453666856168277 | 0.7851508120649652 |
| DRUG.f1 | 0.9283865401207938 | 0.9199594731509625 |
| DRUG.precision | 0.911864406779661 | 0.9013898080741231 |
| DRUG.recall | 0.945518453427065 | 0.9393103448275862 |
| MEDICAL_FINDING.f1 | 0.7855789872458644 | 0.7348673770120154 |
| MEDICAL_FINDING.precision | 0.7687839841819081 | 0.6987497305453761 |
| MEDICAL_FINDING.recall | 0.8031241931319391 | 0.7749223045661009 |
| THERAPY.f1 | 0.7026481715006304 | 0.6593454864924225 |
| THERAPY.precision | 0.6716489874638379 | 0.6414529914529915 |
| THERAPY.recall | 0.7366472765732417 | 0.6782647989154993 |
| accuracy | 0.9359328085693419 | 0.9244002381348251 |
| f1 | 0.7922039763638145 | 0.7459972552607502 |
| loss | 0.6178462505340576 | 0.7649919986724854 |
| precision | 0.7703492063492063 | 0.72469725586046 |
| recall | 0.8153349909280291 | 0.7685872510899022 |
| runtime | 107.4969 | 124.1668 |
| samples_per_second | 76.114 | 76.421 |
| steps_per_second | 9.517 | 9.56 |
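
The reported `AVGf1` values appear to be the unweighted (macro) mean of the five per-entity F1 scores, and each per-entity F1 appears to be the harmonic mean of its precision and recall. The sketch below checks both relationships against the numbers listed above. The helper names `macro_f1` and `f1_from_pr` are illustrative and not part of the training code; the reading of `AVGf1` as a macro average is an assumption inferred from the values, not something the card states explicitly.

```python
from statistics import mean

# Per-entity F1 scores copied from the metrics reported in this card.
EVAL_F1 = {
    "DIAGNOSIS": 0.8044417026526834,
    "DIAGNOSTIC": 0.8154647655607348,
    "DRUG": 0.9283865401207938,
    "MEDICAL_FINDING": 0.7855789872458644,
    "THERAPY": 0.7026481715006304,
}
TEST_F1 = {
    "DIAGNOSIS": 0.7317784256559767,
    "DIAGNOSTIC": 0.7815242494226328,
    "DRUG": 0.9199594731509625,
    "MEDICAL_FINDING": 0.7348673770120154,
    "THERAPY": 0.6593454864924225,
}


def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 (assumed definition of AVGf1)."""
    return mean(per_class_f1.values())


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


eval_macro = macro_f1(EVAL_F1)
test_macro = macro_f1(TEST_F1)

# Recompute eval DRUG F1 from its reported precision and recall.
drug_f1 = f1_from_pr(0.911864406779661, 0.945518453427065)

print(f"eval macro F1: {eval_macro:.6f}")
print(f"test macro F1: {test_macro:.6f}")
print(f"eval DRUG F1 from P/R: {drug_f1:.6f}")
```

The overall `f1` rows differ from `AVGf1`, so they are presumably aggregated differently; this sketch does not attempt to reproduce them.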