Cannot reproduce the reported accuracy of distilbert-base-uncased-finetuned-sst-2-english
Hi, experts.
I am new to Hugging Face. I am trying to reproduce the fine-tuning result for distilbert-base-uncased-finetuned-sst-2-english, but I cannot reach the reported accuracy.
I am using run_glue.py under transformers/examples/pytorch/text-classification to do the fine-tuning, and I am passing it the JSON config below.
The hyperparameters are taken from the model card: https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english
{
  "model_name_or_path": "distilbert-base-uncased",
  "task_name": "sst2",
  "do_train": true,
  "do_eval": true,
  "max_seq_length": 128,
  "per_device_train_batch_size": 32,
  "learning_rate": 1e-5,
  "num_train_epochs": 3,
  "warmup_steps": 600,
  "output_dir": "/scratch/sst2_checkpoints"
}
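One thing that may matter is that per_device_train_batch_size is per device, so the effective batch size, and therefore the share of training spent in warmup, depends on how many GPUs the script sees. The sketch below is a rough back-of-the-envelope calculation, not the Trainer's own logic. The training-set size of 67,349 is the GLUE SST-2 train split, the function name is hypothetical, and the device counts are assumptions for illustration only.

```python
import math

# Assumption: size of the GLUE SST-2 training split.
SST2_TRAIN_EXAMPLES = 67_349


def schedule_summary(per_device_batch_size, num_devices, num_epochs, warmup_steps):
    """Estimate optimizer steps for a run without gradient accumulation."""
    # Each optimizer step consumes one batch per device.
    effective_batch = per_device_batch_size * num_devices
    # The last, partial batch of an epoch still counts as a step.
    steps_per_epoch = math.ceil(SST2_TRAIN_EXAMPLES / effective_batch)
    total_steps = steps_per_epoch * num_epochs
    return {
        "effective_batch": effective_batch,
        "total_steps": total_steps,
        "warmup_fraction": warmup_steps / total_steps,
    }


# Values from the JSON config, on one device and on four devices (assumed counts).
single_device = schedule_summary(32, 1, 3, 600)
four_devices = schedule_summary(32, 4, 3, 600)

print(single_device)
print(four_devices)
```

On a single device the 600 warmup steps are a small part of the run, while on four devices the same config gives an effective batch size of 128 and a much larger warmup share, so the two runs are not equivalent.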
However, the final accuracy I get after training is 89.68%. That is not bad, but it is lower than the 91.3% reported here: https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english. I am not sure what I am doing wrong. Can someone help me understand why my accuracy does not reach 91.3%?
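To put the gap in perspective, here is a small sketch that converts both percentages into counts of correctly classified sentences. It assumes that both numbers are measured on the GLUE SST-2 validation split, which has 872 examples. The helper name is hypothetical.

```python
# Assumption: both accuracies are measured on the 872-example
# GLUE SST-2 validation split.
SST2_VALIDATION_EXAMPLES = 872


def correct_count(accuracy_percent, num_examples=SST2_VALIDATION_EXAMPLES):
    """Convert an accuracy percentage into a whole number of correct predictions."""
    return round(accuracy_percent / 100 * num_examples)


mine = correct_count(89.68)      # my run
reported = correct_count(91.3)   # value on the model card
gap = reported - mine

print(f"my run: {mine} correct, model card: {reported} correct, gap: {gap} examples")
```

Under that assumption the difference comes down to about 14 validation sentences.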
Also, on the right side of the same page, it lists an accuracy of 91.1% on glue and 98.9% on sst2, and I am not sure what these mean (I thought SST-2 was part of the GLUE benchmark). What are these numbers, and why do they differ from 91.3%?
Any help would be really appreciated.
Thank you.