---
language: "english"
license: "mit"
datasets:
- race
metrics:
- accuracy
---
# RoBERTa Large Fine-Tuned on RACE
## Model description
This model is RoBERTa-large fine-tuned on the RACE dataset for multiple-choice reading comprehension.
#### How to use
```python
import datasets
import torch
from transformers import RobertaTokenizer
from transformers import RobertaForMultipleChoice

MAX_SEQ_LENGTH = 512  # same value as max_length in the training hyperparameters

tokenizer = RobertaTokenizer.from_pretrained(
    "LIAMF-USP/roberta-large-finetuned-race")
model = RobertaForMultipleChoice.from_pretrained(
    "LIAMF-USP/roberta-large-finetuned-race")
dataset = datasets.load_dataset(
    "race",
    "all",
    split=["train", "validation", "test"],
)
training_examples = dataset[0]
evaluation_examples = dataset[1]
test_examples = dataset[2]

example = training_examples[0]
example_id = example["example_id"]
question = example["question"]
context = example["article"]
options = example["options"]
label_example = example["answer"]
label_map = {label: i
             for i, label in enumerate(["A", "B", "C", "D"])}

# Encode the article once per candidate answer
choices_inputs = []
for ending in options:
    if question.find("_") != -1:
        # fill-in-the-blank questions
        question_option = question.replace("_", ending)
    else:
        question_option = question + " " + ending
    inputs = tokenizer(
        context,
        question_option,
        add_special_tokens=True,
        max_length=MAX_SEQ_LENGTH,
        padding="max_length",
        truncation=True,
        return_overflowing_tokens=False,
    )
    choices_inputs.append(inputs)

label = label_map[label_example]
input_ids = [x["input_ids"] for x in choices_inputs]
attention_mask = (
    [x["attention_mask"] for x in choices_inputs]
    # as the sentences follow the same structure,
    # just one of them is necessary to check
    if "attention_mask" in choices_inputs[0]
    else None
)
# The model expects tensors of shape (batch_size, num_choices, seq_length);
# example_id is kept for bookkeeping only and is not passed to the model.
example_encoded = {
    "input_ids": torch.tensor([input_ids]),
    "labels": torch.tensor([label]),
}
if attention_mask is not None:
    example_encoded["attention_mask"] = torch.tensor([attention_mask])
output = model(**example_encoded)
```
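
The way each candidate answer is combined with the question can be checked in isolation, without downloading the model. The following is a minimal sketch of that step and of mapping scores back to an answer letter. The helper names `build_question_options` and `predicted_letter` are hypothetical (they are not part of this model card's code), and the question, options, and scores are made up for illustration.

```python
def build_question_options(question, options):
    """Return one question+option string per candidate answer."""
    texts = []
    for ending in options:
        if "_" in question:
            # Fill-in-the-blank question: substitute the option for the blank.
            texts.append(question.replace("_", ending))
        else:
            # Ordinary question: append the option after the question.
            texts.append(question + " " + ending)
    return texts


def predicted_letter(logits):
    """Map the index of the highest score back to its answer letter."""
    letters = ["A", "B", "C", "D"]
    best = max(range(len(logits)), key=lambda i: logits[i])
    return letters[best]


# Made-up example data, for illustration only.
cloze = build_question_options(
    "The sky is _ today.", ["blue", "green", "loud", "square"])
plain = build_question_options(
    "What color is the sky?", ["Blue", "Green", "Loud", "Square"])
print(cloze[0])
print(plain[1])
print(predicted_letter([0.1, 2.3, -0.5, 1.0]))
```

With the real model and a batch of one example, `output.logits[0].tolist()` would presumably take the place of the made-up scores.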
## Training data
The initial model was the [RoBERTa large model](https://huggingface.co/roberta-large), which was then fine-tuned on the [RACE dataset](https://www.cs.cmu.edu/~glai1/data/race/).
## Training procedure
The data was preprocessed with the method shown for a single instance in the _How to use_ section. The following hyperparameters were used:
| Hyperparameter | Value |
|:----:|:----:|
| adam_beta1 | 0.9 |
| adam_beta2 | 0.98 |
| adam_epsilon | 1.000e-8 |
| eval_batch_size | 32 |
| train_batch_size | 1 |
| fp16 | True |
| gradient_accumulation_steps | 16 |
| learning_rate | 0.00001 |
| warmup_steps | 1000 |
| max_length | 512 |
| epochs | 4 |
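
A train batch size of 1 combined with 16 gradient accumulation steps means that each optimizer update is based on 16 examples, assuming training on a single device. The sketch below works through that arithmetic. The function names, the `num_devices` parameter, and the dataset size are hypothetical, and real training loops may round the number of updates differently.

```python
import math

# Values copied from the hyperparameter table above.
hyperparameters = {
    "train_batch_size": 1,
    "gradient_accumulation_steps": 16,
    "epochs": 4,
    "warmup_steps": 1000,
}


def effective_batch_size(hp, num_devices=1):
    """Examples contributing to one optimizer update.

    num_devices=1 is an assumption, not something stated in the table.
    """
    return hp["train_batch_size"] * hp["gradient_accumulation_steps"] * num_devices


def total_update_steps(hp, num_examples, num_devices=1):
    """Optimizer updates over all epochs, rounding a partial final batch up."""
    per_epoch = math.ceil(num_examples / effective_batch_size(hp, num_devices))
    return per_epoch * hp["epochs"]


print(effective_batch_size(hyperparameters))
# Hypothetical dataset size, chosen only to keep the arithmetic easy to follow.
print(total_update_steps(hyperparameters, num_examples=1600))
```

Comparing the total number of updates with `warmup_steps` gives a rough idea of what fraction of training is spent warming up the learning rate.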
## Eval results
| Dataset | Eval | All Test | High School Test | Middle School Test |
|:----:|:----:|:----:|:----:|:----:|
| Accuracy | 85.2 | 84.9 | 83.5 | 88.0 |
**The model was trained with a Tesla V100-PCIE-16GB**
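
The evaluation script is not included here. Assuming the reported figures are the percentage of questions answered correctly, accuracy could be computed along these lines. The `accuracy` helper is hypothetical, and the predictions and gold answers are made up for illustration.

```python
def accuracy(predictions, references):
    """Percentage of predictions that match the reference answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Made-up predictions and gold answers, for illustration only.
predicted = ["A", "C", "B", "D", "A"]
gold = ["A", "C", "D", "D", "B"]
print(accuracy(predicted, gold))
```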