---
datasets:
- squad_v2
language:
- en
license: apache-2.0
inference:
  parameters:
    handle_impossible_answer: true
---

# Model Card for etweedy/roberta-base-squad-v2

This model is an instance of [roberta-base](https://huggingface.co/roberta-base) fine-tuned for context-based, extractive question answering on the [SQuAD v2 dataset](https://huggingface.co/datasets/squad_v2), a dataset of English-language context-question-answer triples designed for training and benchmarking extractive question answering models. Version 2 of SQuAD (Stanford Question Answering Dataset) combines the 100,000 questions of SQuAD Version 1.1 with 50,000 additional "unanswerable" questions, i.e., questions whose answer cannot be found in the provided context.

The original RoBERTa (Robustly Optimized BERT Pretraining Approach) model was introduced in [this paper](https://arxiv.org/abs/1907.11692) and [this repository](https://github.com/facebookresearch/fairseq/tree/main/examples/roberta).

## Demonstration space

Try out inference on this model using [this app](https://huggingface.co/spaces/etweedy/roberta-squad-v2).

## Overview

**Pretrained model:** [roberta-base](https://huggingface.co/roberta-base)

**Language:** English

**Downstream task:** Extractive QA

**Training data:** [SQuAD v2](https://huggingface.co/datasets/squad_v2) train split

**Eval data:** [SQuAD v2](https://huggingface.co/datasets/squad_v2) validation split
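
Both splits are available through the `datasets` library; a quick way to inspect them (the fields shown follow the standard SQuAD v2 schema):

```python
from datasets import load_dataset

# Download the train and validation splits of SQuAD v2 from the Hugging Face Hub
squad_v2 = load_dataset("squad_v2")

train_ds = squad_v2["train"]       # fine-tuning data
val_ds = squad_v2["validation"]    # evaluation data (11,873 examples)

# Each example has 'id', 'title', 'context', 'question', and 'answers';
# unanswerable questions have an empty answers['text'] list
print(val_ds[0])
```
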
## How to Get Started with the Model

Initialize the pipeline:

```python
from transformers import pipeline

repo_id = "etweedy/roberta-base-squad-v2"

QA_pipeline = pipeline(
    task="question-answering",
    model=repo_id,
    tokenizer=repo_id,
    handle_impossible_answer=True,
)
```
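
Passing `handle_impossible_answer=True` matches the SQuAD v2 setting: the pipeline compares the best candidate answer span against the null ("no answer") prediction and returns an empty string when the question appears unanswerable from the given context.
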
Inference:

```python
# Question-context pair for extractive QA
QA_input = {
    "question": "Who invented Twinkies?",
    "context": "Twinkies were invented on April 6, 1930, by Canadian-born baker James Alexander Dewar for the Continental Baking Company in Schiller Park, Illinois.",
}

response = QA_pipeline(**QA_input)
print(response)
```
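
Because the pipeline was created with `handle_impossible_answer=True`, unanswerable questions yield an empty answer rather than a spurious span. A minimal sketch, with an illustrative question and expected output shape:

```python
# A question the context cannot answer (illustrative example)
unanswerable = {
    "question": "Who invented the telephone?",
    "context": "Twinkies were invented on April 6, 1930, by James Alexander Dewar.",
}

response = QA_pipeline(**unanswerable)
print(response)  # expected shape: {'score': ..., 'start': 0, 'end': 0, 'answer': ''}
```
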
### Training Hyperparameters

```
batch_size = 16
n_epochs = 3
learning_rate = 3e-5
base_LM_model = "roberta-base"
max_seq_len = 384
stride = 128
lr_schedule = LinearWarmup
warmup_proportion = 0.0
mixed_precision = "fp16"
```
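
These settings map naturally onto the standard `transformers` fine-tuning recipe. The card does not include the training script, so the following is a sketch under that assumption; the `output_dir` value and the `preprocess` helper are illustrative:

```python
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# Long contexts are split into overlapping windows of max_seq_len=384
# tokens with a stride of 128, so answers near window edges aren't lost
def preprocess(example):
    return tokenizer(
        example["question"],
        example["context"],
        max_length=384,
        stride=128,
        truncation="only_second",
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

# Hyperparameters above expressed as TrainingArguments
training_args = TrainingArguments(
    output_dir="roberta-base-squad-v2",  # illustrative
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=3e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.0,
    fp16=True,
)
```
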
## Evaluation results

The model was evaluated on the validation split of [SQuAD v2](https://huggingface.co/datasets/squad_v2) and attained the following results:

```python
{"exact": 79.87029394424324,
 "f1": 82.91251169582613,
 "total": 11873,
 "HasAns_exact": 77.93522267206478,
 "HasAns_f1": 84.02838248389763,
 "HasAns_total": 5928,
 "NoAns_exact": 81.79983179142137,
 "NoAns_f1": 81.79983179142137,
 "NoAns_total": 5945}
```
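
These numbers use the official SQuAD v2 metric, which can be reproduced with the `evaluate` library; the prediction/reference pair below is a placeholder for the full set of 11,873 validation predictions:

```python
import evaluate

# Official SQuAD v2 metric (exact match and F1, with answerable/unanswerable breakdown)
squad_v2_metric = evaluate.load("squad_v2")

# Placeholder prediction/reference pair (ids and spans are illustrative)
predictions = [
    {"id": "example-id", "prediction_text": "James Alexander Dewar", "no_answer_probability": 0.0}
]
references = [
    {"id": "example-id", "answers": {"text": ["James Alexander Dewar"], "answer_start": [44]}}
]

print(squad_v2_metric.compute(predictions=predictions, references=references))
```
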
**BibTeX base model citation:**

```bibtex
@article{DBLP:journals/corr/abs-1907-11692,
  author    = {Yinhan Liu and
               Myle Ott and
               Naman Goyal and
               Jingfei Du and
               Mandar Joshi and
               Danqi Chen and
               Omer Levy and
               Mike Lewis and
               Luke Zettlemoyer and
               Veselin Stoyanov},
  title     = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
  journal   = {CoRR},
  volume    = {abs/1907.11692},
  year      = {2019},
  url       = {http://arxiv.org/abs/1907.11692},
  archivePrefix = {arXiv},
  eprint    = {1907.11692},
  timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```