UnifiedQA-Reddit-SYAC
This is an abstractive title answering (TA) / clickbait spoiling model.
This is a variant of allenai/unifiedqa-t5-large, fine-tuned on the Reddit SYAC dataset.
The model was trained as part of my master's thesis:
Abstractive title answering for clickbait content
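A minimal usage sketch with the Hugging Face `transformers` library follows. It rests on assumptions that this card does not confirm: the input is formatted in the UnifiedQA style (`question \n context`, with a literal backslash-n separator, the title acting as the question and the article body as the context), and the model loads through the standard seq2seq classes like its base model `allenai/unifiedqa-t5-large`. The helper names (`build_input`, `load_model`, `answer_title`) are hypothetical, and the hub id in the usage comment is a placeholder to be replaced with this model's actual id. The 2048-token limit mirrors the evaluation setup reported under Performance.

```python
MAX_INPUT_TOKENS = 2048  # input limit reported in the evaluation setup of this card


def build_input(title: str, article: str) -> str:
    # Assumption: UnifiedQA-style "question \n context" formatting, where the
    # separator is a literal backslash followed by "n", not a newline character.
    return f"{title.strip()} \\n {article.strip()}"


def load_model(model_id: str):
    # Imported lazily so the formatting helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return model, tokenizer


def answer_title(model, tokenizer, title: str, article: str, max_new_tokens: int = 64) -> str:
    # Tokenize and drop any tokens beyond the input limit.
    inputs = tokenizer(
        build_input(title, article),
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (placeholder id, replace with the hub id or local path of this model):
# model, tokenizer = load_model("<hub-id-of-this-model>")
# print(answer_title(model, tokenizer, "You won't believe what ...", "Article text ..."))
```

The `max_new_tokens` default of 64 is an arbitrary choice for short answers, not a value taken from the thesis.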
Disinformation
This model has a demonstrated capability to generate and hallucinate false information.
Anyone using a TA system such as this one should be aware of this risk.
Performance
Intrinsic
The following scores are the results of intrinsic evaluation on the Reddit SYAC test set.
We used a maximum input length of 2048 tokens and truncated inputs that exceeded this limit.
rouge1 | rouge2 | rougeL | bleu | meteor |
---|---|---|---|---|
44.58 | 23.89 | 43.45 | 17.46 | 36.22 |
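To give a feel for what the `rouge1` column measures, here is a simplified unigram-overlap F1 in plain Python. It is an illustrative sketch only, not the scorer used to produce the table: it tokenizes on whitespace, applies no stemming, and returns a value in the range 0 to 1 (the table appears to use a 0 to 100 scale). Whether the reported scores are F-measures is also an assumption.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    # Count lowercase whitespace-separated tokens in each text.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

For reportable numbers, an established ROUGE implementation is the safer choice, since tokenization and stemming details change the scores.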
Quality
We measured model performance through human evaluation, asking evaluators to rate on a scale from 1 to 5 how good each generated answer was for a given clickbait article.
Mean quality = 4.065
Factuality
We included a factuality assessment to address the issue of generating false information.
Human raters were asked to place each output in the categories "True", "Irrelevant", and "False".
True | Irrelevant | False |
---|---|---|
85% | 7.5% | 7.5% |
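Percentages like these can be derived from per-output labels with a few lines of Python. The sketch below is hypothetical: the function name `category_shares` is not from the thesis, and the labels in the usage note are toy data rather than the actual annotations.

```python
from collections import Counter

CATEGORIES = ("True", "Irrelevant", "False")


def category_shares(labels):
    # Return the percentage of outputs assigned to each factuality category.
    labels = list(labels)
    if not labels:
        raise ValueError("at least one label is required")

    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected categories: {sorted(unknown)}")

    return {category: 100.0 * counts[category] / len(labels) for category in CATEGORIES}
```

With the toy input `["True", "True", "Irrelevant", "False"]`, half of the labels fall under "True" and a quarter under each of the other two categories.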
Cite
If you use this model, please cite my master's thesis:
@mastersthesis{heiervang2022AbstractiveTA,
title={Abstractive title answering for clickbait content},
author={Markus Sverdvik Heiervang},
publisher={University of Oslo, Department of Informatics},
year={2022}
}