import gradio as gr
from transformers import pipeline
import torch
title = "Extractive QA Biomedicine"
description = """
<p style="text-align: justify;">
Recent research has made Spanish language models trained on biomedical corpora available. This project explores the use of these new models to build extractive Question Answering models for Biomedicine, and compares their effectiveness with that of general-domain masked language models.
The models were trained on the <a href="https://huggingface.co/datasets/squad_es">SQUAD_ES Dataset</a> (an automatic translation of the Stanford Question Answering Dataset into Spanish). The SQUAD v2 version was chosen in order to include questions that cannot be answered from the provided context.
The models were evaluated on the <a href="https://huggingface.co/datasets/hackathon-pln-es/biomed_squad_es_v2">BIOMED_SQUAD_ES_V2 Dataset</a>, a subset of the SQUAD_ES evaluation dataset containing questions related to the biomedical domain.
The project is aligned with goal 3 of the <a href="https://www.un.org/sustainabledevelopment/">Sustainable Development Goals</a> promoted by the United Nations: "Ensure healthy lives and promote well-being for all at all ages", since this research can lead to tools that facilitate access to health information for doctors and Spanish speakers around the world.
In the demo below, the four trained models can be tested by asking a question about a given context; the confidence score (from 0 to 1) of the predicted answer is also displayed:
</p>
"""
article = """
<p style="text-align: justify;">
<h3>Results</h3>
<table class="table table-bordered table-hover table-condensed">
<thead><tr><th title="Field #1">Model</th>
<th title="Field #2">Base Model Domain</th>
<th title="Field #3">exact</th>
<th title="Field #4">f1</th>
<th title="Field #5">HasAns_exact</th>
<th title="Field #6">HasAns_f1</th>
<th title="Field #7">NoAns_exact</th>
<th title="Field #8">NoAns_f1</th>
</tr></thead>
<tbody><tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-bne-squad2-es">hackathon-pln-es/roberta-base-bne-squad2-es</a></td>
<td>General</td>
<td align="right">67.6341</td>
<td align="right">75.6988</td>
<td align="right">53.7367</td>
<td align="right">70.0526</td>
<td align="right">81.2174</td>
<td align="right">81.2174</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">66.8426</td>
<td align="right">75.2346</td>
<td align="right">53.0249</td>
<td align="right">70.0031</td>
<td align="right">80.3478</td>
<td align="right">80.3478</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">67.6341</td>
<td align="right">74.5612</td>
<td align="right">47.6868</td>
<td align="right">61.7012</td>
<td align="right">87.1304</td>
<td align="right"> 87.1304</td>
</tr>
<tr>
<td><a href="https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es">hackathon-pln-es/biomedtra-small-es-squad2-es</a></td>
<td>Biomedical</td>
<td align="right">34.4767</td>
<td align="right">44.3294</td>
<td align="right">45.3737</td>
<td align="right">65.307</td>
<td align="right">23.8261</td>
<td align="right">23.8261</td>
</tr>
</tbody></table>
<h3>Challenges</h3>
<ul>
<li>Question Answering is a complex task: it requires not only pre-processing the inputs but also post-processing the model outputs, and its evaluation metrics are quite specific.
<li>There is less documentation and there are fewer tutorials available for QA than for other, more popular NLP tasks. In particular, the available examples often focus on the SQUAD v1 format rather than SQUAD v2, the format selected for this project.
<li>Before the hackathon, no Spanish biomedical QA dataset was publicly available (particularly in SQUAD v2 format), so it was necessary to build a biomedical validation dataset from the SQUAD_ES Dataset.
</ul>
<h3>Conclusion and Future Work</h3>
Considering F1 score, the results show that there may be no advantage in using domain-specific masked language models to build biomedical QA models.
However, the F1 scores reported for the biomedical roberta-based models are not far below those of the general roberta-based model.
If only unanswerable questions are taken into account, the model with the best F1 score is <a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-es-squad2-es</a>.
The model <a href="https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es">hackathon-pln-es/biomedtra-small-es-squad2-es</a>, on the contrary, is largely unable to identify unanswerable questions.
As future work, the following experiments could be carried out:
<ul>
<li>Create biomedical masked language models adapted from a general-domain model, in order to preserve words and features of Spanish that also appear in biomedical questions and articles. The biomedical base models used in this project were trained from scratch on a biomedical corpus.
<li>Create a Biomedical training dataset with SQUAD v2 format.
<li>Generate a new and larger Spanish Biomedical validation dataset, not translated from English as in the case of SQUAD_ES Dataset.
<li>Ensemble different models.
</ul>
</p>
<h3>Author</h3>
<a href="https://www.linkedin.com/in/santiagomaximocaram/">Santiago Maximo</a>
"""
device = 0 if torch.cuda.is_available() else -1
MODEL_NAMES = ["hackathon-pln-es/roberta-base-bne-squad2-es",
"hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es",
"hackathon-pln-es/roberta-base-biomedical-es-squad2-es",
"hackathon-pln-es/biomedtra-small-es-squad2-es"]
examples = [
[MODEL_NAMES[2], "¿Qué cidippido se utiliza como descripción de los ctenóforos en la mayoría de los libros de texto?","Para un filo con relativamente pocas especies, los ctenóforos tienen una amplia gama de planes corporales. Las especies costeras necesitan ser lo suficientemente duras para soportar las olas y remolcar partículas de sedimentos, mientras que algunas especies oceánicas son tan frágiles que es muy difícil capturarlas intactas para su estudio. Además, las especies oceánicas no conservan bien, y son conocidas principalmente por fotografías y notas de observadores. Por lo tanto, la mayor atención se ha concentrado recientemente en tres géneros costeros: Pleurobrachia, Beroe y Mnemiopsis. Al menos dos libros de texto basan sus descripciones de ctenóforos en los cidipépidos Pleurobrachia."],
[MODEL_NAMES[0], "¿Dónde se atasca un fagocito en un patógeno?", "La fagocitosis es una característica importante de la inmunidad celular innata llevada a cabo por células llamadas fagocitos que absorben, o comen, patógenos o partículas. Los fagocitos generalmente patrullan el cuerpo en busca de patógenos, pero pueden ser llamados a lugares específicos por citoquinas. Una vez que un patógeno ha sido absorbido por un fagocito, queda atrapado en una vesícula intracelular llamada fagosoma, que posteriormente se fusiona con otra vesícula llamada lisosoma para formar un fagocito."],
]
def getanswer(model_name, question, context):
    # Load the selected QA model and extract the answer span from the context,
    # returning the answer text together with its confidence score.
    question_answerer = pipeline("question-answering", model=model_name, device=device)
    response = question_answerer({
        'question': question,
        'context': context
    })
    return response['answer'], response['score']
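# getanswer() rebuilds the pipeline on every request, which reloads the selected model each time.
# An optional optimization (a sketch only, not wired into the interface below) is to cache one
# pipeline per model name and reuse it across requests:
_pipeline_cache = {}

def getanswer_cached(model_name, question, context):
    # Reuse a previously loaded pipeline when the same model is selected again.
    if model_name not in _pipeline_cache:
        _pipeline_cache[model_name] = pipeline("question-answering", model=model_name, device=device)
    response = _pipeline_cache[model_name]({"question": question, "context": context})
    return response["answer"], response["score"]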
face = gr.Interface(
fn=getanswer,
inputs=[
gr.inputs.Radio(
label="Pick a QA Model",
choices=MODEL_NAMES,
),
gr.inputs.Textbox(lines=1, placeholder="Question Here… "),
gr.inputs.Textbox(lines=10, placeholder="Context Here… ")
],
outputs=[
gr.outputs.Textbox(label="Answer"),
gr.outputs.Textbox(label="Score"),
],
layout="vertical",
title=title,
examples=examples,
description=description,
article=article,
allow_flagging ="never"
)
face.launch()