---
license: apache-2.0
datasets:
- HiTZ/CONAN-EUS
language:
- eu
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
tags:
- counternarrative
- hate speech
- text generation
---
**Content Warning**: This card may contain examples of offensive language that do not reflect the authors’ views.
# Model Card for mT5-counternarrative-eu
This is an [mT5-base](https://huggingface.co/google/mt5-base) text-to-text model fine-tuned to generate counternarratives (CN) against hate speech (HS) in **Basque**.
The model was fine-tuned on the Basque splits of the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) dataset.
[CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) was created by professionally translating all 6654 English HS-CN pairs of the original CONAN dataset into
**Basque and Spanish**. For experimentation, we generated train, validation and test splits such that no HS-CN pair appears in more than one split.
<table style="width:33%">
<tr>
<th>CONAN-EUS Splits</th>
<th>Total HS-CN Count</th>
</tr>
<tr>
<td>train</td>
<td>4833</td>
</tr>
<tr>
<td>validation</td>
<td>537</td>
</tr>
<tr>
<td>test</td>
<td>1278</td>
</tr>
</table>
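
The requirement that no HS-CN pair is shared between splits can be illustrated with a short sketch. The card does not describe the authors' actual splitting procedure, so the shuffle-and-slice approach, the fixed seed, and the `split_pairs` and `shared_pairs` helpers below are assumptions for illustration; only the split sizes come from the table.

```python
import random

# Split sizes as reported in the table above.
SPLIT_SIZES = {"train": 4833, "validation": 537, "test": 1278}


def split_pairs(pairs, sizes=SPLIT_SIZES, seed=0):
    """Cut unique (HS, CN) pairs into disjoint splits (hypothetical helper)."""
    # Drop exact duplicates first so the same pair cannot land in two splits.
    unique = list(dict.fromkeys(pairs))
    needed = sum(sizes.values())
    if len(unique) < needed:
        raise ValueError(f"need {needed} unique pairs, got {len(unique)}")
    # Seeded shuffle so the split is reproducible.
    random.Random(seed).shuffle(unique)
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = unique[start:start + size]
        start += size
    return splits


def shared_pairs(splits):
    """Return the pairs that occur in more than one split."""
    seen, shared = set(), set()
    for rows in splits.values():
        rows = set(rows)
        shared |= seen & rows
        seen |= rows
    return shared
```

Calling `shared_pairs(split_pairs(pairs))` on a list of `(hs, cn)` tuples returns an empty set when the resulting splits are disjoint.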
- 📖 Paper: [Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation](https://arxiv.org/abs/2403.09159) In LREC-COLING 2024.
- 💻 GitHub Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)
## HS-CN example
The CONAN dataset contains HS-CN pairs on the topic of Muslims and Islam.
An example of an HS-CN pair in Basque, Spanish and English is shown below:
| HS | CN |
|-------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
| Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
| Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura.| ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra.|
| Muslims do not have anything useful that can enrich our culture.| What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra.|
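
A hate-speech input like the one above could be passed to the model with the `transformers` library along these lines. This is a minimal sketch: the card does not state the model's Hub id, so `model_id` is left as a parameter, and feeding the raw HS text without a task prefix, as well as the generation settings, are assumptions rather than the authors' documented setup.

```python
def load_model(model_id):
    """Load tokenizer and seq2seq model; model_id is the Hub id of this model."""
    # Imported lazily so the helper below can be used without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate_counternarrative(hate_speech, tokenizer, model, max_new_tokens=64):
    """Generate one counternarrative (CN) for a hate-speech (HS) input."""
    # Assumption: the model takes the raw HS text, with no task prefix.
    inputs = tokenizer(hate_speech, return_tensors="pt", truncation=True)
    # Beam search settings are illustrative, not taken from the paper.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads the weights; replace the placeholder with the real Hub id):
# tokenizer, model = load_model("<hub-id-of-this-model>")
# hs = "Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk."
# print(generate_counternarrative(hs, tokenizer, model))
```

Keeping loading separate from generation means the tokenizer and model are loaded once and reused across many HS inputs.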
## Citation
If you use the model, please **cite the following paper**:
```bibtex
@inproceedings{bengoetxea-et-al-2024,
title={{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
author={Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
year={2024},
booktitle = "Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)",
}
```
**Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
HiTZ Center - Ixa, University of the Basque Country UPV/EHU