Update README.md
README.md
@@ -1,3 +1,74 @@
---
license: apache-2.0
datasets:
- HiTZ/CONAN-EUS
language:
- eu
metrics:
- bleu
library_name: transformers
pipeline_tag: text2text-generation
tags:
- counternarrative
- hate speech
- text generation
---
**Content Warning**: This card may contain examples of offensive language that do not reflect the authors’ views.

# Model Card for mT5-counternarrative-es

This is a text-to-text [mT5-base](https://huggingface.co/google/mt5-base) model fine-tuned to generate counternarratives (CN) against hate speech (HS) in **Basque**.
The model was fine-tuned on the Basque splits of the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) dataset.
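
The metadata above declares `library_name: transformers` and `pipeline_tag: text2text-generation`, so the model can presumably be loaded with the standard sequence-to-sequence classes. The following is a minimal sketch, not the authors' inference code. It assumes that the raw Basque hate-speech sentence is passed as input without any task prefix, and that `max_new_tokens` and `num_beams` are reasonable defaults; neither is stated in this card. `generate_counter_narrative` and `model_id` are hypothetical names: `model_id` must be set to the Hub identifier of this model, which this card does not spell out.

```python
def generate_counter_narrative(hate_speech: str, model_id: str,
                               max_new_tokens: int = 128) -> str:
    """Generate one counternarrative for a hate-speech input.

    model_id is the Hugging Face Hub identifier of the fine-tuned model
    (an assumption: supply the id of this repository).
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the model expects the plain HS text, with no prompt prefix.
    inputs = tokenizer(hate_speech, return_tensors="pt", truncation=True)

    # Beam search with 4 beams is an illustrative choice, not a documented setting.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling the function downloads the weights on first use, so in practice the tokenizer and model would be loaded once and reused across inputs.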

[CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) was created by professionally translating all 6654 English HS-CN pairs of the original CONAN dataset into
**Basque and Spanish**. For experimentation, we generated train, validation and test splits such that no HS-CN pair appears in more than one split.

<table style="width:33%">
<tr>
<th>CONAN-EUS Splits</th>
<th>Total HS-CN Count</th>
</tr>
<tr>
<td>train</td>
<td>4833</td>
</tr>
<tr>
<td>validation</td>
<td>537</td>
</tr>
<tr>
<td>test</td>
<td>1278</td>
</tr>
</table>
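
The card does not describe how the splits were produced, so the sketch below only illustrates one way to obtain splits in which no HS-CN pair appears more than once across train, validation and test. It is not the authors' procedure. The function names, the seed, and the fractions (chosen to roughly match the proportions in the table above, about 8% validation and 19% test) are assumptions.

```python
import random


def split_pairs(pairs, val_frac=0.08, test_frac=0.19, seed=0):
    """Split (HS, CN) pairs so that no pair lands in more than one split."""
    # Deduplicate first: an identical pair occurring twice in the input
    # could otherwise end up in two different splits.
    unique = list(dict.fromkeys(pairs))
    random.Random(seed).shuffle(unique)

    n_val = round(len(unique) * val_frac)
    n_test = round(len(unique) * test_frac)

    test = unique[:n_test]
    val = unique[n_test:n_test + n_val]
    train = unique[n_test + n_val:]
    return train, val, test


def assert_disjoint(*splits):
    """Raise AssertionError if any pair occurs in more than one split."""
    seen = set()
    for split in splits:
        items = set(split)
        assert seen.isdisjoint(items), "pair leaked across splits"
        seen |= items


# Toy data: 100 distinct pairs plus 5 exact duplicates.
toy = [(f"hs {i}", f"cn {i}") for i in range(100)]
toy += toy[:5]
train, val, test = split_pairs(toy)
assert_disjoint(train, val, test)
```

Deduplicating before shuffling is the step that guarantees disjoint splits; slicing a shuffled list with duplicates still in it would not.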

- 📖 Paper: [Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation]() In LREC-COLING 2024.
- 💻 GitHub Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)

## HS-CN example

The CONAN dataset contains HS-CN pairs on the topic of Muslims and Islam.
The table below shows one HS-CN pair in Basque, Spanish and English:

| Language | Hate speech (HS) | Counternarrative (CN) |
|----------|------------------|-----------------------|
| Basque | Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
| Spanish | Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura. | ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra. |
| English | Muslims do not have anything useful that can enrich our culture. | What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra. |

If you use the model, please **cite the following paper**:

## Citation

```bibtex
@inproceedings{bengoetxea-et-al-2024,
    title={{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
    author={Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
    year={2024},
    publisher={Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)},
}
```

**Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
HiTZ Center - Ixa, University of the Basque Country UPV/EHU