Commit d8b1504 by ragerri · verified · 1 Parent(s): e3265ee

Update README.md

  1. README.md +71 -0
README.md CHANGED
@@ -1,3 +1,74 @@
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - HiTZ/CONAN-EUS
5
+ language:
6
+ - eu
7
+ metrics:
8
+ - bleu
9
+ library_name: transformers
10
+ pipeline_tag: text2text-generation
11
+ tags:
12
+ - counternarrative
13
+ - hate speech
14
+ - text generation
15
  ---
16
+ **Content Warning**: This card may contain examples of offensive language that do not reflect the authors’ views.
17
+
18
+ # Model Card for mT5-counternarrative-eu
19
+
20
+ This is an [mT5-base](https://huggingface.co/google/mt5-base) text-to-text model fine-tuned to generate counternarratives (CN) against hate speech (HS) in **Basque**.
21
+ The model has been fine-tuned on the Basque splits of the [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) dataset.
22
+
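The description above maps onto the standard `transformers` sequence-to-sequence API. The following is a minimal sketch rather than the authors' inference code: the function name `generate_counter_narrative`, the decoding settings (`max_new_tokens`, `num_beams`) and the choice to pass the raw hate-speech text without any task prefix are all assumptions, and `model_id` must be replaced by the actual Hub identifier of this model.

```python
def generate_counter_narrative(hate_speech, model_id, max_new_tokens=128, num_beams=4):
    """Generate one counternarrative for a hate-speech input (illustrative sketch).

    Assumptions: the model takes the raw HS text as input (no task prefix),
    and beam search with these defaults is a reasonable decoding setup.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    # The mT5 tokenizer additionally needs the `sentencepiece` package.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the input and return PyTorch tensors.
    inputs = tokenizer(hate_speech, return_tensors="pt", truncation=True)

    # Decode with beam search; the values are illustrative, not tuned.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_counter_narrative(text, model_id="<this model's Hub id>")` downloads the weights on first use. Since the model was fine-tuned on Basque data only, inputs in other languages are outside its intended use.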
23
+ [CONAN-EUS](https://huggingface.co/datasets/HiTZ/CONAN-EUS) was created by professionally translating all 6654 English HS-CN pairs of the original CONAN dataset into
24
+ **Basque and Spanish**. For experimentation, we generated train, validation and test splits such that no HS-CN pair occurs in more than one split.
25
+
26
+ <table style="width:33%">
27
+ <tr>
28
+ <th>CONAN-EUS Splits</th>
29
+ <th>Total HS-CN Count</th></tr>
30
+ <tr>
31
+ <td>train</td>
32
+ <td>4833</td>
33
+ </tr>
34
+ <tr>
35
+ <td>validation</td>
36
+ <td>537</td>
37
+ </tr>
38
+ <tr>
39
+ <td>test</td>
40
+ <td>1278</td>
41
+ </tr>
42
+ </table>
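The split sizes in the table and the no-overlap property can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names `split_shares` and `pairs_in_multiple_splits` are hypothetical, HS-CN pairs are represented as `(hs, cn)` string tuples, and the toy data is invented purely for illustration (it is not taken from CONAN-EUS).

```python
from collections import defaultdict

# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 4833, "validation": 537, "test": 1278}


def split_shares(sizes):
    """Return each split's fraction of the total number of pairs."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def pairs_in_multiple_splits(splits):
    """Map every (hs, cn) pair that appears in more than one split to those splits."""
    seen = defaultdict(set)
    for name, pairs in splits.items():
        for pair in pairs:
            seen[pair].add(name)
    return {pair: names for pair, names in seen.items() if len(names) > 1}


for name, share in split_shares(SPLIT_SIZES).items():
    print(f"{name}: {SPLIT_SIZES[name]} pairs ({share:.1%})")

# Toy data (hypothetical): the pair ("hs2", "cn2") leaks from train into test.
toy_splits = {
    "train": [("hs1", "cn1"), ("hs2", "cn2")],
    "validation": [("hs3", "cn3")],
    "test": [("hs2", "cn2")],
}
leaked = pairs_in_multiple_splits(toy_splits)
print(leaked)
```

For splits built as described, `pairs_in_multiple_splits` should return an empty dictionary when given the real train, validation and test pairs.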
43
+
44
+ - 📖 Paper: [Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation]() In LREC-COLING 2024.
45
+ - 💻 GitHub Repo (Data and Code): [https://github.com/ixa-ehu/conan-e/](https://github.com/ixa-ehu/conan-e/)
46
+
47
+
48
+ ## HS-CN example
49
+
50
+ The CONAN dataset consists of HS-CN pairs on the topic of Muslims and Islam.
51
+ An example of an HS-CN pair in Basque, Spanish and English is shown below:
52
+
53
+ | HS | CN |
54
+ |-------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
55
+ | Musulmanok ez dugu gure kultura aberastu dezakeen ezer erabilgarririk. | Zer egin dute guretzat musulmanek? Beno, kafea, erlojuak, kamerak, higienea, unibertsitateak, tresna kirurgikoak, mapak, musika, aljebra. |
56
+ | Los musulmanes no tienen nada útil que pueda enriquecer nuestra cultura.| ¿Qué han hecho los musulmanes por nosotros? Bueno, café, relojes, cámaras, higiene, universidades, instrumentos quirúrgicos, mapas, música, álgebra.|
57
+ | Muslims do not have anything useful that can enrich our culture.| What have Muslims ever done for us? Well, Coffee, Clocks, Cameras, Hygiene, Universities, Surgical Instruments, Maps, Music, Algebra.|
58
+
59
+ ## Citation
60
+
61
+ If you use the model, please **cite the following paper**:
62
+
63
+ ```bibtex
64
+ @inproceedings{bengoetxea-et-al-2024,
65
+ title={{B}asque and {S}panish {C}ounter {N}arrative {G}eneration: {D}ata {C}reation and {E}valuation},
66
+ author={Jaione Bengoetxea and Yi-Ling Chung and Marco Guerini and Rodrigo Agerri},
67
+ year={2024},
68
+ booktitle={Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)},
69
+ }
70
+ ```
71
+
72
+
73
+ **Contact**: [Rodrigo Agerri](https://ragerri.github.io/)
74
+ HiTZ Center - Ixa, University of the Basque Country UPV/EHU