---
license: mit
language:
- en
- eu
tags:
- text2text-generation
- open-nmt
- pytorch
---
# Itzune v1.0 EN -> EU machine translation argos model
This model was trained using [argostrain](https://github.com/argosopentech/argos-train) training scripts with 11,542,706 English to Basque parallel strings extracted from datasets obtained directly from the [Opus project](https://opus.nlpl.eu/).
## Model description
- **Developed by:** Basque community
- **Model type:** translation
- **Model version:** v1.0
- **Source Language:** English
- **Target Language:** Basque
- **License:** MIT
## Training Data
The English-Basque parallel sentences were collected from the following datasets:
| Dataset | Sentences before cleaning |
|----------------------|--------------------------:|
| CCMatrix v1 | 7,788,871 |
| OpenSubtitles v2018 | 805,780 |
| XLEnt v1.2 | 800,631 |
| GNOME v1 | 652,298 |
| HPLT v1.1 | 610,694 |
| EhuHac v1 | 585,210 |
| WikiMatrix v1 | 119,480 |
| KDE4 v2 | 100,160 |
| wikimedia v20230407 | 60,990 |
| bible-uedin v1 | 15,893 |
| Tatoeba v2023-04-12 | 2,070 |
| Wiktionary | 629 |
| **Total** | **11,542,706** |
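
As a quick sanity check on the table above, the per-dataset counts can be summed and compared with the stated total. The snippet below is only an illustrative sketch: the dictionary name `sentences_before_cleaning` is a hypothetical label, and the numbers are copied by hand from the table rather than read from the datasets themselves.

```python
# Sentence counts copied from the table above (before cleaning).
# The variable name is hypothetical; it is not part of any training script.
sentences_before_cleaning = {
    "CCMatrix v1": 7_788_871,
    "OpenSubtitles v2018": 805_780,
    "XLEnt v1.2": 800_631,
    "GNOME v1": 652_298,
    "HPLT v1.1": 610_694,
    "EhuHac v1": 585_210,
    "WikiMatrix v1": 119_480,
    "KDE4 v2": 100_160,
    "wikimedia v20230407": 60_990,
    "bible-uedin v1": 15_893,
    "Tatoeba v2023-04-12": 2_070,
    "Wiktionary": 629,
}

# Sum the rows and compare with the total reported in the table.
total = sum(sentences_before_cleaning.values())
assert total == 11_542_706, f"unexpected total: {total:,}"

# Print each dataset's count and its share of the corpus, largest first.
for name, count in sorted(
    sentences_before_cleaning.items(), key=lambda item: item[1], reverse=True
):
    print(f"{name:<22}{count:>12,}{count / total:>8.1%}")
```

Run as is, the assertion passes, and the printed shares show that CCMatrix alone contributes roughly two thirds of the sentence pairs.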