---
license: mit
language:
- en
- eu
tags:
- text2text-generation
- open-nmt
- pytorch
---

# Itzune v1.0 EN -> EU machine translation argos model

This model was trained using the [argostrain](https://github.com/argosopentech/argos-train) training scripts on 11,542,706 English-to-Basque parallel strings extracted from datasets obtained directly from the [Opus project](https://opus.nlpl.eu/).

## Model description

- **Developed by:** Basque community
- **Model type:** translation
- **Model version:** v1.0
- **Source Language:** English
- **Target Language:** Basque
- **License:** MIT

## Training Data

The English-Basque parallel sentences were collected from the following datasets:

| Dataset              | Sentences before cleaning |
|----------------------|--------------------------:|
| CCMatrix v1          |                 7,788,871 |
| OpenSubtitles v2018  |                   805,780 |
| XLEnt v1.2           |                   800,631 |
| GNOME v1             |                   652,298 |
| HPLT v1.1            |                   610,694 |
| EhuHac v1            |                   585,210 |
| WikiMatrix v1        |                   119,480 |
| KDE4 v2              |                   100,160 |
| wikimedia v20230407  |                    60,990 |
| bible-uedin v1       |                    15,893 |
| Tatoeba v2023-04-12  |                     2,070 |
| Wiktionary           |                       629 |
| **Total**            |            **11,542,706** |
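
As a quick sanity check on the dataset table, the per-dataset counts can be re-added and each corpus's share of the total computed. The following is a minimal Python sketch: the counts are copied from the table, while the variable names (`counts`, `total`) are illustrative and not part of the training scripts.

```python
# Sentence counts before cleaning, copied from the Training Data table.
counts = {
    "CCMatrix v1": 7_788_871,
    "OpenSubtitles v2018": 805_780,
    "XLEnt v1.2": 800_631,
    "GNOME v1": 652_298,
    "HPLT v1.1": 610_694,
    "EhuHac v1": 585_210,
    "WikiMatrix v1": 119_480,
    "KDE4 v2": 100_160,
    "wikimedia v20230407": 60_990,
    "bible-uedin v1": 15_893,
    "Tatoeba v2023-04-12": 2_070,
    "Wiktionary": 629,
}

# Re-add the rows to confirm the reported total.
total = sum(counts.values())
print(f"Total: {total:,}")

# Show each dataset's share of the corpus, largest first.
for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22}{n:>10,}  {100 * n / total:5.1f}%")
```

The sum matches the reported total of 11,542,706, and CCMatrix alone accounts for roughly two thirds of the sentences before cleaning.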