TakoMT
This is a translation model using Marian-NMT. For more details, please see my repository.
In addition to the data listed in the repository, I also used ParaCrawl.
- source languages: de, en, es, fr, it, ru, uk
- target language: ja
How to use
This model uses transformers and sentencepiece.
!pip install transformers sentencepiece
You can use this model directly with a pipeline:
from transformers import pipeline
tako_translator = pipeline('translation', model='staka/takomt')
tako_translator('This is a cat.')
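To translate several sentences at once and keep only the translated strings, a small wrapper can help. The following is a minimal sketch, not part of the model itself: the helper names `build_translator` and `translate_all` are hypothetical, and it assumes the pipeline returns one dict per input sentence with a `translation_text` key, which is the usual output of a transformers translation pipeline.

```python
from typing import Callable, Iterable, List


def build_translator(model_name: str = "staka/takomt"):
    """Create the translation pipeline (downloads the model on first use)."""
    # Imported lazily so translate_all() can be used and tested without
    # loading transformers.
    from transformers import pipeline

    return pipeline("translation", model=model_name)


def translate_all(translator: Callable, sentences: Iterable[str]) -> List[str]:
    """Translate several sentences and return only the translated strings."""
    # Assumption: the translator returns one dict per input sentence,
    # with the output stored under the 'translation_text' key.
    results = translator(list(sentences))
    return [result["translation_text"] for result in results]
```

For example, `translate_all(build_translator(), ['This is a cat.', 'I like green tea.'])` should return a list of two Japanese strings.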
Eval results
Evaluation results on Tatoeba (500 randomly selected sentences) are as follows:
source | target | BLEU(*1) |
---|---|---|
de | ja | 27.8 |
en | ja | 28.4 |
es | ja | 32.0 |
fr | ja | 27.9 |
it | ja | 24.3 |
ru | ja | 27.3 |
uk | ja | 29.8 |
(*1) Computed with sacrebleu --tokenize ja-mecab
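A comparable evaluation could be scripted with the sacreBLEU Python API. This is only a sketch under stated assumptions: the function names `sample_pairs` and `bleu_ja` are hypothetical, the fixed seed is an illustrative choice (the original sampling procedure is not documented here), and the `ja-mecab` tokenizer is assumed to be installed via `pip install "sacrebleu[ja]"`. It will not necessarily reproduce the scores in the table.

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    pairs: Sequence[Tuple[str, str]], k: int = 500, seed: int = 0
) -> List[Tuple[str, str]]:
    """Pick up to k (source, reference) pairs at random, reproducibly."""
    # Assumption: a fixed seed is used here only to make the sketch repeatable.
    rng = random.Random(seed)
    return rng.sample(list(pairs), min(k, len(pairs)))


def bleu_ja(hypotheses: List[str], references: List[str]) -> float:
    """Corpus-level BLEU using sacreBLEU's MeCab-based Japanese tokenizer."""
    # Lazy import: requires sacrebleu with the Japanese extra installed.
    import sacrebleu

    # corpus_bleu expects a list of reference streams, hence [references].
    return sacrebleu.corpus_bleu(hypotheses, [references], tokenize="ja-mecab").score
```

The sampled source sentences would be translated with the pipeline shown earlier, and the resulting hypotheses passed to `bleu_ja` together with the Japanese references.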