Gallek

  • A French -> Breton translation model called Gallek (the Breton word for "French").
  • The current model version reaches a BLEU score of 40 on a 20% split of the training data.
  • Fine-tuned in the fr -> br direction only, for now.
  • Training details are available in the GweLLM GitHub repository.
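For readers unfamiliar with the metric, the BLEU figure above can be understood from its definition: modified n-gram precision combined with a brevity penalty. Below is a minimal pure-Python sketch of sentence-level BLEU — illustrative only, not the scorer behind the reported 40 (real evaluations should use a standard tool such as sacrebleu, and the Breton sentences here are just placeholders):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU on whitespace tokens:
    modified n-gram precision up to max_n, geometric mean,
    brevity penalty. Single reference, no smoothing."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(hyp_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # unsmoothed: any zero precision zeroes the score
        precisions.append(overlap / total)
    # Brevity penalty: punish hypotheses shorter than the reference
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# A perfect match scores 100 by construction
print(bleu("desket e vez brezhoneg er skol", "desket e vez brezhoneg er skol"))
```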

Sample test code:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

modelcard = "amurienne/gallek-m2m100"

# Load the fine-tuned M2M100 model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(modelcard)
tokenizer = AutoTokenizer.from_pretrained(modelcard)

# Build a French -> Breton translation pipeline
translation_pipeline = pipeline("translation", model=model, tokenizer=tokenizer, src_lang='fr', tgt_lang='br', max_length=512, device="cpu")

# The input carries the task prefix used during fine-tuning
french_text = "traduis de français en breton: j'apprends le breton à l'école."

result = translation_pipeline(french_text)
print(result[0]['translation_text'])
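Note that the sample above prepends the task prefix "traduis de français en breton:" ("translate from French to Breton:") to the source sentence. A tiny helper (hypothetical name; the prefix string is taken from the sample) makes that convention explicit and easy to reuse:

```python
def make_gallek_prompt(french_text: str) -> str:
    """Prepend the fr -> br task prefix used in the model card's
    sample code. Helper name is illustrative, not part of the API."""
    return f"traduis de français en breton: {french_text}"

print(make_gallek_prompt("j'apprends le breton à l'école."))
```

Any string passed to the pipeline would then go through this helper first, keeping the prefix consistent across calls.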

A demo is available on the Gallek Space.

Model size: 484M params (F32, Safetensors)
