HelpMum-Personal committed
Commit aba0b46 · verified · 1 Parent(s): 9b954de

Update README.md

Files changed (1): README.md (+69 -19)
README.md CHANGED
@@ -1,35 +1,89 @@
---
library_name: transformers
license: mit
- base_model: HelpMum-Personal/9ja-to-eng
tags:
- translation
- generated_from_trainer
model-index:
- - name: 9ja-to-eng2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

- # 9ja-to-eng2

- This model is a fine-tuned version of [HelpMum-Personal/9ja-to-eng](https://huggingface.co/HelpMum-Personal/9ja-to-eng) on an unknown dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

### Training hyperparameters
 
@@ -42,14 +96,10 @@ The following hyperparameters were used during training:
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
-
- ### Training results
-
-
-
### Framework versions

- Transformers 4.44.2
- - Pytorch 2.4.1+cu121
- - Datasets 3.0.0
- - Tokenizers 0.19.1
 
---
library_name: transformers
license: mit
+ base_model: facebook/m2m100_418M
tags:
- translation
- generated_from_trainer
model-index:
+ - name: m2m100_418M-nig-en
  results: []
+ language:
+ - yo
+ - ig
+ - ha
+ pipeline_tag: translation
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

+ # AI-translator-9ja-to-eng

+ This model is a 418-million-parameter translation model built for translating Yoruba, Igbo, and Hausa into English. It was trained on a dataset of 1,500,000 sentence pairs (500,000 per language) to provide high-quality translations for these languages.
+ It was built to make it easier to communicate with LLMs in Igbo, Hausa, and Yoruba.

+ ## Model Details

+ - **Languages Supported**:
+   - Source Languages: Yoruba, Igbo, Hausa
+   - Target Language: English

+ ### Model Usage

+ To use this model for translation tasks, you can load it from Hugging Face's `transformers` library:

+ ```python
+ from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
+
+ # Load the fine-tuned model and tokenizer
+ model = M2M100ForConditionalGeneration.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+ tokenizer = M2M100Tokenizer.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+
+ # Translate Igbo to English
+ igbo_text = "Nlekọta ahụike bụ mpaghara dị mkpa n'ihe fọrọ nke nta ka ọ bụrụ obodo ọ bụla n'ihi na ọ na-emetụta ọdịmma na ịdịmma ndụ nke ndị mmadụ n'otu n'otu. Ọ gụnyere ọtụtụ ọrụ na ọrụ dị iche iche, gụnyere nlekọta mgbochi, nchoputa, ọgwụgwọ na njikwa ọrịa na ọnọdụ. Usoro nlekọta ahụike dị mma na-achọ imeziwanye nsonaazụ ahụike, belata ọrịa ọrịa, yana hụ na ndị mmadụ n'otu n'otu nwere ohere ịnweta ọrụ ahụike dị mkpa."
+ tokenizer.src_lang = "ig"
+ tokenizer.tgt_lang = "en"
+ encoded_ig = tokenizer(igbo_text, return_tensors="pt")
+ generated_tokens = model.generate(**encoded_ig, forced_bos_token_id=tokenizer.get_lang_id("en"))
+ print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
+
+ # Translate Yoruba to English
+ yoruba_text = "Itọju ilera jẹ aaye pataki ni o fẹrẹ to gbogbo awujọ nitori pe o taara ni ilera ati didara igbesi aye eniyan kọọkan. O ni awọn iṣẹ lọpọlọpọ ati awọn oojọ, pẹlu itọju idena, iwadii aisan, itọju, ati iṣakoso awọn arun ati awọn ipo. Awọn eto ilera ti o munadoko ṣe ifọkansi lati ni ilọsiwaju awọn abajade ilera, dinku iṣẹlẹ ti aisan, ati rii daju pe awọn eniyan kọọkan ni iraye si awọn iṣẹ iṣoogun pataki."
+ tokenizer.src_lang = "yo"
+ tokenizer.tgt_lang = "en"
+ encoded_yo = tokenizer(yoruba_text, return_tensors="pt")
+ generated_tokens = model.generate(**encoded_yo, forced_bos_token_id=tokenizer.get_lang_id("en"))
+ print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
+
+ # Translate Hausa to English
+ hausa_text = "Kiwon lafiya fage ne mai mahimmanci a kusan kowace al'umma domin yana shafar jin daɗi da ingancin rayuwar ɗaiɗaikun kai tsaye. Ya ƙunshi nau'ikan ayyuka da sana'o'i, gami da kulawa na rigakafi, ganewar asali, jiyya, da kula da cututtuka da yanayi. Ingantattun tsarin kiwon lafiya na nufin inganta sakamakon kiwon lafiya, rage yawan kamuwa da cututtuka, da kuma tabbatar da cewa mutane sun sami damar yin amfani da ayyukan likita masu mahimmanci."
+ tokenizer.src_lang = "ha"
+ tokenizer.tgt_lang = "en"
+ encoded_ha = tokenizer(hausa_text, return_tensors="pt")
+ generated_tokens = model.generate(**encoded_ha, forced_bos_token_id=tokenizer.get_lang_id("en"))
+ print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
+ ```
+
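+ The card's stated goal is to make it easier to communicate with LLMs in these languages; a minimal sketch of that flow is below, in which a prompt is first translated into English with this model and the English text is then handed to an English-capable LLM. The `ask_llm` helper is a hypothetical placeholder, not part of this repository or of `transformers`.
+
+ ```python
+ from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
+
+ model = M2M100ForConditionalGeneration.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+ tokenizer = M2M100Tokenizer.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+
+ def to_english(text, src_lang):
+     """Translate Yoruba ("yo"), Igbo ("ig"), or Hausa ("ha") text into English."""
+     tokenizer.src_lang = src_lang
+     encoded = tokenizer(text, return_tensors="pt")
+     generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
+     return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
+
+ def ask_llm(prompt_en):
+     # Hypothetical placeholder: swap in whatever chat/completion client you use.
+     raise NotImplementedError
+
+ yoruba_prompt = "Báwo ni?"  # Yoruba greeting, roughly "How are you?"
+ english_prompt = to_english(yoruba_prompt, src_lang="yo")
+ print(english_prompt)
+ # answer = ask_llm(english_prompt)
+ ```
+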
+ ### Supported Language Codes
+ - **English**: `en`
+ - **Yoruba**: `yo`
+ - **Igbo**: `ig`
+ - **Hausa**: `ha`
+
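+ The same checkpoint and language codes should also work with the high-level `pipeline` API; a minimal sketch (the example sentence is illustrative):
+
+ ```python
+ from transformers import pipeline
+
+ # Translation pipeline backed by the fine-tuned checkpoint.
+ # src_lang / tgt_lang take the language codes listed above.
+ translator = pipeline(
+     "translation",
+     model="HelpMum-Personal/AI-translator-9ja-to-eng",
+     src_lang="ig",
+     tgt_lang="en",
+ )
+
+ print(translator("Kedu?"))  # Igbo greeting, roughly "How are you?"
+ ```
+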
+ ### Training Dataset

+ The training dataset consists of 1,500,000 translation pairs, sourced from a combination of open-source parallel corpora and curated datasets specific to Yoruba, Igbo, and Hausa.

+ ## Limitations

+ - While the model performs well across Yoruba, Igbo, and Hausa to English translations, performance may vary depending on the complexity and domain of the text.
+ - Translation quality may decrease for extremely long sentences or ambiguous contexts; splitting long passages into sentences before translating can help (see the sketch below).
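+
+ A rough sketch of that sentence-by-sentence approach is below; the naive period-based splitter is an assumption, and a proper sentence segmenter may work better.
+
+ ```python
+ from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
+
+ model = M2M100ForConditionalGeneration.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+ tokenizer = M2M100Tokenizer.from_pretrained("HelpMum-Personal/AI-translator-9ja-to-eng")
+
+ def translate_long_text(text, src_lang):
+     """Translate a long passage sentence by sentence to keep each input short."""
+     tokenizer.src_lang = src_lang
+     sentences = [s.strip() for s in text.split(".") if s.strip()]  # naive splitter
+     translations = []
+     for sentence in sentences:
+         encoded = tokenizer(sentence, return_tensors="pt")
+         generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
+         translations.append(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
+     return " ".join(translations)
+ ```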

### Training hyperparameters

- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
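
For reference, the listed values map onto `Seq2SeqTrainingArguments` roughly as follows; `output_dir` and any hyperparameters not shown above (learning rate, batch sizes, etc.) are assumptions or library defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-nig-en",  # assumed output directory name
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,  # mixed_precision_training: Native AMP
)
```
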
### Framework versions

- Transformers 4.44.2
+ - Pytorch 2.4.0+cu121
+ - Datasets 2.21.0
+ - Tokenizers 0.19.1