Lugha-Llama/Lugha-Llama-8B-wura_math

Happyb committed on Dec 23, 2024

Commit b4cdd00 · verified · 1 Parent(s): d6ee170

Update README.md

Files changed (1)

README.md +1 -1

@@ -13,7 +13,7 @@ languages commonly spoken on the African continent.
 To train the model, we sample as uniformly as possible across languages while limiting the number of times data is repeated and upsample rare languages by at most four epochs.
 We combine [WURA data](https://huggingface.co/datasets/castorini/wura) with high-quality English documents from [FineWeb-Edu](https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1) and [OpenWebMath](https://huggingface.co/datasets/open-web-math/open-web-math) which results into improved Lugha-Llama-Edu and Lugha-Llama-Maths models respectively.
-Our models consistently achieve the best performance amongst similary-sized baselines on AfriMMLU, AfriMGSM, AfriXNLI in Irokobench.
+Our models consistently achieve the best performance amongst similary-sized baselines on AfriMMLU, AfriMGSM, AfriXNLI tasks in Irokobench.
 In a separate ablation experiment, we translate English education documents to Swahili to study whether the performance gains from FineWeb-Edu data is due to its content or English source language. [FineWeb_Edu-swahili-translated](https://huggingface.co/datasets/princeton-nlp/fineweb_edu-swahili-translated).