---
license: cc-by-nc-4.0
datasets:
- barbaroo/Sprotin_parallel
- barbaroo/fo_en_synthetic
language:
- en
- fo
metrics:
- bleu
- chrf
- bertscore
base_model:
- facebook/nllb-200-distilled-1.3B
pipeline_tag: translation
---

# barbaroo/nllb_200_1.3B_en_fo

## Model Description

- **Model Architecture**: This model is based on the [NLLB 1.3B architecture](https://huggingface.co/facebook/nllb-200-distilled-1.3B) and weights.
- **Languages**: This checkpoint is fine-tuned to translate from **English** (`en`) to **Faroese** (`fo`).
- **Size**: ~1.3B parameters.
- **Finetuning Datasets**:
  - [Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel)
  - [fo_en_synthetic](https://huggingface.co/datasets/barbaroo/fo_en_synthetic)
- **Training Regime**: Trained until convergence (about 2 epochs).
- **License**: Inherits the original licenses of the [NLLB 1.3B model](https://huggingface.co/facebook/nllb-200-distilled-1.3B).

## Intended Use

- **Primary Use Case**: Translate text from English to Faroese.
- **Audience**: Researchers, developers, or anyone interested in Faroese language processing.
- **Usage Scenarios**:
  - Building English-to-Faroese translation tools
  - Language research and corpus analysis
  - Synthetic data creation

> **Important**: While the model can produce fluent translations, it is not guaranteed to be accurate on all inputs. Users should have critical or sensitive content verified by human experts.

## Metrics

- **Model performance measures**: This model was evaluated using **BLEU**, **chrF**, and **BERTScore**, metrics widely adopted by the machine translation community. In addition, two human experts evaluated the model using the ESA framework on a small dataset of about 200 English sentences drawn from news outlets (BBC, CNN, Al Jazeera).

---

## Evaluation Data

- **Datasets**: The Flores-200 dataset, described in Section 4 of the NLLB paper/documentation.
- **Motivation**: Flores-200 is currently the only machine translation benchmark available for Faroese.

## How to Use

Below is a simple usage example in Python with [Hugging Face Transformers](https://github.com/huggingface/transformers):

```python
from transformers import pipeline

# Checkpoint described in this model card
model_name = "barbaroo/nllb_200_1.3B_en_fo"

translator = pipeline(
    "translation",
    model=model_name,
    tokenizer=model_name,
    src_lang="eng_Latn",  # Language code for English
    tgt_lang="fao_Latn",  # Language code for Faroese
)

text = "Hello, how are you?"
translation = translator(text)

# The pipeline returns a list with one dict per input text
print(translation[0]["translation_text"])
```

## Citation

If you use this model or find it helpful in your research, please cite:

[COMING SOON]

## Contact

For questions, feedback, or collaboration inquiries, feel free to reach out:

- **Primary Contact**: Barbara Scalvini (barbaras@setur.fo, barbaralongview@gmail.com)
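
## Translating Multiple Sentences

The usage example under "How to Use" translates a single string. For larger inputs it can help to send sentences to the translator in fixed-size batches. The following is a minimal sketch of that idea, not part of the model or of Transformers: the helper names `batched` and `translate_all` and the default batch size of 8 are illustrative assumptions, and the translator is passed in as a plain callable so the helper stays independent of any particular model.

```python
from typing import Callable, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_all(
    sentences: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,  # illustrative default, not a tuned value
) -> List[str]:
    """Translate `sentences` batch by batch, preserving the input order."""
    outputs: List[str] = []
    for batch in batched(sentences, batch_size):
        outputs.extend(translate_batch(batch))
    return outputs


if __name__ == "__main__":
    # Stand-in translator so the sketch runs without downloading the model.
    def fake_translate(batch: List[str]) -> List[str]:
        return [s.upper() for s in batch]

    print(translate_all(["good morning", "thank you"], fake_translate, batch_size=1))
```

To connect this to the pipeline from the usage example, one option is to pass `lambda batch: [r["translation_text"] for r in translator(batch)]` as `translate_batch`. This assumes the translation pipeline is called with a list of strings and returns one dict with a `translation_text` key per input.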