tartuNLP/Llammas-base

taidopurason committed on Apr 24, 2024

Commit 718e479 · verified · 1 Parent(s): e21364d

Added paper and citation info

Files changed (1)

README.md +17 -1

@@ -6,4 +6,20 @@ pipeline_tag: text-generation
 ---
 # LLammas-base 🐑
-Llama-2-7B with continued pre-training of 5B tokens of CulturaX (Documents: 75% Estonian, 25% English).
+Llama-2-7B with continued pre-training of 5B tokens of CulturaX (75% Estonian, 25% English documents).
+This model is also instruction-tuned resulting in [Llammas](https://huggingface.co/tartuNLP/Llammas).
+More details in our [paper](https://arxiv.org/abs/2404.04042).
+### Citation
+```
+@misc{kuulmets2024teaching,
+      title={Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer},
+      author={Hele-Andra Kuulmets and Taido Purason and Agnes Luhtaru and Mark Fishel},
+      year={2024},
+      eprint={2404.04042},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
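
For readers who want to try the base model described in the card, here is a minimal sketch, not part of this commit. It assumes the checkpoint loads through the standard `transformers` auto classes, as Llama-2-based models on the Hub generally do. The repository name comes from this page; the helper name `generate` and its defaults are illustrative assumptions.

```python
MODEL_ID = "tartuNLP/Llammas-base"  # repository name as shown on this page


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Continue `prompt` with the base model (hypothetical helper).

    The first call downloads the 7B checkpoint, so it needs network access
    and enough memory to hold the weights.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and let the model extend the sequence.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded string includes the original prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because this is the base model rather than the instruction-tuned Llammas, it is better suited to plain text continuation, for example `generate("Eesti pealinn on")`, than to chat-style prompts.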