Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


Llammas - GGUF
- Model creator: https://huggingface.co/tartuNLP/
- Original model: https://huggingface.co/tartuNLP/Llammas/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Llammas.Q2_K.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q2_K.gguf) | Q2_K | 2.36GB |
| [Llammas.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.IQ3_XS.gguf) | IQ3_XS | 2.6GB |
| [Llammas.IQ3_S.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.IQ3_S.gguf) | IQ3_S | 2.75GB |
| [Llammas.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q3_K_S.gguf) | Q3_K_S | 2.75GB |
| [Llammas.IQ3_M.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.IQ3_M.gguf) | IQ3_M | 2.9GB |
| [Llammas.Q3_K.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q3_K.gguf) | Q3_K | 3.07GB |
| [Llammas.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q3_K_M.gguf) | Q3_K_M | 3.07GB |
| [Llammas.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q3_K_L.gguf) | Q3_K_L | 3.35GB |
| [Llammas.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.IQ4_XS.gguf) | IQ4_XS | 3.4GB |
| [Llammas.Q4_0.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q4_0.gguf) | Q4_0 | 3.56GB |
| [Llammas.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.IQ4_NL.gguf) | IQ4_NL | 3.58GB |
| [Llammas.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q4_K_S.gguf) | Q4_K_S | 3.59GB |
| [Llammas.Q4_K.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q4_K.gguf) | Q4_K | 3.8GB |
| [Llammas.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q4_K_M.gguf) | Q4_K_M | 3.8GB |
| [Llammas.Q4_1.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q4_1.gguf) | Q4_1 | 3.95GB |
| [Llammas.Q5_0.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q5_0.gguf) | Q5_0 | 4.33GB |
| [Llammas.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q5_K_S.gguf) | Q5_K_S | 4.33GB |
| [Llammas.Q5_K.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q5_K.gguf) | Q5_K | 4.45GB |
| [Llammas.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q5_K_M.gguf) | Q5_K_M | 4.45GB |
| [Llammas.Q5_1.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q5_1.gguf) | Q5_1 | 4.72GB |
| [Llammas.Q6_K.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q6_K.gguf) | Q6_K | 5.15GB |
| [Llammas.Q8_0.gguf](https://huggingface.co/RichardErkhov/tartuNLP_-_Llammas-gguf/blob/main/Llammas.Q8_0.gguf) | Q8_0 | 6.67GB |
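As a rough guide, larger files in the table preserve more of the original model's quality at the cost of memory and speed. Below is a minimal sketch of downloading one of the listed files and running it locally. It assumes the `huggingface_hub` and `llama-cpp-python` packages (neither is prescribed by this repo; any GGUF-compatible runtime such as llama.cpp should work similarly), and it borrows the prompt format and sampling settings from the original model card below. `Llammas.Q4_K_M.gguf` is picked here as a common size/quality trade-off; any file from the table can be substituted.

```python
# Sketch: download a quant from this repo and run it with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one of the GGUF files listed in the table above.
model_path = hf_hub_download(
    repo_id="RichardErkhov/tartuNLP_-_Llammas-gguf",
    filename="Llammas.Q4_K_M.gguf",
)

# n_ctx=2048 is a conservative assumption, not a value from the model card.
llm = Llama(model_path=model_path, n_ctx=2048)

# Build the prompt with the model's conversational format (see the
# "Conversational format" section of the original model card below).
# The Estonian question means "How do I start writing a letter?"
prompt = "<|user|>\nKuidas alustada kirja kirjutamist?\n<|assistant|>\n"

output = llm(
    prompt,
    max_tokens=256,
    temperature=0.6,
    top_p=0.9,
    stop=["<|user|>"],  # guard against the model starting a new user turn
)
print(output["choices"][0]["text"])
```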
Original model description:
---
language:
- et
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- conversational
---

# LLammas 🐑

Llama-2-7B finetuned in two stages:

1. continued pre-training on 5B tokens of CulturaX, with 75% of the documents in Estonian and 25% in English (see [Llammas-base](https://huggingface.co/tartuNLP/Llammas-base)),
2. instruction-tuning on Alpaca-cleaned, Alpaca-est, the top-1 English conversations from OASST1, CoT and FLAN-V2 following open-instruct (10,000 examples from each), WMT18 English-Estonian translation development data (formatted as documents), and English-Estonian held-out data from the general-domain MTee validation set.

[Alpaca-est](https://github.com/TartuNLP/alpaca-est) is an instruction dataset generated for Estonian with *gpt-3.5-turbo-0613*, following the Alpaca approach. More details are in our [paper](https://arxiv.org/abs/2404.04042).

Additional resources:
* Paper: [arxiv.org/abs/2404.04042](https://arxiv.org/abs/2404.04042)
* Code: [github.com/TartuNLP/llammas](https://github.com/TartuNLP/llammas)
* Base model: [tartuNLP/Llammas-base](https://huggingface.co/tartuNLP/Llammas-base)
* 4-bit quantized model in GGUF: [AlbertUnn/LlammasGGUF](https://huggingface.co/AlbertUnn/LlammasGGUF)
* Alpaca-est dataset: [github.com/TartuNLP/alpaca-est](https://github.com/TartuNLP/alpaca-est)

### Using the model

Using the model in a text-generation pipeline:

```python
from transformers import pipeline
import torch

pipe = pipeline("text-generation", model="tartuNLP/Llammas", torch_dtype=torch.bfloat16, device_map="auto")

# Example conversation in Estonian: "Hello!" / "Hello! How can I help you?" /
# "How do I start writing a letter?"
messages = [
    {"role": "user", "content": "Tere!"},
    {"role": "assistant", "content": "Tere! Kas saaksin teid kuidagi aidata?"},
    {"role": "user", "content": "Kuidas alustada kirja kirjutamist?"}
]

prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.6, top_k=50, top_p=0.9)
print(outputs[0]["generated_text"][len(prompt):])
```

Using the model in a conversational pipeline (works with transformers==4.36.2; newer versions have issues with the output):

```python
from transformers import pipeline, Conversation
import torch

pipe = pipeline("conversational", model="tartuNLP/Llammas", torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Tere!"},
    {"role": "assistant", "content": "Tere! Kas saaksin teid kuidagi aidata?"},
    {"role": "user", "content": "Kuidas alustada kirja kirjutamist?"}
]

conversation = Conversation(messages)
conversation = pipe(conversation)
```

Conversational format:

```
<|user|>
Tere!
<|assistant|>
Tere! Kas saaksin teid kuidagi aidata?
<|user|>
Kuidas alustada kirja kirjutamist?
<|assistant|>
Kirja kirjutamiseks alustage tervitusega, nÀiteks "Tere!" vÔi "Tere hommikust!". SeejÀrel tutvustage ennast ja mainige, kellega kirjutate. Kirjeldage oma mÔtteid vÔi kÌsimusi, mida soovite arutada. LÔpetage kiri viisakalt, nÀiteks "TÀnan teid tÀhelepanu eest!" vÔi "Parimate soovidega!"
```

(In English: the user greets the model and asks how to start writing a letter; the assistant suggests opening with a greeting, introducing yourself and mentioning who you are writing to, describing the thoughts or questions you want to discuss, and closing politely.)

### Citation

```bibtex
@misc{kuulmets2024teaching,
      title={Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer},
      author={Hele-Andra Kuulmets and Taido Purason and Agnes Luhtaru and Mark Fishel},
      year={2024},
      eprint={2404.04042},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```