Meno-Tiny-0.1-GGUF
Introduction
Meno-Tiny-0.1 is a descendant of the Qwen2.5-1.5B-Instruct model, fine-tuned on a specialized Russian instruction dataset. It is a 1.5B-parameter decoder-only language model based on the Transformer architecture, with SwiGLU activation, attention QKV bias, grouped query attention, and other features. The name "Meno" reflects the model's adaptation for answering questions from text in a RAG pipeline, in honor of the theory of knowledge as recollection from the Socratic dialogue "Meno".
Quickstart
Check out the llama.cpp documentation for a more detailed usage guide.
We advise you to clone llama.cpp and install it following the official guide; we track the latest version of llama.cpp. In the following demonstration, we assume that you are running commands from inside the llama.cpp repository.
Since cloning the entire model repository may be inefficient, you can manually download the GGUF file that you need, or use huggingface-cli:
- Install:
pip install -U huggingface_hub
- Download:
huggingface-cli download bond005/Meno-Tiny-0.1-GGUF meno-tiny-0.1-fp16.gguf --local-dir . --local-dir-use-symlinks False
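If you prefer to fetch the file programmatically, the same single file is reachable through the Hub's "resolve" endpoint. The sketch below builds that URL with only the Python standard library; the helper hub_file_url is hypothetical (for illustration), but the URL pattern matches how the Hub serves raw files.

```python
from urllib.parse import quote

def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for one file in a Hugging Face repo.

    hub_file_url is a hypothetical helper for illustration; the
    https://huggingface.co/<repo>/resolve/<revision>/<file> pattern is the
    Hub's standard raw-file endpoint.
    """
    return f"https://huggingface.co/{repo_id}/resolve/{quote(revision)}/{quote(filename)}"

url = hub_file_url("bond005/Meno-Tiny-0.1-GGUF", "meno-tiny-0.1-fp16.gguf")
# The file can then be fetched with any HTTP client, e.g.:
#   urllib.request.urlretrieve(url, "meno-tiny-0.1-fp16.gguf")
```

Note that the fp16 file is several gigabytes, so a resumable downloader (such as huggingface-cli above) is usually more convenient than a one-shot HTTP fetch.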
For a chatbot-like experience, it is recommended to start in conversation mode:
./llama-cli -m <gguf-file-path> \
-co -cnv -p "You are Meno, created by Ivan Bondarenko. You are a helpful assistant." \
-fa -ngl 80 -n 512
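In conversation mode (-cnv), llama.cpp applies the model's chat template, which it reads from the GGUF metadata; Qwen2.5-family models such as Meno-Tiny-0.1 use the ChatML format. The sketch below illustrates that formatting with a hypothetical stdlib-only helper (format_chatml is for illustration, not part of llama.cpp or the model's files).

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string.

    format_chatml is a hypothetical illustration of the ChatML template
    used by Qwen2.5-family models; in practice llama.cpp applies the
    template stored in the GGUF metadata automatically.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model generates the reply.
    return prompt + "<|im_start|>assistant\n"

messages = [
    {"role": "system",
     "content": "You are Meno, created by Ivan Bondarenko. You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(format_chatml(messages))
```

This is why the -p flag above only supplies the system message: the conversation mode wraps it, and each of your turns, in the template shown here before passing the text to the model.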
Evaluation & Performance
Detailed evaluation results are reported in this model card.
Citation
If you find our work helpful, feel free to cite it:
@misc{bondarenko2024meno,
  title={Meno-Tiny: A Small Russian Language Model for Question Answering and Other Useful NLP Tasks in Russian},
  author={Bondarenko, Ivan},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  howpublished={\url{https://huggingface.co/bond005/meno-tiny-0.1}},
  year={2024}
}