---
library_name: transformers
model_name: Vikhr-Qwen-2.5-1.5B-Instruct
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
language:
- ru
- en
license: apache-2.0
datasets:
- Vikhrmodels/GrandMaster-PRO-MAX
---

# 💨🦅 Vikhr-Qwen-2.5-1.5B-Instruct

An instruction-tuned model based on **Qwen-2.5-1.5B-Instruct**, trained on the Russian-language dataset **GrandMaster-PRO-MAX**. It is designed for efficient text processing in Russian and English, with a focus on accurate responses and fast task execution.

## GGUF

- [Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-GGUF](https://huggingface.co/Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-GGUF)

## Features:

- 📚 Base: [Qwen-2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
- 🇷🇺 Specialization: **RU**
- 💾 Dataset: [GrandMaster-PRO-MAX](https://huggingface.co/datasets/Vikhrmodels/GrandMaster-PRO-MAX)
- 🌍 Support: **Bilingual RU/EN**

## Try now:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1bJpLmplDGkMbfOLO2CH6IO-2uUZEaknf?usp=sharing)

## Description:

**Vikhr-Qwen-2.5-1.5B-Instruct** is a language model trained on the **GrandMaster-PRO-MAX** dataset. It supports instruction-driven generation, context-aware responses, and text analysis in Russian. The model is optimized for instruction-following tasks and text processing, and is suitable for professional use as well as for integration into user-facing applications and services.

## Training:

**Vikhr-Qwen-2.5-1.5B-Instruct** was created with SFT (Supervised Fine-Tuning). We used the synthetic dataset **GrandMaster-PRO-MAX** (150k instructions), applying a CoT (Chain-of-Thought) approach and prompts for GPT-4-turbo. This improved the accuracy and coherence of the responses.

## Sample code to run:

**Recommended generation temperature: 0.3**.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare the input text ("Write a short description of the Harry Potter book.")
input_text = "Напиши краткое описание книги Гарри Поттер."

messages = [
    # System prompt: "You are Vikhr, an AI assistant created by Vikhr models
    # to provide helpful, honest, and safe information."
    {"role": "system", "content": "Ты — Vikhr, ИИ помощник, созданный компанией Vikhr models для предоставления полезной, честной и безопасной информации."},
    {"role": "user", "content": input_text},
]

# Apply the chat template and tokenize; return_dict=True also yields the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    truncation=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
)

# Generate text
output = model.generate(
    **inputs,
    max_length=1512,  # total length: prompt plus generated tokens
    do_sample=True,  # set explicitly so temperature, top_k, and top_p take effect
    temperature=0.3,
    num_return_sequences=1,
    no_repeat_ngram_size=2,
    top_k=50,
    top_p=0.95,
)

# Decode only the newly generated tokens (skip the prompt) and print the result
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
generated_text = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(generated_text)
```

#### Model response:

> Книга "Гарри Поттер" — это популярное произведение в жанре фэнтези, которое исследует темы дружбы, магии и борьбы со злом. Главный герой проходит путь взросления, преодолевая препятствия и сталкиваясь с моральными вызовами.

English translation:

> The book "Harry Potter" is a popular work in the fantasy genre that explores themes of friendship, magic, and the fight against evil. The main character goes through a coming-of-age journey, overcoming obstacles and facing moral challenges.

### Authors

- Sergei Bratchikov, [NLP Wanderer](https://t.me/nlpwanderer), [Vikhr Team](https://t.me/vikhrlabs)
- Nikolay Kompanets, [LakoMoor](https://t.me/lakomoordev), [Vikhr Team](https://t.me/vikhrlabs)
- Konstantin Korolev, [Vikhr Team](https://t.me/vikhrlabs)
- Aleksandr Nikolich, [Vikhr Team](https://t.me/vikhrlabs)

```
@inproceedings{nikolich2024vikhr,
  title={Vikhr: Advancing Open-Source Bilingual Instruction-Following Large Language Models for Russian and English},
  author={Aleksandr Nikolich and Konstantin Korolev and Sergei Bratchikov and Nikolay Kompanets and Igor Kiselev and Artem Shelmanov},
  booktitle={Proceedings of the 4th Workshop on Multilingual Representation Learning (MRL) @ EMNLP-2024},
  year={2024},
  publisher={Association for Computational Linguistics},
  url={https://arxiv.org/pdf/2405.13929}
}
```
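
#### Prompt format sketch

The sample code relies on `tokenizer.apply_chat_template` to turn the `messages` list into a single prompt. As a rough illustration of what that step produces, the sketch below renders messages in the ChatML layout used by the Qwen2.5 base model. It assumes this fine-tune keeps that layout unchanged, which this card does not state. `render_chatml` is a hypothetical helper written for this example, not part of `transformers`, and the real template may add details such as a default system prompt that are omitted here.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages as a ChatML-style prompt string.

    Hypothetical helper for illustration only; the tokenizer's own
    chat template remains the source of truth for the real format.
    """
    # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # The generation prompt opens an assistant turn for the model to complete
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_chatml(messages)
print(prompt)
```

To check the assumption against the actual model, compare this string with the output of `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.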