---
datasets:
- IlyaGusev/ru_turbo_alpaca
- IlyaGusev/ru_turbo_saiga
- IlyaGusev/ru_sharegpt_cleaned
language:
- ru
pipeline_tag: conversational
---

Colab: [link](https://colab.research.google.com/drive/1IBh4FMJPOGZAkX7DYWnIKdav_ZcKatlP)

v2:
- revision 95876e3d9854e937104f623a5fb7144ca990e8ba
- wandb [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/8p3nfjqv/overview)
- 4 datasets: ru_turbo_alpaca, ru_turbo_saiga, ru_sharegpt_cleaned, oasst1_ru_main_branch
- Dataset merging script: [create_chat_set.py](https://github.com/IlyaGusev/rulm/blob/ef58f3d82d6e7b3784d42167ff69188d3766ab61/self_instruct/src/data_processing/create_chat_set.py)
- Loss: 0.942
- Context length: 2000
- Conversational template: `"<s>{role}\n{content}</s>"`
- Possible roles: `["system", "user", "bot"]`
- System prompt: `"Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."` (in English: "You are Saiga, a Russian-language automatic assistant. You talk to people and help them.")
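
The v2 template, roles, and system prompt listed above can be combined into a prompt string roughly as follows. This is a minimal sketch, not the author's code: the `build_prompt` helper is hypothetical, and two details are assumptions rather than facts from this card, namely that formatted messages are concatenated with no separator and that generation is triggered by appending an open `bot` turn (`"<s>bot\n"`).

```python
# Values copied from the v2 section of this card.
SYSTEM_PROMPT = (
    "Ты — Сайга, русскоязычный автоматический ассистент. "
    "Ты разговариваешь с людьми и помогаешь им."
)
MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>"
ROLES = ("system", "user", "bot")


def build_prompt(messages, system_prompt=SYSTEM_PROMPT):
    """Format (role, content) pairs with the v2 template.

    Hypothetical helper. Assumptions: messages are joined without a
    separator, and an unclosed bot turn is appended for generation.
    """
    parts = [MESSAGE_TEMPLATE.format(role="system", content=system_prompt)]
    for role, content in messages:
        if role not in ROLES:
            raise ValueError(f"Unknown role: {role!r}")
        parts.append(MESSAGE_TEMPLATE.format(role=role, content=content))
    # Assumption: leave the bot turn open so the model writes the reply.
    parts.append("<s>bot\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "Привет! Как дела?")]))
```

For the v1 revision, the same sketch would apply with `MESSAGE_TEMPLATE` set to the v1 template string given in the v1 section.
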

v1:
- revision 1ad1cb364e3e245a7a376884111e107cfc013911
- wandb [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/kx2uytey/overview)
- 3 datasets: ru_turbo_alpaca, ru_turbo_saiga, ru_sharegpt_cleaned
- Loss: 0.883
- Context length: 2000
- Conversational template: `"<start>{role}\n{content} <end>\n"`
- Possible roles: `["system", "user", "bot"]`
- System prompt: `"Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."` (in English: "You are Saiga, a Russian-language automatic assistant. You talk to people and help them.")