---
library_name: transformers
tags:
- trl
- sft
base_model:
- meta-llama/Llama-3.2-1B-Instruct
datasets:
- ngxson/MiniThinky-dataset
---
# MiniThinky 1B
This is a newer checkpoint of [MiniThinky-1B-Llama-3.2 (version 1)](https://huggingface.co/ngxson/MiniThinky-1B-Llama-3.2), in which the training loss decreased from 0.7 to 0.5.
Link to GGUF version: [click here](https://huggingface.co/ngxson/MiniThinky-v2-1B-Llama-3.2-Q8_0-GGUF)
The chat template is the same as Llama 3's, but the response is formatted as follows:
```
<|thinking|>{thinking_process}
<|answer|>
{real_answer}
```
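If your application only needs the final answer, you can split the raw completion on the `<|answer|>` marker. A minimal sketch (the helper name is illustrative, not part of the model):

```python
def split_response(text: str) -> tuple[str, str]:
    """Split a raw MiniThinky completion into (thinking, answer)."""
    thinking, sep, answer = text.partition("<|answer|>")
    # Drop the leading <|thinking|> tag if present.
    thinking = thinking.replace("<|thinking|>", "", 1).strip()
    # If the model never emitted <|answer|>, treat everything as thinking.
    return thinking, (answer.strip() if sep else "")
```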
## IMPORTANT: System message
The model is **very sensitive** to the system message. Make sure you use this system message (system role) at the beginning of the conversation:
`You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.`
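For example, with the `transformers` chat pipeline (a minimal sketch; the repo id is assumed from this card, and `device_map`/generation settings should be adjusted to your hardware):

```python
from transformers import pipeline

# Load the model (repo id assumed; swap in the checkpoint you actually use).
pipe = pipeline(
    "text-generation",
    model="ngxson/MiniThinky-v2-1B-Llama-3.2",
    device_map="auto",
)

SYSTEM_MESSAGE = (
    "You are MiniThinky, a helpful AI assistant. You always think before "
    "giving the answer. Use <|thinking|> before thinking and <|answer|> "
    "before giving the answer."
)

messages = [
    {"role": "system", "content": SYSTEM_MESSAGE},
    {"role": "user", "content": "What is the capital of France?"},
]

out = pipe(messages, max_new_tokens=512)
# The assistant reply is appended as the last message.
print(out[0]["generated_text"][-1]["content"])
```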
## Q&A
**Hardware used to train it?**
I used an HF space with 4xL40S and trained for 5 hours. The eval loss is about 0.8.
**Benchmarks?**
I don't have time to run them alone. If you can help, please open a discussion!
**Can it count the number of "r"s in "raspberry"?**
Unfortunately, no.
**Other things I can tune?**
Maybe lower the temperature, or set top_k=1.
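A sketch of both options, reusing the `pipe` from the earlier example (the exact values are illustrative assumptions, not tuned settings):

```python
# Option 1: keep sampling but lower the temperature.
out = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.4)

# Option 2: top_k=1, which makes decoding effectively greedy.
out = pipe(messages, max_new_tokens=512, do_sample=True, top_k=1)
```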
---
TODO: include more info here + maybe do some benchmarks? (Please open a discussion if you're interested)