---
base_model: tohur/natsumura-assistant-1.1-llama-3.1-8b
license: llama3.1
datasets:
- tohur/natsumura-identity
- cognitivecomputations/dolphin
- tohur/ultrachat_uncensored_sharegpt
- cognitivecomputations/dolphin-coder
- tohur/OpenHermes-2.5-Uncensored-ShareGPT
- tohur/Internal-Knowledge-Map-sharegpt
- m-a-p/Code-Feedback
- m-a-p/CodeFeedback-Filtered-Instruction
- cognitivecomputations/open-instruct-uncensored
- microsoft/orca-math-word-problems-200k
---
# natsumura-assistant-1.1-llama-3.1-8b-GGUF

This is my Storytelling/RP model for my Natsumura series of 8b models. This model is finetuned on storytelling and roleplaying datasets, so it should be a great model for character chatbots in applications such as SillyTavern, Agnai, RisuAI, and more, as well as for fictional writing. It supports up to a 128k context.

- **Developed by:** Tohur
- **License:** llama3.1
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct

This model is based on meta-llama/Meta-Llama-3.1-8B-Instruct and is governed by the [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE).

Natsumura is uncensored, which makes the model highly compliant: it will follow any request, even unethical ones. You are responsible for any content you create using this model. Please use it responsibly.
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for more details, including on how to concatenate multi-part files.
## Provided Quants

(sorted by quality)

| Quant | Notes |
|:-----|:-----|
| Q2_K | |
| Q3_K_S | |
| Q3_K_M | lower quality |
| Q3_K_L | |
| Q4_0 | |
| Q4_K_S | fast, recommended |
| Q4_K_M | fast, recommended |
| Q5_0 | |
| Q5_K_S | |
| Q5_K_M | |
| Q6_K | very good quality |
| Q8_0 | fast, best quality |
| f16 | 16 bpw, overkill |
## Use in ollama

```
ollama pull Tohur/natsumura-storytelling-rp-llama-3.1
```
## Datasets used

- tohur/natsumura-identity
- cognitivecomputations/dolphin
- tohur/ultrachat_uncensored_sharegpt
- cognitivecomputations/dolphin-coder
- tohur/OpenHermes-2.5-Uncensored-ShareGPT
- tohur/Internal-Knowledge-Map-sharegpt
- m-a-p/Code-Feedback
- m-a-p/CodeFeedback-Filtered-Instruction
- cognitivecomputations/open-instruct-uncensored
- microsoft/orca-math-word-problems-200k
The following parameters were used in [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) during training:

- per_device_train_batch_size=2
- gradient_accumulation_steps=4
- lr_scheduler_type="cosine"
- logging_steps=10
- warmup_ratio=0.1
- save_steps=1000
- learning_rate=2e-5
- num_train_epochs=3.0
- max_samples=500
- max_grad_norm=1.0
- quantization_bit=4
- loraplus_lr_ratio=16.0
- fp16=True
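For reference, parameters like these are usually supplied to LLaMA-Factory as a YAML config. The following is only a sketch assembled from the list above, not the exact file used for this run; the model path, stage, and finetuning type are assumptions:

```
### model (assumed; the GGUF's base model card names Meta-Llama-3.1-8B-Instruct)
model_name_or_path: meta-llama/Meta-Llama-3.1-8B-Instruct
quantization_bit: 4

### method (assumed SFT + LoRA, implied by loraplus_lr_ratio)
stage: sft
finetuning_type: lora
loraplus_lr_ratio: 16.0

### train (parameters listed above)
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
lr_scheduler_type: cosine
learning_rate: 2.0e-5
num_train_epochs: 3.0
warmup_ratio: 0.1
max_grad_norm: 1.0
max_samples: 500
logging_steps: 10
save_steps: 1000
fp16: true
```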
## Inference

I use the following settings for inference:

```
"temperature": 1.0,
"repetition_penalty": 1.05,
"top_p": 0.95,
"top_k": 40,
"min_p": 0.05
```
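If you run the model through ollama, roughly the same sampling settings can be set in a Modelfile (a sketch; `min_p` requires a reasonably recent ollama version, and the `FROM` tag assumes the pull target shown earlier):

```
FROM Tohur/natsumura-storytelling-rp-llama-3.1
PARAMETER temperature 1.0
PARAMETER repeat_penalty 1.05
PARAMETER top_p 0.95
PARAMETER top_k 40
PARAMETER min_p 0.05
```

Build it with `ollama create <your-model-name> -f Modelfile` and run as usual.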
## Prompt template: llama3

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
```
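When prompting manually, the template above can be built in code. A minimal sketch (the function and argument names are my own, not part of any library); it stops after the assistant header, since `{output}` is what the model generates:

```
def format_llama3_prompt(system_prompt: str, user_input: str) -> str:
    """Build a llama3-style prompt from the template above, left open
    at the assistant header so the model produces the reply."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("You are a helpful assistant.", "Hello!")
```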