Update README.md
README.md (CHANGED)
@@ -12,47 +12,29 @@ quantized_by: 3-3
 tags:
 - not-for-all-audiences
 ---
-<!-- markdownlint-disable MD041 -->
-
-<!-- header start -->
-<!-- 200823 -->
-<div style="width: auto; margin-left: auto; margin-right: auto">
-
-# Using a modified README template by [TheBloke](https://huggingface.co/TheBloke),
-### a gentleman and a scholar
-
-</div>
-<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
-<!-- header end -->
 
 # Venus 120B v1.0 - GGUF
 - Model creator: [nsfwthrowitaway69](https://huggingface.co/nsfwthrowitaway69)
 - Original model: [Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
 
-<!-- description start -->
 ## Description
 
-| Name | Quant method | Bits | Size | Max RAM required | Use case |
-| ---- | ---- | ---- | ---- | ---- | ----- |
-| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 2 | 50.71 GB | 53.21 GB | smallest, significant quality loss - not recommended for most purposes |
-| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 3 | 50.71 GB | 53.21 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 3 | 56.41 GB | 58.91 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 3 | 61.67 GB | 64.17 GB | small, substantial quality loss |
-| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 4 | 66.43 GB | 68.93 GB | small, greater quality loss |
-| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 4 | 70.64 GB | 73.14 GB | medium, balanced quality - recommended |
-| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 5 | 81.00 GB | 83.50 GB | large, low quality loss - recommended |
-| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 5 | 83.22 GB | 85.72 GB | large, very low quality loss - recommended |
-| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 6 | 98.70 GB | 101.20 GB | very large, extremely low quality loss |
-| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 8 | 127.84 GB | 130.34 GB | very large, extremely low quality loss - not recommended |
-
-**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
+GGUF quants for [nsfwthrowitaway69's Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0).
+
+## Provided quants
+
+| Name | Quant method | Size |
+| ---- | ---- | ---- |
+| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 56.41 GB |
+| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 61.67 GB |
+| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 66.43 GB |
+| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 70.64 GB |
+| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 81.00 GB |
+| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 83.22 GB |
+| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 98.70 GB |
+| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 127.84 GB |
 
 ## All of the files are split and require joining
 
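All of the quants listed above exceed Hugging Face's 50 GB per-file limit, which is why each one is uploaded in parts. As a hedged illustration only (not the uploader's actual command), a `-split-a`/`-split-b`/`-split-c` layout like the one in this repo can be produced with coreutils `split`:

```
# Illustrative only: cut a finished GGUF into alphabetic parts that stay under 50 GB each.
# --suffix-length=1 yields ...gguf-split-a, ...gguf-split-b, ...gguf-split-c
split --bytes=45G --suffix-length=1 Venus-120b-v1.0.Q8_0.gguf Venus-120b-v1.0.Q8_0.gguf-split-
```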

@@ -64,7 +46,7 @@ Download the two parts of your preferred quant. For `Q6_K` that would be:
 * `Venus-120b-v1.0.Q6_K.gguf-split-b`
 
 ## Q8_0
-Download the three parts of
+Download the three parts of the `Q8_0` quant:
 * `Venus-120b-v1.0.Q8_0.gguf-split-a`
 * `Venus-120b-v1.0.Q8_0.gguf-split-b`
 * `Venus-120b-v1.0.Q8_0.gguf-split-c`
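The Windows `COPY /B` join commands survive as context in the hunk below. On Linux or macOS the equivalent join is a plain `cat`; the README's own Linux commands fall outside the shown hunks, so this is a sketch of the standard procedure rather than a quote:

```
# Join the three Q8_0 parts into a single GGUF, then delete the parts to free disk space.
cat Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c > Venus-120b-v1.0.Q8_0.gguf
rm Venus-120b-v1.0.Q8_0.gguf-split-*
```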

@@ -95,47 +77,3 @@ del Venus-120b-v1.0.Q6_K.gguf-split-a Venus-120b-v1.0.Q6_K.gguf-split-b
 COPY /B Venus-120b-v1.0.Q8_0.gguf-split-a + Venus-120b-v1.0.Q8_0.gguf-split-b + Venus-120b-v1.0.Q8_0.gguf-split-c Venus-120b-v1.0.Q8_0.gguf
 del Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c
 ```
-
-<!-- README_GGUF.md-provided-files end -->
-
-<!-- footer start -->
-<!-- footer end -->
-
-<!-- original-model-card start -->
-# Original model card: nsfwthrowitaway69's Venus 120B v1.0
-
-# Venus 120b - version 1.0
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/655febd724e0d359c1f21096/BSKlxWQSbh-liU8kGz4fF.png)
-
-## Overview
-
-The goal was to create a large model that's highly capable for RP/ERP scenarios. Goliath-120b is excellent for roleplay, and Venus-120b was created with the idea of attempting to mix more than two models together to see how well this method works.
-
-## Model Details
-
-- A result of interleaving layers of [Sao10K/Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B), [NousResearch/Nous-Hermes-Llama2-70b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-70b), and [migtissera/SynthIA-70B-v1.5](https://huggingface.co/migtissera/SynthIA-70B-v1.5) using [mergekit](https://github.com/cg123/mergekit).
-- The resulting model has 140 layers and approximately 122 billion parameters.
-- See mergekit-config.yml for details on the merge method used.
-- See the `exl2-*` branches for exllama2 quantizations. The 4.85 bpw quant should fit in 80GB VRAM, and the 3.0 bpw quant should (just barely) fit in 48GB VRAM with 4k context (a rough arithmetic check follows below).
-- Inspired by [Goliath-120b](https://huggingface.co/alpindale/goliath-120b)
-
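As referenced in the list above, here is a rough arithmetic check of those VRAM claims (illustrative shell arithmetic; the 122B parameter count comes from the card, everything else is back-of-envelope and ignores cache and runtime overhead):

```
# weights-only estimate: params (billions) * bits-per-weight / 8 = decimal GB of weights
echo "scale=1; 122 * 4.85 / 8" | bc   # prints 73.9 -> ~74 GB of weights, plausibly fits 80 GB VRAM
echo "scale=1; 122 * 3.0 / 8" | bc    # prints 45.7 -> ~46 GB of weights, "just barely" in 48 GB
```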
-**Warning: This model will produce NSFW content!**
-
-## Results
-
-Initial tests show that Venus-120b functions fine; overall it seems comparable to Goliath-120b. Some differences I noticed:
-1. Venus needs lower temperature settings than Goliath. I recommend a temp of around 0.7, and no higher than 1.0.
-2. Venus tends, on average, to produce longer responses than Goliath, probably due to the inclusion of SynthIA in the merge, which is trained to produce long chain-of-thought responses.
-3. Venus seems to be a bit less creative than Goliath in the prose it generates, probably due to the lack of Xwin and the inclusion of Nous-Hermes.
-
-Keep in mind this is all anecdotal from some basic tests. The key takeaway is that Venus shows that Goliath is not a fluke.
-
-## Other quants:
-
-- 4.5 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.5bpw-h6-exl2
-- 4.25 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.25bpw-h6-exl2
-
-<!-- original-model-card end -->
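Tying the pieces together: once a quant is joined, the removed RAM note (offloaded layers use VRAM instead of RAM) and the card's recommended temperature of around 0.7 both map to load-time flags. A hypothetical llama.cpp invocation follows; llama.cpp as the runtime, the 40-layer offload, and the prompt are all assumptions rather than anything stated in either version of the README:

```
# Load the joined file, offload 40 of the model's 140 layers to VRAM, sample at temp 0.7.
./main -m Venus-120b-v1.0.Q5_K_M.gguf -c 4096 -ngl 40 --temp 0.7 -p "Your prompt here"
```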