3-3 committed
Commit 0500e12 · Parent: 0421451

Update README.md

Files changed (1): README.md (+15, -77)
README.md CHANGED
@@ -12,47 +12,29 @@ quantized_by: 3-3
 tags:
 - not-for-all-audiences
 ---
-<!-- markdownlint-disable MD041 -->
-
-<!-- header start -->
-<!-- 200823 -->
-<div style="width: auto; margin-left: auto; margin-right: auto">
-
-# Using a modified README template by [TheBloke](https://huggingface.co/TheBloke),
-### a gentleman and a scholar
-
-</div>
-<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
-<!-- header end -->
 
 # Venus 120B v1.0 - GGUF
 - Model creator: [nsfwthrowitaway69](https://huggingface.co/nsfwthrowitaway69)
 - Original model: [Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
 
-<!-- description start -->
 ## Description
 
-This repo contains GGUF format model files for [nsfwthrowitaway69's Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0).
+GGUF quants for [nsfwthrowitaway69's Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0).
 
-<!-- description end -->
-
-<!-- README_GGUF.md-provided-files start -->
-## Provided files
-
-| Name | Quant method | Bits | Size | Max RAM required | Use case |
-| ---- | ---- | ---- | ---- | ---- | ----- |
-| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 2 | 50.71 GB | 53.21 GB | smallest, significant quality loss - not recommended for most purposes |
-| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 3 | 50.71 GB | 53.21 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 3 | 56.41 GB | 58.91 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 3 | 61.67 GB | 64.17 GB | small, substantial quality loss |
-| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 4 | 66.43 GB | 68.93 GB | small, greater quality loss |
-| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 4 | 70.64 GB | 73.14 GB | medium, balanced quality - recommended |
-| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 5 | 81.00 GB | 83.50 GB | large, low quality loss - recommended |
-| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 5 | 83.22 GB | 85.72 GB | large, very low quality loss - recommended |
-| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 6 | 98.70 GB | 101.20 GB | very large, extremely low quality loss |
-| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 8 | 127.84 GB | 130.34 GB | very large, extremely low quality loss - not recommended |
-
-**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
+## Provided quants
+
+| Name | Quant method | Size |
+| ---- | ---- | ---- |
+| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 56.41 GB |
+| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 61.67 GB |
+| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 66.43 GB |
+| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 70.64 GB |
+| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 81.00 GB |
+| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 83.22 GB |
+| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 98.70 GB |
+| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 127.84 GB |
 
 ## All of the files are split and require joining
 
@@ -64,7 +46,7 @@ Download the two parts of your preferred quant. For `Q6_K` that would be:
 * `Venus-120b-v1.0.Q6_K.gguf-split-b`
 
 ## Q8_0
-Download the three parts of this quant:
+Download the three parts of the `Q8_0` quant:
 * `Venus-120b-v1.0.Q8_0.gguf-split-a`
 * `Venus-120b-v1.0.Q8_0.gguf-split-b`
 * `Venus-120b-v1.0.Q8_0.gguf-split-c`
@@ -95,47 +77,3 @@ del Venus-120b-v1.0.Q6_K.gguf-split-a Venus-120b-v1.0.Q6_K.gguf-split-b
 COPY /B Venus-120b-v1.0.Q8_0.gguf-split-a + Venus-120b-v1.0.Q8_0.gguf-split-b + Venus-120b-v1.0.Q8_0.gguf-split-c Venus-120b-v1.0.Q8_0.gguf
 del Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c
 ```
-
-<!-- README_GGUF.md-provided-files end -->
-
-<!-- footer start -->
-
-
-<!-- footer end -->
-
-<!-- original-model-card start -->
-# Original model card: nsfwthrowitaway69's Venus 120B v1.0
-
-# Venus 120b - version 1.0
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/655febd724e0d359c1f21096/BSKlxWQSbh-liU8kGz4fF.png)
-
-## Overview
-
-The goal was to create a large model that's highly capable for RP/ERP scenarios. Goliath-120b is excellent for roleplay, and Venus-120b was created with the idea of attempting to mix more than two models together to see how well this method works.
-
-## Model Details
-
-- A result of interleaving layers of [Sao10K/Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B), [NousResearch/Nous-Hermes-Llama2-70b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-70b), and [migtissera/SynthIA-70B-v1.5](https://huggingface.co/migtissera/SynthIA-70B-v1.5) using [mergekit](https://github.com/cg123/mergekit).
-- The resulting model has 140 layers and approximately 122 billion parameters.
-- See mergekit-config.yml for details on the merge method used.
-- See the `exl2-*` branches for exllama2 quantizations. The 4.85 bpw quant should fit in 80GB VRAM, and the 3.0 bpw quant should (just barely) fit in 48GB VRAM with 4k context.
-- Inspired by [Goliath-120b](https://huggingface.co/alpindale/goliath-120b)
-
-**Warning: This model will produce NSFW content!**
-
-## Results
-
-Initial tests show that Venus-120b functions fine; overall it seems to be comparable to Goliath-120b. Some differences I noticed:
-1. Venus needs lower temperature settings than Goliath. I recommend a temp of around 0.7, and no higher than 1.0.
-2. Venus tends to, on average, produce longer responses than Goliath, probably due to the inclusion of SynthIA in the merge, which is trained to produce long chain-of-thought responses.
-3. Venus seems to be a bit less creative than Goliath when it comes to the prose it generates, probably due to the lack of Xwin and the inclusion of Nous-Hermes.
-
-Keep in mind this is all anecdotal from some basic tests. The key takeaway is that Venus shows that Goliath is not a fluke.
-
-## Other quants:
-
-- 4.5 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.5bpw-h6-exl2
-- 4.25 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.25bpw-h6-exl2
-
-<!-- original-model-card end -->
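For reference, on Linux or macOS the same join the diff's `COPY /B` commands perform is a plain `cat` concatenation. This is a sketch mirroring the `Q8_0` example, assuming the split parts sit in the current directory:

```
# Join the Q8_0 parts in order (a, b, c) into a single GGUF file
cat Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c > Venus-120b-v1.0.Q8_0.gguf
# Delete the parts only after the joined file is confirmed good
rm Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c
```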
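The RAM note removed by this commit still applies to the files: the figures assume pure CPU inference, and offloading layers to the GPU shifts that footprint into VRAM. With llama.cpp that is controlled by the `-ngl` / `--n-gpu-layers` flag; a minimal sketch, assuming a built llama.cpp `main` binary and a joined quant in the working directory (the layer count of 40 is an illustrative value to adjust for your GPU):

```
# Offload 40 of the model's 140 layers to VRAM; raise or lower -ngl to fit
./main -m Venus-120b-v1.0.Q4_K_M.gguf -c 4096 -ngl 40 -p "Hello"
```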