Update README.md
README.md (CHANGED)
@@ -12,47 +12,29 @@ quantized_by: 3-3
 tags:
 - not-for-all-audiences
 ---
-<!-- markdownlint-disable MD041 -->
-
-<!-- header start -->
-<!-- 200823 -->
-<div style="width: auto; margin-left: auto; margin-right: auto">
-
-# Using a modified README template by [TheBloke](https://huggingface.co/TheBloke),
-### a gentleman and a scholar
-
-</div>
-<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
-<!-- header end -->
 
 # Venus 120B v1.0 - GGUF
 - Model creator: [nsfwthrowitaway69](https://huggingface.co/nsfwthrowitaway69)
 - Original model: [Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0)
 
-<!-- description start -->
 ## Description
 
-| Name | Quant method | Bits | Size | Max RAM required | Use case |
-| ---- | ---- | ---- | ---- | ---- | ----- |
-| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 2 | 50.71 GB | 53.21 GB | smallest, significant quality loss - not recommended for most purposes |
-| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 3 | 50.71 GB | 53.21 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 3 | 56.41 GB | 58.91 GB | very small, high quality loss |
-| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 3 | 61.67 GB | 64.17 GB | small, substantial quality loss |
-| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 4 | 66.43 GB | 68.93 GB | small, greater quality loss |
-| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 4 | 70.64 GB | 73.14 GB | medium, balanced quality - recommended |
-| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 5 | 81.00 GB | 83.50 GB | large, low quality loss - recommended |
-| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 5 | 83.22 GB | 85.72 GB | large, very low quality loss - recommended |
-| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 6 | 98.70 GB | 101.20 GB | very large, extremely low quality loss |
-| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 8 | 127.84 GB | 130.34 GB | very large, extremely low quality loss - not recommended |
-
-**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
+GGUF quants for [nsfwthrowitaway69's Venus 120B v1.0](https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0).
+
+## Provided quants
+
+| Name | Quant method | Size |
+| ---- | ---- | ---- |
+| Venus-120b-v1.0.Q2_K.gguf | Q2_K | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_S.gguf | Q3_K_S | 50.71 GB |
+| Venus-120b-v1.0.Q3_K_M.gguf | Q3_K_M | 56.41 GB |
+| Venus-120b-v1.0.Q3_K_L.gguf | Q3_K_L | 61.67 GB |
+| Venus-120b-v1.0.Q4_K_S.gguf | Q4_K_S | 66.43 GB |
+| Venus-120b-v1.0.Q4_K_M.gguf | Q4_K_M | 70.64 GB |
+| Venus-120b-v1.0.Q5_K_S.gguf | Q5_K_S | 81.00 GB |
+| Venus-120b-v1.0.Q5_K_M.gguf | Q5_K_M | 83.22 GB |
+| Venus-120b-v1.0.Q6_K.gguf | Q6_K | 98.70 GB |
+| Venus-120b-v1.0.Q8_0.gguf | Q8_0 | 127.84 GB |
 
 ## All of the files are split and require joining
 
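All of the quants listed above exceed Hugging Face's 50 GB per-file limit, which is why each one is uploaded in parts. As a hedged illustration only (not the uploader's actual command), a `-split-a`/`-split-b`/`-split-c` layout like the one in this repo can be produced with coreutils `split`:

```
# Illustrative only: cut a finished GGUF into alphabetic parts that stay under 50 GB each.
# --suffix-length=1 yields ...gguf-split-a, ...gguf-split-b, ...gguf-split-c
split --bytes=45G --suffix-length=1 Venus-120b-v1.0.Q8_0.gguf Venus-120b-v1.0.Q8_0.gguf-split-
```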

@@ -64,7 +46,7 @@ Download the two parts of your preferred quant. For `Q6_K` that would be:
 * `Venus-120b-v1.0.Q6_K.gguf-split-b`
 
 ## Q8_0
-Download the three parts of
+Download the three parts of the `Q8_0` quant:
 * `Venus-120b-v1.0.Q8_0.gguf-split-a`
 * `Venus-120b-v1.0.Q8_0.gguf-split-b`
 * `Venus-120b-v1.0.Q8_0.gguf-split-c`
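The Windows `COPY /B` join commands survive as context in the hunk below. On Linux or macOS the equivalent join is a plain `cat`; the README's own Linux commands fall outside the shown hunks, so this is a sketch of the standard procedure rather than a quote:

```
# Join the three Q8_0 parts into a single GGUF, then delete the parts to free disk space.
cat Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c > Venus-120b-v1.0.Q8_0.gguf
rm Venus-120b-v1.0.Q8_0.gguf-split-*
```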

@@ -95,47 +77,3 @@ del Venus-120b-v1.0.Q6_K.gguf-split-a Venus-120b-v1.0.Q6_K.gguf-split-b
 COPY /B Venus-120b-v1.0.Q8_0.gguf-split-a + Venus-120b-v1.0.Q8_0.gguf-split-b + Venus-120b-v1.0.Q8_0.gguf-split-c Venus-120b-v1.0.Q8_0.gguf
 del Venus-120b-v1.0.Q8_0.gguf-split-a Venus-120b-v1.0.Q8_0.gguf-split-b Venus-120b-v1.0.Q8_0.gguf-split-c
 ```
-
-<!-- README_GGUF.md-provided-files end -->
-
-<!-- footer start -->
-<!-- footer end -->
-
-<!-- original-model-card start -->
-# Original model card: nsfwthrowitaway69's Venus 120B v1.0
-
-# Venus 120b - version 1.0
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/655febd724e0d359c1f21096/BSKlxWQSbh-liU8kGz4fF.png)
-
-## Overview
-
-The goal was to create a large model that's highly capable for RP/ERP scenarios. Goliath-120b is excellent for roleplay, and Venus-120b was created with the idea of attempting to mix more than two models together to see how well this method works.
-
-## Model Details
-
-- A result of interleaving layers of [Sao10K/Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B), [NousResearch/Nous-Hermes-Llama2-70b](https://huggingface.co/NousResearch/Nous-Hermes-Llama2-70b), and [migtissera/SynthIA-70B-v1.5](https://huggingface.co/migtissera/SynthIA-70B-v1.5) using [mergekit](https://github.com/cg123/mergekit).
-- The resulting model has 140 layers and approximately 122 billion parameters.
-- See mergekit-config.yml for details on the merge method used.
-- See the `exl2-*` branches for exllama2 quantizations. The 4.85 bpw quant should fit in 80GB VRAM, and the 3.0 bpw quant should (just barely) fit in 48GB VRAM with 4k context (a rough arithmetic check follows below).
-- Inspired by [Goliath-120b](https://huggingface.co/alpindale/goliath-120b)
-
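As referenced in the list above, here is a rough arithmetic check of those VRAM claims (illustrative shell arithmetic; the 122B parameter count comes from the card, everything else is back-of-envelope and ignores cache and runtime overhead):

```
# weights-only estimate: params (billions) * bits-per-weight / 8 = decimal GB of weights
echo "scale=1; 122 * 4.85 / 8" | bc   # prints 73.9 -> ~74 GB of weights, plausibly fits 80 GB VRAM
echo "scale=1; 122 * 3.0 / 8" | bc    # prints 45.7 -> ~46 GB of weights, "just barely" in 48 GB
```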
-**Warning: This model will produce NSFW content!**
-
-## Results
-
-Initial tests show that Venus-120b functions fine; overall it seems comparable to Goliath-120b. Some differences I noticed:
-1. Venus needs lower temperature settings than Goliath. I recommend a temp of around 0.7, and no higher than 1.0.
-2. Venus tends, on average, to produce longer responses than Goliath, probably due to the inclusion of SynthIA in the merge, which is trained to produce long chain-of-thought responses.
-3. Venus seems to be a bit less creative than Goliath in the prose it generates, probably due to the lack of Xwin and the inclusion of Nous-Hermes.
-
-Keep in mind this is all anecdotal from some basic tests. The key takeaway is that Venus shows that Goliath is not a fluke.
-
-## Other quants:
-
-- 4.5 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.5bpw-h6-exl2
-- 4.25 bpw exl2 quant provided by Panchovix: https://huggingface.co/Panchovix/Venus-120b-v1.0-4.25bpw-h6-exl2
-
-<!-- original-model-card end -->
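Tying the pieces together: once a quant is joined, the removed RAM note (offloaded layers use VRAM instead of RAM) and the card's recommended temperature of around 0.7 both map to load-time flags. A hypothetical llama.cpp invocation follows; llama.cpp as the runtime, the 40-layer offload, and the prompt are all assumptions rather than anything stated in either version of the README:

```
# Load the joined file, offload 40 of the model's 140 layers to VRAM, sample at temp 0.7.
./main -m Venus-120b-v1.0.Q5_K_M.gguf -c 4096 -ngl 40 --temp 0.7 -p "Your prompt here"
```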