---
datasets:
- ehartford/samantha-data
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About
Weighted/imatrix quants of https://huggingface.co/cognitivecomputations/Samantha-1.11-70b
<!-- provided-files -->
## Usage
If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
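For multi-part files, a minimal sketch of the concatenation step in Python, assuming the parts are plain byte-wise splits named `<file>.part1ofN` through `<file>.partNofN` (the naming used for the Q6_K quant in this repo) and that all parts sit in the same directory. The helper name `concat_parts` is made up for this example and is not part of any library.

```python
import re
import shutil
from pathlib import Path


def concat_parts(first_part: str, out_path: str) -> int:
    """Join <name>.part1ofN ... <name>.partNofN into out_path.

    Assumes the parts are raw byte splits that only need to be
    appended in order. Returns the size of the joined file in bytes.
    """
    first = Path(first_part)
    match = re.fullmatch(r"(.+)\.part1of(\d+)", first.name)
    if match is None:
        raise ValueError(f"not a first part: {first.name}")
    base, total = match.group(1), int(match.group(2))

    # Build the expected part names in numeric order.
    parts = [first.with_name(f"{base}.part{i}of{total}") for i in range(1, total + 1)]
    missing = [p.name for p in parts if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing parts: {missing}")

    # Stream each part into the output so large files never sit fully in memory.
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(out_path).stat().st_size
```

Called as `concat_parts("Samantha-1.11-70b.i1-Q6_K.gguf.part1of2", "Samantha-1.11-70b.i1-Q6_K.gguf")`, this should produce a single GGUF file; the parts can be deleted afterwards.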
## Provided Quants
(sorted by size, not necessarily quality; IQ-quants are often preferable to similarly sized non-IQ quants)
| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | fast, lower quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | fast, beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, medium quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 | |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 | |
| [PART 1](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |
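To pick a quant from the table by size, a small sketch in Python may help. The sizes are copied from the table above; the helper name `largest_fitting` and the idea of a single size budget are assumptions for illustration. The budget here covers the file size only, so leave extra headroom for context and runtime overhead, and remember that a larger file is not necessarily better quality.

```python
from typing import Optional

# (type, size in GB) pairs copied from the table above, in table order.
QUANTS = [
    ("i1-IQ1_S", 15.0), ("i1-IQ2_XXS", 18.7), ("i1-IQ2_XS", 20.8),
    ("i1-IQ2_M", 23.7), ("i1-Q2_K", 25.9), ("i1-IQ3_XXS", 27.4),
    ("i1-IQ3_XS", 28.6), ("i1-Q3_K_XS", 28.7), ("i1-IQ3_S", 30.3),
    ("i1-Q3_K_S", 30.3), ("i1-Q3_K_M", 33.7), ("i1-Q3_K_L", 36.6),
    ("i1-Q4_K_S", 39.7), ("i1-Q4_K_M", 41.8), ("i1-Q5_K_S", 47.9),
    ("i1-Q5_K_M", 49.2), ("i1-Q6_K", 57.0),
]


def largest_fitting(budget_gb: float) -> Optional[str]:
    """Return the largest quant type whose file size fits the budget, or None."""
    candidates = [(name, size) for name, size in QUANTS if size <= budget_gb]
    if not candidates:
        return None
    # max() keeps the first entry on ties, so table order breaks ties.
    return max(candidates, key=lambda item: item[1])[0]
```

For example, `largest_fitting(40.0)` selects `i1-Q4_K_S`, the 39.7 GB file. Treat the result as a starting point and check the Notes column before downloading.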
Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):
![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
## Thanks
I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.
<!-- end -->