README.md · mradermacher/orthorus-125b-v2-i1-GGUF at 3847d6556b049512ae354dfc9d1c9be94ab7cece

metadata

language:
  - en
library_name: transformers
license: apache-2.0
quantized_by: mradermacher

About

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including how to concatenate multi-part files.
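As a rough illustration of the concatenation step, here is a minimal Python sketch. It assumes the parts are plain byte-wise splits that only need to be joined in order; that is an assumption, not something this README states. The file names and the `concat_parts` helper are hypothetical, so substitute the names of the parts you actually downloaded.

```python
import shutil
from pathlib import Path


def concat_parts(part_paths, out_path, chunk_size=16 * 1024 * 1024):
    """Join split files byte-for-byte, in the order given, into out_path."""
    out_path = Path(out_path)
    with out_path.open("wb") as dst:
        for part in part_paths:
            with Path(part).open("rb") as src:
                # Stream in chunks so a multi-gigabyte part is never held fully in memory.
                shutil.copyfileobj(src, dst, chunk_size)
    return out_path


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    parts = [
        "model.i1-IQ3_XS.gguf.part1of2",
        "model.i1-IQ3_XS.gguf.part2of2",
    ]
    if all(Path(p).exists() for p in parts):
        concat_parts(parts, "model.i1-IQ3_XS.gguf")
```

The order of the list matters: part 1 must come first, otherwise the joined file will not be a valid GGUF.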

(sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants)

Link	Type	Size/GB	Notes
GGUF	i1-IQ1_S	26.8	for the desperate
GGUF	i1-IQ2_XXS	34.0
GGUF	i1-IQ2_XS	37.7
GGUF	i1-IQ2_S	39.0
GGUF	i1-IQ2_M	42.5
GGUF	i1-Q2_K	46.8	IQ3_XXS probably better
GGUF	i1-IQ3_XXS	49.1	fast, lower quality
PART 1 PART 2	i1-IQ3_XS	51.9
PART 1 PART 2	i1-IQ3_S	55.1	fast, beats Q3_K*
PART 1 PART 2	i1-IQ3_M	56.4
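One way to read the table is to pick the largest quant whose file fits a given size budget. The sketch below does only that, using the sizes listed above. The `largest_quant_within` helper and the budget values are hypothetical. File size is not the same as total memory use (context and runtime overhead come on top), and a larger file is not always the better quant, as the notes column points out for Q2_K.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "i1-IQ1_S": 26.8,
    "i1-IQ2_XXS": 34.0,
    "i1-IQ2_XS": 37.7,
    "i1-IQ2_S": 39.0,
    "i1-IQ2_M": 42.5,
    "i1-Q2_K": 46.8,
    "i1-IQ3_XXS": 49.1,
    "i1-IQ3_XS": 51.9,
    "i1-IQ3_S": 55.1,
    "i1-IQ3_M": 56.4,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the name of the largest quant whose file size fits the budget, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    # Largest file that still fits; this ranks by size only, not by quality.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    # Hypothetical budget: 48 GB available for the model file itself.
    print(largest_quant_within(48.0))
```

Leave some headroom below your actual memory when choosing the budget, since the model file is only part of what has to fit.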

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):