llama.cpp Quantizations of h2o-danube2-1.8b-chat

Using llama.cpp release b2589 for quantization.

Original model: https://huggingface.co/h2oai/h2o-danube2-1.8b-chat

Download a file (not the whole branch) from below:

| Filename | Quant type | File size | Description |
| --- | --- | --- | --- |
| h2o-danube2-1.8b-chat-Q8_0.gguf | Q8_0 | 1.94GB | Extremely high quality, generally unneeded but max available quant. |
| h2o-danube2-1.8b-chat-Q6_K.gguf | Q6_K | 1.50GB | Very high quality, near perfect, recommended. |
| h2o-danube2-1.8b-chat-Q5_K_M.gguf | Q5_K_M | 1.30GB | High quality, very usable. |
| h2o-danube2-1.8b-chat-Q5_K_S.gguf | Q5_K_S | 1.27GB | High quality, very usable. |
| h2o-danube2-1.8b-chat-Q5_0.gguf | Q5_0 | 1.27GB | High quality, older format, generally not recommended. |
| h2o-danube2-1.8b-chat-Q4_K_M.gguf | Q4_K_M | 1.11GB | Good quality, uses about 4.83 bits per weight. |
| h2o-danube2-1.8b-chat-Q4_K_S.gguf | Q4_K_S | 1.05GB | Slightly lower quality with small space savings. |
| h2o-danube2-1.8b-chat-IQ4_NL.gguf | IQ4_NL | 1.06GB | Decent quality, similar to Q4_K_S, new method of quanting. |
| h2o-danube2-1.8b-chat-IQ4_XS.gguf | IQ4_XS | 1.01GB | Decent quality, new method with similar performance to Q4. |
| h2o-danube2-1.8b-chat-Q4_0.gguf | Q4_0 | 1.05GB | Decent quality, older format, generally not recommended. |
| h2o-danube2-1.8b-chat-Q3_K_L.gguf | Q3_K_L | 0.98GB | Lower quality but usable, good for low RAM availability. |
| h2o-danube2-1.8b-chat-Q3_K_M.gguf | Q3_K_M | 0.90GB | Even lower quality. |
| h2o-danube2-1.8b-chat-IQ3_M.gguf | IQ3_M | 0.85GB | Medium-low quality, new method with decent performance. |
| h2o-danube2-1.8b-chat-IQ3_S.gguf | IQ3_S | 0.82GB | Lower quality, new method with decent performance, recommended over Q3 quants. |
| h2o-danube2-1.8b-chat-Q3_K_S.gguf | Q3_K_S | 0.82GB | Low quality, not recommended. |
| h2o-danube2-1.8b-chat-Q2_K.gguf | Q2_K | 0.71GB | Extremely low quality, not recommended. |
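
To fetch a single file rather than the whole branch, something like the following Python sketch may help. It picks the largest listed quant that fits a size budget and downloads only that file with `huggingface_hub.hf_hub_download`. The repository id `bartowski/h2o-danube2-1.8b-chat-GGUF` is an assumption (this card does not state it), and the helper names `gguf_filename`, `largest_quant_within`, and `download_quant` are hypothetical. Picking by file size alone is only a rough heuristic; the descriptions in the table should take priority when two quants are close in size.

```python
# File sizes in GB, copied from the table in this card.
QUANT_SIZES_GB = {
    "Q8_0": 1.94, "Q6_K": 1.50, "Q5_K_M": 1.30, "Q5_K_S": 1.27,
    "Q5_0": 1.27, "Q4_K_M": 1.11, "Q4_K_S": 1.05, "IQ4_NL": 1.06,
    "IQ4_XS": 1.01, "Q4_0": 1.05, "Q3_K_L": 0.98, "Q3_K_M": 0.90,
    "IQ3_M": 0.85, "IQ3_S": 0.82, "Q3_K_S": 0.82, "Q2_K": 0.71,
}

# Assumption: the repository id is not given in this card; adjust as needed.
REPO_ID = "bartowski/h2o-danube2-1.8b-chat-GGUF"


def gguf_filename(quant: str) -> str:
    """Build the file name used in the table for a given quant type."""
    if quant not in QUANT_SIZES_GB:
        raise ValueError(f"unknown quant type: {quant}")
    return f"h2o-danube2-1.8b-chat-{quant}.gguf"


def largest_quant_within(budget_gb: float) -> str:
    """Return the quant with the largest file that still fits the budget."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget_gb}
    if not fitting:
        raise ValueError(f"no quant fits within {budget_gb} GB")
    return max(fitting, key=fitting.get)


def download_quant(quant: str) -> str:
    """Download one GGUF file and return its local cache path."""
    # Imported lazily so the helpers above work without the dependency.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


# Example (requires network access):
# path = download_quant(largest_quant_within(1.2))
```

The file budget should sit somewhat below the available RAM or VRAM, since the context cache and runtime buffers need space in addition to the weights.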

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski

Format: GGUF. Model size: 1.83B params. Architecture: llama.
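
The figure of about 4.83 bits per weight quoted for Q4_K_M can be roughly sanity-checked from the listed file size and parameter count. The sketch below assumes that "GB" in the table means 10^9 bytes and that the file is dominated by weight data; both are assumptions, and GGUF metadata plus any tensors stored at higher precision will push the estimate slightly above the nominal value. The function name `bits_per_weight` is hypothetical.

```python
def bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Estimate effective bits per weight from file size and parameter count.

    Assumes 1 GB = 10^9 bytes and that the whole file is weight data.
    """
    total_bits = file_size_gb * 1e9 * 8
    return total_bits / (params_billion * 1e9)


# Q4_K_M: 1.11 GB file, 1.83B parameters (both taken from this card).
estimate = bits_per_weight(1.11, 1.83)
print(f"Q4_K_M: about {estimate:.2f} bits per weight")
```

The estimate comes out near 4.85, close to the quoted 4.83; the small gap is consistent with file overhead and rounding in the listed size.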