GGUF quants of TeeZee/Kyllene-57B-v1.0. Remember to set the max context length to a value your hardware can handle; 4096 is fine for most setups. The model's default context length is 200k, so it will consume RAM or VRAM very quickly if left unchecked.
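As an illustration, with llama.cpp you can cap the context window via the `-c`/`--ctx-size` flag (the model filename below is a placeholder; substitute whichever quant you downloaded):

```shell
# Load the GGUF model with a 4096-token context window instead of the 200k default.
# "Kyllene-57B-v1.0.Q4_K_M.gguf" is an example filename, not an exact one.
llama-cli -m Kyllene-57B-v1.0.Q4_K_M.gguf -c 4096 -p "Hello"
```

Other runtimes expose the same knob under a different name (for example, `n_ctx=4096` in llama-cpp-python).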

GGUF details:
- Model size: 56.7B params
- Architecture: llama
- Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit

