image/jpeg

The flower of Ares.

These are the GGUF files of the fine-tuned model. To be compiled with llama.cpp on oobabooga or VLLm.

Fine-tuned on mistralai/Mistral-7B-v0.1...my team and I reformatted many different datasets and included a small amount of private stuff to see how much we could improve mistral.

I spoke to it personally for about an hour, and I believe we need to work on our format for the private dataset a bit more, but other than that, it turned out great. I will be uploading it to open llm evaluations, today.

Provided files

Name Quant method Bits Size Max RAM required Use case
Q2_K Tiny Q2_K 2 2.7 GB 4.7 GB smallest, significant quality loss - not recommended for most purposes
Q3_K_M Q3_K_M 3 3.52 GB 5.52 GB very small, high quality loss
Q4_0 Q4_0 4 4.11 GB 6.11 GB legacy; small, very high quality loss - prefer using Q3_K_M
Q4_K_M Q4_K_M 4 4.37 GB 6.37 GB medium, balanced quality - recommended
Q5_0 Q5_0 5 5 GB 7 GB legacy; large, balanced quality
Q5_K_M Q5_K_M 5 5.13 GB 7.13 GB large, balanced quality - recommended
Q6 XL Q6_K 6 5.94 GB 7.94 GB very large, extremely low quality loss
Q8 XXL Q8_0 8 7.7 GB 9.7 GB very large, extremely low quality loss - not recommended
  • Uses Mistral prompt template with chat-instruct.
Downloads last month
12
GGUF
Model size
7.24B params
Architecture
llama

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference API
Unable to determine this model's library. Check the docs .

Datasets used to train Kquant03/Hippolyta-7B-GGUF