Additional Quant Request (#1), opened by John198
Hey, a few other people doing quants of Mistral Large also offer a 2.75 bpw version, which lets you run the model at about 40K context with Q4 cache on 2x 3090s. Any chance you could do one as well? Normally I'd be fine with 2.25, but this model and its sister merge Monstral (https://huggingface.co/MarsupialAI/Monstral-123B) have both been rated pretty highly, and I want them to be as good as they can be while I test on my rather VRAM-poor setup.
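For what it's worth, here is a rough back-of-envelope sketch of why 2.75 bpw plus a Q4 KV cache at ~40K context should just about fit in 48 GB across 2x 3090. The layer count, KV head count, and head dim are my assumptions based on Mistral Large 2's published config, not values taken from this repo, so treat the output as an estimate only.

```python
# Back-of-envelope VRAM estimate for a 123B model at a given bits-per-weight,
# plus a quantized KV cache. Architecture numbers are assumed from Mistral
# Large 2 style configs (88 layers, 8 KV heads, head dim 128); adjust if the
# actual config differs.

GiB = 1024 ** 3

def weight_bytes(n_params: float, bpw: float) -> float:
    """Bytes needed for quantized weights at the given average bits per weight."""
    return n_params * bpw / 8

def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   cache_bits: int) -> float:
    """Bytes for the K and V caches across all layers at the given cache precision."""
    return 2 * layers * kv_heads * head_dim * tokens * cache_bits / 8

weights = weight_bytes(123e9, 2.75)                    # ~39.4 GiB of weights
kv = kv_cache_bytes(40_960, 88, 8, 128, cache_bits=4)  # ~3.4 GiB of Q4 KV cache

print(f"weights : {weights / GiB:5.1f} GiB")
print(f"kv cache: {kv / GiB:5.1f} GiB")
print(f"total   : {(weights + kv) / GiB:5.1f} GiB "
      f"(vs ~48 GiB on 2x 3090, before activations and other overhead)")
```

That comes out to roughly 43 GiB before activations and framework overhead, which is why 2.75 bpw is about as high as these cards can go at that context length.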