Additional Quant Request (#1), opened by John198
Hey, a few other people doing quants of Mistral Large also offer a 2.75 bpw version, which lets you run the model at about 40K context with Q4 cache on 2x 3090s. Any chance you could do one as well? Normally I'd be fine with 2.25, but this model and its sister merge Monstral (https://huggingface.co/MarsupialAI/Monstral-123B) have both been rated pretty highly, and I want them to be as good as they can be while I test on my rather VRAM-poor setup.
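For what it's worth, here is a rough back-of-envelope sketch of why 2.75 bpw plus a Q4 KV cache at ~40K context should just about fit in 48 GB across 2x 3090. The layer count, KV head count, and head dim are my assumptions based on Mistral Large 2's published config, not values taken from this repo, so treat the output as an estimate only.

```python
# Back-of-envelope VRAM estimate for a 123B model at a given bits-per-weight,
# plus a quantized KV cache. Architecture numbers are assumed from Mistral
# Large 2 style configs (88 layers, 8 KV heads, head dim 128); adjust if the
# actual config differs.

GiB = 1024 ** 3

def weight_bytes(n_params: float, bpw: float) -> float:
    """Bytes needed for quantized weights at the given average bits per weight."""
    return n_params * bpw / 8

def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   cache_bits: int) -> float:
    """Bytes for the K and V caches across all layers at the given cache precision."""
    return 2 * layers * kv_heads * head_dim * tokens * cache_bits / 8

weights = weight_bytes(123e9, 2.75)                    # ~39.4 GiB of weights
kv = kv_cache_bytes(40_960, 88, 8, 128, cache_bits=4)  # ~3.4 GiB of Q4 KV cache

print(f"weights : {weights / GiB:5.1f} GiB")
print(f"kv cache: {kv / GiB:5.1f} GiB")
print(f"total   : {(weights + kv) / GiB:5.1f} GiB "
      f"(vs ~48 GiB on 2x 3090, before activations and other overhead)")
```

That comes out to roughly 43 GiB before activations and framework overhead, which is why 2.75 bpw is about as high as these cards can go at that context length.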