nintwentydo commited on
Commit
72f5202
·
verified ·
1 Parent(s): 4e8cca7

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ tags:
3
+ - fp8
4
+ - vllm
5
+ language:
6
+ - en
7
+ - de
8
+ - fr
9
+ - it
10
+ - pt
11
+ - hi
12
+ - es
13
+ - th
14
+ pipeline_tag: image-text-to-text
15
+ license: apache-2.0
16
+ library_name: vllm
17
+ base_model:
18
+ - mistral-community/pixtral-12b
19
+ - mistralai/Pixtral-12B-2409
20
+ base_model_relation: quantized
21
+ datasets:
22
+ - HuggingFaceH4/ultrachat_200k
23
+ ---
24
+
25
+ # Pixtral-12B-2409: FP8 Dynamic Quant + FP8 KV Cache
26
+
27
+ Quant of [mistral-community/pixtral-12b](https://huggingface.co/mistral-community/pixtral-12b) using [LLM Compressor](https://github.com/vllm-project/llm-compressor) for optimised inference on VLLM.
28
+
29
+ FP8 dynamic quant on language model, and FP8 quant of KV cache. multi_modal_projector and vision_tower left in FP16 since it's a small part of the model.
30
+
31
+ Calibrated on 2048 ultrachat samples.