comaniac/Mixtral-8x22B-Instruct-v0.1-FP8-v2

comaniac committed on Jun 10, 2024

Commit ac84ef2 (1 parent: fee9f43)

Update README.md

Files changed (1)

README.md +2 -2

@@ -1,7 +1,7 @@
-## Mixtral-8x22B-Instruct-v0.1-FP8-v1
+## Mixtral-8x22B-Instruct-v0.1-FP8-v2
 * Weights and activations are per-tensor quantized to float8_e4m3.
-* Quantization with AutoFP8.
+* Quantization with AutoFP8 with updated activation scaling factor names.
 * Calibration dataset: Ultrachat (mgoin/ultrachat_2k)
 * Samples: 2048
 * Sequence length: 8192
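
For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what "per-tensor quantized to float8_e4m3" can mean: one scale is chosen for the whole tensor so that its largest magnitude maps to the format's maximum, and each scaled value is rounded to the nearest representable 8-bit float. This is not AutoFP8's code. It assumes the `e4m3fn` variant of the format (4 exponent bits, 3 mantissa bits, largest finite value 448), and the helper names `round_to_e4m3`, `quantize_per_tensor`, and `dequantize` are hypothetical, chosen only for illustration.

```python
import math

# Assumption: the e4m3fn variant, whose largest finite value is 448.
E4M3_MAX = 448.0


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest e4m3fn-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)        # saturate instead of overflowing
    _, ex = math.frexp(a)            # a = m * 2**ex with 0.5 <= m < 1
    exponent = max(ex - 1, -6)       # -6 is the smallest normal exponent
    step = 2.0 ** (exponent - 3)     # 3 mantissa bits: 8 steps per power of two
    # Python's round() rounds ties to even, matching round-to-nearest-even.
    return sign * round(a / step) * step


def quantize_per_tensor(values):
    """Quantize a flat list of floats using a single scale for the whole tensor."""
    amax = max(abs(v) for v in values)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    quantized = [round_to_e4m3(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized values back to the original range."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 2.0, 0.0]
    q, scale = quantize_per_tensor(weights)
    print("scale:", scale)
    print("quantized:", q)
    print("dequantized:", dequantize(q, scale))
```

Here the scale is derived from the tensor itself, which is how a weight scale could be obtained. For activations, a calibrated setup would instead estimate the scale from calibration samples (the README lists 2048 Ultrachat samples) and store it alongside the weights; how AutoFP8 does this in detail is not shown in this commit.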