EmbeddedLLM/mistral-7b-instruct-v0.3-onnx

Text Generation

pstan committed on Jun 17, 2024

Commit 7c8a9c6 · verified · 1 Parent(s): 8225e59

Update README.md

Files changed (1)

README.md +1 -1

@@ -22,7 +22,7 @@ DirectML is a high-performance, hardware-accelerated DirectX 12 library for mach
 ## ONNX Models
 Here are some of the optimized configurations we have added:
-- **ONNX model for int4 DML:** ONNX model for AMD, Intel, and NVIDIA GPUs on Windows, quantized to int4 using AWQ.
+- **ONNX model for int4 DirectML:** ONNX model for AMD, Intel, and NVIDIA GPUs on Windows, quantized to int4 using AWQ.
 - **ONNX model for int4 CPU and Mobile:** ONNX model for CPU and mobile using int4 quantization via RTN. There are two versions uploaded to balance latency vs. accuracy. Acc=1 is targeted at improved accuracy, while Acc=4 is for improved performance. For mobile devices, we recommend using the model with acc-level-4.
 ## Usage