pstan committed on
Commit 84a2633 · verified · 1 Parent(s): ded8afc

Create README.md

Files changed (1)
  1. README.md +93 -0
README.md ADDED
@@ -0,0 +1,93 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ tags:
+ - ONNX
+ - DML
+ - DirectML
+ - ONNXRuntime
+ - mistral
+ - conversational
+ - custom_code
+ inference: false
+ ---
+
+ # Mistral-7B-Instruct-v0.3 ONNX
+
+ ## Model Summary
+
+ [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) is a version of the Mistral-7B model fine-tuned to follow instructions. This repository provides the model in ONNX format for accelerated inference with ONNX Runtime, optimized for CPU and DirectML. DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning that provides GPU acceleration across a wide range of supported hardware and drivers, including AMD, Intel, NVIDIA, and Qualcomm GPUs.
+
+ ## Model Description
+
+ - **Developed by:** Mistral AI
+ - **Model type:** ONNX
+ - **API language(s):** Python, C, C++
+ - **License:** Apache License Version 2.0
+ - **Model Description:** This model is a conversion of Mistral-7B-Instruct-v0.3 for ONNX Runtime inference, optimized for CPU and DirectML.
+
+ ## Usage
+
+ ### Installation and Setup
+
+ To use the Mistral-7B-Instruct-v0.3 ONNX model on Windows with DirectML, follow these steps:
+
+ 1. **Create and activate a Conda environment:**
+ ```sh
+ conda create -n onnx python=3.10
+ conda activate onnx
+ ```
+
+ 2. **Install Git LFS:**
+ ```sh
+ winget install -e --id GitHub.GitLFS
+ ```
+
+ 3. **Install the Hugging Face CLI:**
+ ```sh
+ pip install huggingface-hub[cli]
+ ```
+
+ 4. **Download the model:**
+ ```sh
+ huggingface-cli download EmbeddedLLM/mistral-7b-instruct-v0.3-int4-onnx-directml --include directml/* --local-dir .\mistral-7b-instruct-v0.3
+ ```
+
+ 5. **Install the necessary Python packages:**
+ ```sh
+ pip install numpy
+ pip install onnxruntime-directml
+ pip install --pre onnxruntime-genai-directml
+ ```
+
+ 6. **Install the Visual Studio 2015 runtime:**
+ ```sh
+ conda install conda-forge::vs2015_runtime
+ ```
+
+ 7. **Download the example script (PowerShell):**
+ ```sh
+ Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py" -OutFile "phi3-qa.py"
+ ```
+
+ 8. **Run the example script:**
+ ```sh
+ python phi3-qa.py -m .\mistral-7b-instruct-v0.3
+ ```
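
Before running the example script, it can help to confirm that the packages from the install step are importable and to locate the folder that actually holds the model. The following is a minimal sketch, not part of the original README. It assumes that the pip packages `onnxruntime-directml` and `onnxruntime-genai-directml` are imported as `onnxruntime` and `onnxruntime_genai`, and that a usable ONNX Runtime GenAI model folder is one containing a `genai_config.json` file. The helper names `missing_packages` and `find_model_dirs` are hypothetical.

```python
import importlib.util
from pathlib import Path


def missing_packages(modules=("numpy", "onnxruntime", "onnxruntime_genai")):
    """Return the import names in `modules` that cannot be found.

    Assumption: these are the import names of the pip packages
    installed in the setup steps.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


def find_model_dirs(root):
    """Return every directory under `root` that contains genai_config.json.

    Assumption: a folder holding genai_config.json is a model folder
    that can be passed to the example script with -m.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(path.parent for path in root.rglob("genai_config.json"))


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    # The download directory name matches the --local-dir used above.
    for model_dir in find_model_dirs("mistral-7b-instruct-v0.3"):
        print("Model folder:", model_dir)
```

Because the download command filters on `directml/*`, the files may land in a subfolder of the local directory rather than at its top level. If the script reports a nested model folder, that nested path is likely the one to pass to `-m`.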
+
+ ### Hardware Requirements
+
+ - **Minimum Configuration:**
+   - **Windows:** DirectX 12-capable GPU (AMD/NVIDIA)
+   - **CPU:** x86_64 / ARM64
+
+ - **Tested Configurations:**
+   - **GPU:** AMD Ryzen 8000 Series iGPU (DirectML)
+   - **CPU:** AMD Ryzen CPU
+
+ ## Optimized Configurations
+
+ The following optimized configurations are available:
+
+ 1. **ONNX model for int4 DML:** Optimized for AMD, Intel, and NVIDIA GPUs on Windows, quantized to int4.
+ 2. **ONNX model for int4 CPU:** Optimized for CPU, quantized to int4.
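
To give a rough sense of what int4 quantization means for memory, the sketch below estimates the size of the weights alone at two bit widths. It is a back-of-the-envelope illustration, not a measurement of these model files. The parameter count is taken as a nominal 7 billion from the model name, fp16 is an assumed baseline for comparison, and the estimate ignores quantization scales and zero points, any layers kept at higher precision, the KV cache, and activations.

```python
def weight_bytes(num_params, bits_per_weight):
    """Approximate storage for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


# Assumption: nominal "7B" parameter count taken from the model name.
NUM_PARAMS = 7_000_000_000

if __name__ == "__main__":
    # fp16 is an assumed baseline for comparison; int4 is the format listed above.
    for label, bits in (("fp16", 16), ("int4", 4)):
        size = to_gib(weight_bytes(NUM_PARAMS, bits))
        print(f"{label}: about {size:.1f} GiB of weights")
```

Under these assumptions, int4 weights take roughly a quarter of the space of fp16 weights, which is why the quantized model is a better fit for integrated GPUs with limited memory.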