zamroni111/Meta-Llama-3.1-8B-Instruct-ONNX-DirectML-GenAI-INT4

Text Generation


zamroni111 committed on Sep 23, 2024

Commit 65ba526 · verified · 1 Parent(s): b01f270

Update README.md

Files changed (1)

README.md +1 -2


@@ -67,14 +67,13 @@ c. Rename the original model.onnx to other file name and put and rename the opti
 d. Rerun step 4.
 #### Speeds, Sizes, Times [optional]
-15 token/s in Radeon 780M with 8GB dedicated RAM.<br>
+15 token/s in Radeon 780M with 8GB pre-allocated RAM.<br>
 Increase to 16 token/s with device specific optimized model.onnx.<br>
 As comparison, LM Studio using GGUF INT4 model and VulkanML GPU acceleration runs at 13 token/s.
 #### Hardware
 AMD Ryzen Zen4 7840U with integrated Radeon 780M GPU<br>
 RAM 32GB<br>
-8GB pre-allocated iGPU VRAM
 #### Software
 Microsoft DirectML on Windows 10
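
The throughput figures quoted in the README can be put side by side with a little arithmetic. The following is a minimal sketch, assuming the three token/s values are directly comparable (same prompt and output length, which the README does not state). The function and variable names are illustrative only and do not come from the repository.

```python
def speedup_percent(new_tps: float, base_tps: float) -> float:
    """Relative throughput gain of new_tps over base_tps, in percent."""
    return (new_tps / base_tps - 1.0) * 100.0


# Token/s figures as quoted in the README (Radeon 780M, 8GB pre-allocated RAM).
directml_tps = 15.0            # DirectML INT4 model
directml_optimized_tps = 16.0  # with device specific optimized model.onnx
lm_studio_tps = 13.0           # LM Studio, GGUF INT4

# Gain from the device specific optimization alone.
optimization_gain = speedup_percent(directml_optimized_tps, directml_tps)

# Gain of each DirectML variant over the LM Studio baseline.
gain_vs_lm_studio = speedup_percent(directml_tps, lm_studio_tps)
optimized_gain_vs_lm_studio = speedup_percent(directml_optimized_tps, lm_studio_tps)

print(f"optimized vs. unoptimized: {optimization_gain:.1f}%")
print(f"DirectML vs. LM Studio: {gain_vs_lm_studio:.1f}%")
print(f"optimized DirectML vs. LM Studio: {optimized_gain_vs_lm_studio:.1f}%")
```

With single-run figures this close together, differences of one token/s may be within run-to-run variation, so the percentages are best read as rough indications.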