shreyajn committed on
Commit
0d59273
·
verified ·
1 Parent(s): 64a5a71

Upload README.md with huggingface_hub

  1. README.md +77 -78
README.md CHANGED
@@ -38,47 +38,46 @@ More details on model performance across various devices can be found
38
 
39
  | Model | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model |
40
  |---|---|---|---|---|---|---|---|---|
41
- | WhisperEncoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | TFLITE | 722.7 ms | 69 - 449 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
42
- | WhisperEncoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | QNN | 820.248 ms | 0 - 209 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.so) |
43
- | WhisperEncoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | TFLITE | 518.95 ms | 111 - 201 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
44
- | WhisperEncoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | ONNX | 778.511 ms | 113 - 3977 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.onnx) |
45
- | WhisperEncoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 516.678 ms | 0 - 906 MB | FP16 | NPU | Use Export Script |
46
- | WhisperEncoder | QCS8550 (Proxy) | QCS8550 Proxy | TFLITE | 696.316 ms | 85 - 465 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
47
- | WhisperEncoder | QCS8550 (Proxy) | QCS8550 Proxy | QNN | 644.566 ms | 1 - 2 MB | FP16 | NPU | Use Export Script |
48
- | WhisperEncoder | SA7255P ADP | SA7255P | TFLITE | 4426.504 ms | 108 - 142 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
49
- | WhisperEncoder | SA7255P ADP | SA7255P | QNN | 3210.318 ms | 1 - 8 MB | FP16 | NPU | Use Export Script |
50
- | WhisperEncoder | SA8255 (Proxy) | SA8255P Proxy | TFLITE | 705.299 ms | 26 - 406 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
51
- | WhisperEncoder | SA8255 (Proxy) | SA8255P Proxy | QNN | 638.347 ms | 1 - 2 MB | FP16 | NPU | Use Export Script |
52
- | WhisperEncoder | SA8295P ADP | SA8295P | QNN | 700.683 ms | 3 - 9 MB | FP16 | NPU | Use Export Script |
53
- | WhisperEncoder | SA8650 (Proxy) | SA8650P Proxy | TFLITE | 709.764 ms | 78 - 445 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
54
- | WhisperEncoder | SA8650 (Proxy) | SA8650P Proxy | QNN | 678.71 ms | 1 - 3 MB | FP16 | NPU | Use Export Script |
55
- | WhisperEncoder | SA8775P ADP | SA8775P | TFLITE | 1293.65 ms | 108 - 140 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
56
- | WhisperEncoder | SA8775P ADP | SA8775P | QNN | 603.983 ms | 1 - 6 MB | FP16 | NPU | Use Export Script |
57
- | WhisperEncoder | QCS8450 (Proxy) | QCS8450 Proxy | TFLITE | 969.375 ms | 110 - 205 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
58
- | WhisperEncoder | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 504.049 ms | 0 - 0 MB | FP16 | NPU | Use Export Script |
59
- | WhisperEncoder | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 1342.641 ms | 237 - 237 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.onnx) |
60
- | WhisperDecoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | TFLITE | 28.657 ms | 16 - 100 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
61
- | WhisperDecoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | QNN | 11.929 ms | 61 - 141 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.so) |
62
- | WhisperDecoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | ONNX | 58.778 ms | 120 - 123 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.onnx) |
63
- | WhisperDecoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | TFLITE | 23.885 ms | 16 - 148 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
64
- | WhisperDecoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | QNN | 9.45 ms | 446 - 552 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.so) |
65
- | WhisperDecoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | ONNX | 47.995 ms | 85 - 1135 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.onnx) |
66
- | WhisperDecoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | TFLITE | 16.628 ms | 16 - 263 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
67
- | WhisperDecoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 8.06 ms | 53 - 188 MB | FP16 | NPU | Use Export Script |
68
- | WhisperDecoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | ONNX | 44.088 ms | 69 - 697 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.onnx) |
69
- | WhisperDecoder | QCS8550 (Proxy) | QCS8550 Proxy | TFLITE | 28.65 ms | 16 - 101 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
70
- | WhisperDecoder | QCS8550 (Proxy) | QCS8550 Proxy | QNN | 11.92 ms | 61 - 71 MB | FP16 | NPU | Use Export Script |
71
- | WhisperDecoder | SA7255P ADP | SA7255P | QNN | 74.962 ms | 56 - 64 MB | FP16 | NPU | Use Export Script |
72
- | WhisperDecoder | SA8255 (Proxy) | SA8255P Proxy | TFLITE | 29.533 ms | 16 - 99 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
73
- | WhisperDecoder | SA8255 (Proxy) | SA8255P Proxy | QNN | 12.125 ms | 57 - 62 MB | FP16 | NPU | Use Export Script |
74
- | WhisperDecoder | SA8295P ADP | SA8295P | TFLITE | 30.807 ms | 16 - 162 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
75
- | WhisperDecoder | SA8295P ADP | SA8295P | QNN | 14.596 ms | 57 - 62 MB | FP16 | NPU | Use Export Script |
76
- | WhisperDecoder | SA8650 (Proxy) | SA8650P Proxy | TFLITE | 29.43 ms | 16 - 99 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
77
- | WhisperDecoder | SA8650 (Proxy) | SA8650P Proxy | QNN | 12.052 ms | 65 - 66 MB | FP16 | NPU | Use Export Script |
78
- | WhisperDecoder | SA8775P ADP | SA8775P | TFLITE | 33.02 ms | 16 - 174 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
79
- | WhisperDecoder | QCS8450 (Proxy) | QCS8450 Proxy | TFLITE | 34.145 ms | 16 - 139 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
80
- | WhisperDecoder | QCS8450 (Proxy) | QCS8450 Proxy | QNN | 15.967 ms | 57 - 173 MB | FP16 | NPU | Use Export Script |
81
- | WhisperDecoder | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 52.917 ms | 232 - 232 MB | FP16 | NPU | [Whisper-Small-En.onnx](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.onnx) |
82
 
83
 
84
 
@@ -139,23 +138,23 @@ python -m qai_hub_models.models.whisper_small_en.export
139
  ```
140
  ```
141
  Profiling Results
142
- ------------------------------------------------------------
143
- WhisperEncoder
144
- Device : Samsung Galaxy S23 (13)
145
- Runtime : TFLITE
146
- Estimated inference time (ms) : 722.7
147
- Estimated peak memory usage (MB): [69, 449]
148
- Total # Ops : 911
149
- Compute Unit(s) : GPU (900 ops) CPU (11 ops)
150
-
151
  ------------------------------------------------------------
152
  WhisperDecoder
153
  Device : Samsung Galaxy S23 (13)
154
  Runtime : TFLITE
155
- Estimated inference time (ms) : 28.7
156
- Estimated peak memory usage (MB): [16, 100]
157
  Total # Ops : 2573
158
  Compute Unit(s) : NPU (2573 ops)
 
 
 
 
 
 
 
 
 
159
  ```
160
 
161
 
@@ -178,42 +177,42 @@ from qai_hub_models.models.whisper_small_en import Model
178
 
179
  # Load the model
180
  model = Model.from_pretrained()
181
- encoder_model = model.encoder
182
  decoder_model = model.decoder
 
183
 
184
  # Device
185
  device = hub.Device("Samsung Galaxy S23")
186
 
187
  # Trace model
188
- encoder_input_shape = encoder_model.get_input_spec()
189
- encoder_sample_inputs = encoder_model.sample_inputs()
190
 
191
- traced_encoder_model = torch.jit.trace(encoder_model, [torch.tensor(data[0]) for _, data in encoder_sample_inputs.items()])
192
 
193
  # Compile model on a specific device
194
- encoder_compile_job = hub.submit_compile_job(
195
- model=traced_encoder_model ,
196
  device=device,
197
- input_specs=encoder_model.get_input_spec(),
198
  )
199
 
200
  # Get target model to run on-device
201
- encoder_target_model = encoder_compile_job.get_target_model()
202
  # Trace model
203
- decoder_input_shape = decoder_model.get_input_spec()
204
- decoder_sample_inputs = decoder_model.sample_inputs()
205
 
206
- traced_decoder_model = torch.jit.trace(decoder_model, [torch.tensor(data[0]) for _, data in decoder_sample_inputs.items()])
207
 
208
  # Compile model on a specific device
209
- decoder_compile_job = hub.submit_compile_job(
210
- model=traced_decoder_model ,
211
  device=device,
212
- input_specs=decoder_model.get_input_spec(),
213
  )
214
 
215
  # Get target model to run on-device
216
- decoder_target_model = decoder_compile_job.get_target_model()
217
 
218
  ```
219
 
@@ -225,14 +224,14 @@ After compiling models from step 1, the models can be profiled on-device using
225
  provisioned in the cloud. Once the job is submitted, you can navigate to a
226
  provided job URL to view a variety of on-device performance metrics.
227
  ```python
228
- encoder_profile_job = hub.submit_profile_job(
229
- model=encoder_target_model,
230
- device=device,
231
- )
232
  decoder_profile_job = hub.submit_profile_job(
233
  model=decoder_target_model,
234
  device=device,
235
  )
 
 
 
 
236
 
237
  ```
238
 
@@ -241,13 +240,6 @@ Step 3: **Verify on-device accuracy**
241
  To verify the accuracy of the model on-device, you can run on-device inference
242
  on sample input data on the same cloud-hosted device.
243
  ```python
244
- encoder_input_data = encoder_model.sample_inputs()
245
- encoder_inference_job = hub.submit_inference_job(
246
- model=encoder_target_model,
247
- device=device,
248
- inputs=encoder_input_data,
249
- )
250
- encoder_inference_job.download_output_data()
251
  decoder_input_data = decoder_model.sample_inputs()
252
  decoder_inference_job = hub.submit_inference_job(
253
  model=decoder_target_model,
@@ -255,6 +247,13 @@ decoder_inference_job = hub.submit_inference_job(
255
  inputs=decoder_input_data,
256
  )
257
  decoder_inference_job.download_output_data()
 
 
 
 
 
 
 
258
 
259
  ```
260
  With the output of the model, you can compute metrics like PSNR, relative errors or
 
38
 
39
  | Model | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model |
40
  |---|---|---|---|---|---|---|---|---|
41
+ | WhisperDecoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | TFLITE | 29.81 ms | 11 - 96 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
42
+ | WhisperDecoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | QNN | 11.818 ms | 59 - 139 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.so) |
43
+ | WhisperDecoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | TFLITE | 23.422 ms | 16 - 148 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
44
+ | WhisperDecoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | QNN | 9.293 ms | 61 - 168 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.so) |
45
+ | WhisperDecoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | TFLITE | 19.441 ms | 15 - 176 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
46
+ | WhisperDecoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 7.329 ms | 47 - 181 MB | FP16 | NPU | Use Export Script |
47
+ | WhisperDecoder | QCS8550 (Proxy) | QCS8550 Proxy | TFLITE | 28.585 ms | 16 - 98 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
48
+ | WhisperDecoder | QCS8550 (Proxy) | QCS8550 Proxy | QNN | 12.218 ms | 39 - 40 MB | FP16 | NPU | Use Export Script |
49
+ | WhisperDecoder | SA7255P ADP | SA7255P | QNN | 75.01 ms | 53 - 64 MB | FP16 | NPU | Use Export Script |
50
+ | WhisperDecoder | SA8255 (Proxy) | SA8255P Proxy | TFLITE | 29.634 ms | 16 - 96 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
51
+ | WhisperDecoder | SA8255 (Proxy) | SA8255P Proxy | QNN | 11.938 ms | 61 - 62 MB | FP16 | NPU | Use Export Script |
52
+ | WhisperDecoder | SA8295P ADP | SA8295P | TFLITE | 31.029 ms | 16 - 163 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
53
+ | WhisperDecoder | SA8295P ADP | SA8295P | QNN | 14.525 ms | 57 - 63 MB | FP16 | NPU | Use Export Script |
54
+ | WhisperDecoder | SA8650 (Proxy) | SA8650P Proxy | TFLITE | 28.772 ms | 16 - 98 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
55
+ | WhisperDecoder | SA8650 (Proxy) | SA8650P Proxy | QNN | 12.051 ms | 61 - 62 MB | FP16 | NPU | Use Export Script |
56
+ | WhisperDecoder | SA8775P ADP | SA8775P | TFLITE | 33.281 ms | 16 - 175 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
57
+ | WhisperDecoder | SA8775P ADP | SA8775P | QNN | 14.774 ms | 51 - 61 MB | FP16 | NPU | Use Export Script |
58
+ | WhisperDecoder | QCS8450 (Proxy) | QCS8450 Proxy | TFLITE | 33.302 ms | 16 - 141 MB | FP16 | NPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperDecoder.tflite) |
59
+ | WhisperDecoder | QCS8450 (Proxy) | QCS8450 Proxy | QNN | 16.624 ms | 53 - 171 MB | FP16 | NPU | Use Export Script |
60
+ | WhisperDecoder | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 10.781 ms | 61 - 61 MB | FP16 | NPU | Use Export Script |
61
+ | WhisperEncoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | TFLITE | 700.75 ms | 67 - 440 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
62
+ | WhisperEncoder | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 | QNN | 823.812 ms | 0 - 209 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.so) |
63
+ | WhisperEncoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | TFLITE | 532.216 ms | 0 - 87 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
64
+ | WhisperEncoder | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 | QNN | 586.956 ms | 0 - 839 MB | FP16 | NPU | [Whisper-Small-En.so](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.so) |
65
+ | WhisperEncoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | TFLITE | 552.324 ms | 111 - 139 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
66
+ | WhisperEncoder | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 513.813 ms | 0 - 906 MB | FP16 | NPU | Use Export Script |
67
+ | WhisperEncoder | QCS8550 (Proxy) | QCS8550 Proxy | TFLITE | 677.595 ms | 92 - 467 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
68
+ | WhisperEncoder | QCS8550 (Proxy) | QCS8550 Proxy | QNN | 671.038 ms | 1 - 2 MB | FP16 | NPU | Use Export Script |
69
+ | WhisperEncoder | SA7255P ADP | SA7255P | TFLITE | 4432.265 ms | 108 - 142 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
70
+ | WhisperEncoder | SA7255P ADP | SA7255P | QNN | 3212.772 ms | 0 - 10 MB | FP16 | NPU | Use Export Script |
71
+ | WhisperEncoder | SA8255 (Proxy) | SA8255P Proxy | TFLITE | 714.664 ms | 42 - 434 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
72
+ | WhisperEncoder | SA8255 (Proxy) | SA8255P Proxy | QNN | 682.786 ms | 1 - 3 MB | FP16 | NPU | Use Export Script |
73
+ | WhisperEncoder | SA8295P ADP | SA8295P | TFLITE | 657.129 ms | 110 - 142 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
74
+ | WhisperEncoder | SA8295P ADP | SA8295P | QNN | 701.786 ms | 0 - 6 MB | FP16 | NPU | Use Export Script |
75
+ | WhisperEncoder | SA8650 (Proxy) | SA8650P Proxy | TFLITE | 694.432 ms | 110 - 448 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
76
+ | WhisperEncoder | SA8650 (Proxy) | SA8650P Proxy | QNN | 684.938 ms | 1 - 3 MB | FP16 | NPU | Use Export Script |
77
+ | WhisperEncoder | SA8775P ADP | SA8775P | TFLITE | 1290.958 ms | 100 - 132 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
78
+ | WhisperEncoder | SA8775P ADP | SA8775P | QNN | 604.972 ms | 0 - 10 MB | FP16 | NPU | Use Export Script |
79
+ | WhisperEncoder | QCS8450 (Proxy) | QCS8450 Proxy | TFLITE | 973.335 ms | 109 - 207 MB | FP16 | GPU | [Whisper-Small-En.tflite](https://huggingface.co/qualcomm/Whisper-Small-En/blob/main/WhisperEncoder.tflite) |
80
+ | WhisperEncoder | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 504.643 ms | 0 - 0 MB | FP16 | NPU | Use Export Script |
 
81
 
82
 
83
 
 
138
  ```
139
  ```
140
  Profiling Results
 
 
 
 
 
 
 
 
 
141
  ------------------------------------------------------------
142
  WhisperDecoder
143
  Device : Samsung Galaxy S23 (13)
144
  Runtime : TFLITE
145
+ Estimated inference time (ms) : 29.8
146
+ Estimated peak memory usage (MB): [11, 96]
147
  Total # Ops : 2573
148
  Compute Unit(s) : NPU (2573 ops)
149
+
150
+ ------------------------------------------------------------
151
+ WhisperEncoder
152
+ Device : Samsung Galaxy S23 (13)
153
+ Runtime : TFLITE
154
+ Estimated inference time (ms) : 700.8
155
+ Estimated peak memory usage (MB): [67, 440]
156
+ Total # Ops : 911
157
+ Compute Unit(s) : GPU (900 ops) CPU (11 ops)
158
  ```
159
 
160
 
 
177
 
178
  # Load the model
179
  model = Model.from_pretrained()
 
180
  decoder_model = model.decoder
181
+ encoder_model = model.encoder
182
 
183
  # Device
184
  device = hub.Device("Samsung Galaxy S23")
185
 
186
  # Trace model
187
+ decoder_input_shape = decoder_model.get_input_spec()
188
+ decoder_sample_inputs = decoder_model.sample_inputs()
189
 
190
+ traced_decoder_model = torch.jit.trace(decoder_model, [torch.tensor(data[0]) for _, data in decoder_sample_inputs.items()])
191
 
192
  # Compile model on a specific device
193
+ decoder_compile_job = hub.submit_compile_job(
194
+ model=traced_decoder_model ,
195
  device=device,
196
+ input_specs=decoder_model.get_input_spec(),
197
  )
198
 
199
  # Get target model to run on-device
200
+ decoder_target_model = decoder_compile_job.get_target_model()
201
  # Trace model
202
+ encoder_input_shape = encoder_model.get_input_spec()
203
+ encoder_sample_inputs = encoder_model.sample_inputs()
204
 
205
+ traced_encoder_model = torch.jit.trace(encoder_model, [torch.tensor(data[0]) for _, data in encoder_sample_inputs.items()])
206
 
207
  # Compile model on a specific device
208
+ encoder_compile_job = hub.submit_compile_job(
209
+ model=traced_encoder_model ,
210
  device=device,
211
+ input_specs=encoder_model.get_input_spec(),
212
  )
213
 
214
  # Get target model to run on-device
215
+ encoder_target_model = encoder_compile_job.get_target_model()
216
 
217
  ```
218
 
 
224
  provisioned in the cloud. Once the job is submitted, you can navigate to a
225
  provided job URL to view a variety of on-device performance metrics.
226
  ```python
 
 
 
 
227
  decoder_profile_job = hub.submit_profile_job(
228
  model=decoder_target_model,
229
  device=device,
230
  )
231
+ encoder_profile_job = hub.submit_profile_job(
232
+ model=encoder_target_model,
233
+ device=device,
234
+ )
235
 
236
  ```
237
 
 
240
  To verify the accuracy of the model on-device, you can run on-device inference
241
  on sample input data on the same cloud-hosted device.
242
  ```python
 
 
 
 
 
 
 
243
  decoder_input_data = decoder_model.sample_inputs()
244
  decoder_inference_job = hub.submit_inference_job(
245
  model=decoder_target_model,
 
247
  inputs=decoder_input_data,
248
  )
249
  decoder_inference_job.download_output_data()
250
+ encoder_input_data = encoder_model.sample_inputs()
251
+ encoder_inference_job = hub.submit_inference_job(
252
+ model=encoder_target_model,
253
+ device=device,
254
+ inputs=encoder_input_data,
255
+ )
256
+ encoder_inference_job.download_output_data()
257
 
258
  ```
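
As a hedged illustration of the accuracy check in Step 3, the sketch below compares an on-device output against a reference output using PSNR and maximum relative error. The helper names `psnr` and `max_relative_error`, the flat-sequence inputs, and the choice of the largest reference magnitude as the PSNR peak are assumptions made for illustration; they are not part of `qai_hub` or this repository.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Peak signal-to-noise ratio (dB) between two flat sequences of floats."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between reference and candidate values.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0.0:
        return math.inf  # outputs are identical
    # Assumption: use the largest reference magnitude as the signal peak.
    peak = max(abs(r) for r in reference)
    if peak == 0.0:
        return -math.inf  # all-zero reference with a non-zero error
    return 10.0 * math.log10(peak ** 2 / mse)


def max_relative_error(
    reference: Sequence[float], candidate: Sequence[float], eps: float = 1e-12
) -> float:
    """Largest element-wise |r - c| / (|r| + eps)."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return max(abs(r - c) / (abs(r) + eps) for r, c in zip(reference, candidate))


# Toy values standing in for a flattened reference and on-device output.
reference_output = [0.0, 1.0]
device_output = [0.0, 0.9]
print(f"PSNR: {psnr(reference_output, device_output):.2f} dB")
print(f"Max relative error: {max_relative_error(reference_output, device_output):.4f}")
```

In practice the data returned by `download_output_data()` would need to be flattened into a sequence of floats first; its exact layout is not shown here, so inspect it before reusing these helpers.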
259
  With the output of the model, you can compute like PSNR, relative errors or