Profile Job Results
jpew8v78p
Results Ready
Name
whisper_large_v3_turbo_HfWhisperDecoder
Target Device
- Snapdragon 8 Elite QRD
- Android 15
- Snapdragon® 8 Elite | SM8750
Creator
ai-hub-support@qti.qualcomm.com
Target Model
Input Specs
- input_ids: int32[1, 1]
- attention_mask: float16[1, 1, 1, 200]
- k_cache_self_0_in: float16[20, 1, 64, 199]
- v_cache_self_0_in: float16[20, 1, 199, 64]
- k_cache_self_1_in: float16[20, 1, 64, 199]
- v_cache_self_1_in: float16[20, 1, 199, 64]
- k_cache_self_2_in: float16[20, 1, 64, 199]
- v_cache_self_2_in: float16[20, 1, 199, 64]
- k_cache_self_3_in: float16[20, 1, 64, 199]
- v_cache_self_3_in: float16[20, 1, 199, 64]
- k_cache_cross_0: float16[20, 1, 64, 1500]
- v_cache_cross_0: float16[20, 1, 1500, 64]
- k_cache_cross_1: float16[20, 1, 64, 1500]
- v_cache_cross_1: float16[20, 1, 1500, 64]
- k_cache_cross_2: float16[20, 1, 64, 1500]
- v_cache_cross_2: float16[20, 1, 1500, 64]
- k_cache_cross_3: float16[20, 1, 64, 1500]
- v_cache_cross_3: float16[20, 1, 1500, 64]
- position_ids: int32[1]
Completion Time
8/9/2025, 3:22:13 PM
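To get a feel for how much data the decoder takes per call, the input specs listed above can be tallied in a few lines of Python. This is a minimal sketch, not part of the report: the helper names (`build_input_specs`, `tensor_bytes`) are hypothetical, and the byte sizes assume the usual 4 bytes for int32 and 2 bytes for float16.

```python
from math import prod

# Assumed element sizes: int32 = 4 bytes, float16 = 2 bytes.
BYTES_PER_ELEMENT = {"int32": 4, "float16": 2}


def build_input_specs(num_layers=4):
    """Rebuild the input specs from the report as {name: (dtype, shape)}."""
    specs = {
        "input_ids": ("int32", (1, 1)),
        "attention_mask": ("float16", (1, 1, 1, 200)),
    }
    # Self-attention KV cache: one key and one value tensor per decoder layer.
    for i in range(num_layers):
        specs[f"k_cache_self_{i}_in"] = ("float16", (20, 1, 64, 199))
        specs[f"v_cache_self_{i}_in"] = ("float16", (20, 1, 199, 64))
    # Cross-attention KV cache: one key and one value tensor per decoder layer.
    for i in range(num_layers):
        specs[f"k_cache_cross_{i}"] = ("float16", (20, 1, 64, 1500))
        specs[f"v_cache_cross_{i}"] = ("float16", (20, 1, 1500, 64))
    specs["position_ids"] = ("int32", (1,))
    return specs


def tensor_bytes(dtype, shape):
    """Size of one dense tensor in bytes."""
    return BYTES_PER_ELEMENT[dtype] * prod(shape)


specs = build_input_specs()
total_bytes = sum(tensor_bytes(dtype, shape) for dtype, shape in specs.values())

print(f"{len(specs)} inputs, {total_bytes} bytes ({total_bytes / 2**20:.2f} MiB)")
```

Under these assumptions the 19 inputs add up to 34,795,928 bytes, about 33.2 MiB, with the cross-attention cache accounting for most of it. That is in the same range as the 33‑47 MB peak memory reported for inference, though the report does not say how that figure is measured.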
Versions
- QAIRT: v2.34.2.250528164111_119506
- QNN Backend API: 5.34.0
- QNN Core API: 2.25.0
- Android: 15 (AQ3A.241126.002)
- Build ID: Pakala.LA.1.0.r1-01152-STD.PROD-1
- AI Hub: aihub-2025.07.25.0
Estimated Inference Time
6.99 ms
Estimated Peak Memory Usage
33 ‑ 47 MB
Compute Units
- NPU: 1222
Stage | Time | Memory |
---|---|---|
First App Load | 485 ms | 1‑7 MB |
Subsequent App Load | 523 ms | 1‑8 MB |
Inference | 6.99 ms | 33‑47 MB |
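The inference time in the table is for a single decoder call. Since `input_ids` has shape [1, 1], each call appears to process one token, so a rough decode rate follows directly from the 6.99 ms figure. The sketch below is back-of-the-envelope arithmetic only: it assumes one token per call and a 200-token limit taken from the attention mask width, and it ignores the encoder, host-side overhead, and app load time.

```python
# Values taken from the report.
INFERENCE_MS = 6.99   # estimated time for one decoder call
MAX_POSITIONS = 200   # last dimension of attention_mask

# Assumption: one decoder call produces one token.
steps_per_second = 1000.0 / INFERENCE_MS

# Time to run the decoder for every position, one call per position.
full_sequence_ms = MAX_POSITIONS * INFERENCE_MS

print(f"{steps_per_second:.1f} decoder steps per second")
print(f"{full_sequence_ms:.0f} ms for {MAX_POSITIONS} steps")
```

Under these assumptions the decoder alone would sustain roughly 143 steps per second, and a full 200-step sequence would take about 1.4 seconds of NPU time.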
QNN | Value |
---|---|
context_options.htp_options.performance_mode | BURST |
default_graph_options.htp_options.optimizations[0].type | FINALIZE_OPTIMIZATION_FLAG |
default_graph_options.htp_options.optimizations[0].value | 3.0 |
default_graph_options.htp_options.precision | FLOAT16 |
default_graph_options.htp_options.vtcm_size | 0 |