Qualcomm® AI HubAI Hub
Developer Preview

GenieX

Run Any Gen AI Model On Device

NPU
GPU
CPU
Phone
PC
IoT
Car
Star on GitHub8.1K

What is GenieX

An on-device Gen AI inference runtime built for Qualcomm platforms.

Run frontier language and vision-language models locally on Hexagon NPU, Adreno GPU, or CPU with a few lines of code, via the path best for your deployment:

llama.cpp plugin

Broad Coverage

Run community GGUF models directly from Hugging Face via llama.cpp.

QAIRT plugin

Optimal NPU Performance

Run models optimized for NPU via the Qualcomm® AI Runtime (QAIRT).

GenieX CLI
GenieX Python API
GenieX Java API
GenieX Docker
GenieX Server (OpenAI / OpenClaw)
GenieX SDK
GenieX - llama.cpp Plugin
GenieX - QAIRT Plugin
GGML Runtime
GGML CPU Kernels
GGML GPU Kernels
GGML HTP Kernels
Qualcomm® AI Runtime (QAIRT) SDK
CPU
GPU
NPU
Explore Platforms & Runtimes

Shipping AI-Powered Experiences? Talk to Us!