LLM Chat for Windows
Deploy and run LLMs via CLI
This sample CLI app demonstrates how to integrate an on‑device LLM using the Qualcomm® Gen AI Inference Extensions (Genie) APIs, with inference accelerated on the Snapdragon® NPU. Getting started takes three steps:
1. Choose a compatible model
2. Get the source code and dependencies
3. Build and run the app (a minimal sketch of the core API flow follows this list)
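
To give a sense of what the app does under the hood, here is a minimal sketch of the core Genie call sequence: load a configuration JSON, create a dialog, and stream a query's response. The function and enum names (GenieDialogConfig_createFromJson, GenieDialog_create, GenieDialog_query, GENIE_DIALOG_SENTENCE_COMPLETE) follow the GenieDialog.h header shipped with recent Qualcomm AI Runtime SDK releases; this is a sketch under that assumption, so verify the signatures against the header in your SDK before building.

    #include <cstdio>
    #include <fstream>
    #include <sstream>
    #include <string>

    #include "GenieDialog.h"  // ships with the Qualcomm AI Runtime SDK

    // Streams partial responses to stdout as Genie generates them.
    static void OnResponse(const char* response,
                           const GenieDialog_SentenceCode_t code,
                           const void* /*userData*/) {
        if (response != nullptr) {
            std::fputs(response, stdout);
            std::fflush(stdout);
        }
        if (code == GENIE_DIALOG_SENTENCE_END) {
            std::fputc('\n', stdout);
        }
    }

    int main(int argc, char** argv) {
        if (argc < 3) {
            std::fprintf(stderr, "usage: %s <genie_config.json> <prompt>\n", argv[0]);
            return 1;
        }

        // Read the Genie configuration JSON (model binaries, tokenizer,
        // sampler settings) prepared for your chosen model.
        std::ifstream in(argv[1]);
        std::stringstream ss;
        ss << in.rdbuf();
        const std::string config = ss.str();

        GenieDialogConfig_Handle_t configHandle = nullptr;
        GenieDialog_Handle_t dialog = nullptr;
        if (GenieDialogConfig_createFromJson(config.c_str(), &configHandle) !=
                GENIE_STATUS_SUCCESS ||
            GenieDialog_create(configHandle, &dialog) != GENIE_STATUS_SUCCESS) {
            std::fprintf(stderr, "failed to create Genie dialog\n");
            return 1;
        }

        // Submit one complete prompt; OnResponse receives the generated text.
        GenieDialog_query(dialog, argv[2], GENIE_DIALOG_SENTENCE_COMPLETE,
                          OnResponse, nullptr);

        GenieDialog_free(dialog);
        GenieDialogConfig_free(configHandle);
        return 0;
    }

Building this requires linking against the Genie library from the SDK and placing its runtime DLLs alongside the executable; the sample app's build scripts handle these details.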
Compatible Models
The app works with LLMs exported for the Genie runtime; see the model list on the app's Qualcomm AI Hub page.
App Information
Operating System: Windows 11+
Language: C++
Runtime: Qualcomm® Gen AI Inference Extensions (Genie)
Use Case: Text Generation