ControlNet

Generates visual art from a text prompt and a guiding input image.

On‑device, high‑resolution image synthesis from text and image prompts. ControlNet guides Stable Diffusion with a provided input image so that the generated image follows both the structure of that image and the text prompt.
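The guiding image is usually reduced to a conditioning map before it steers generation, and this card lists Canny edges as that conditioning input. The sketch below illustrates the idea of turning a grayscale image into a binary edge map. It is a simplified stand-in, not Canny itself: it thresholds the Sobel gradient magnitude and omits Gaussian smoothing, non-maximum suppression, and hysteresis. The function name `sobel_edge_map`, the `threshold` default, and the sample image are assumptions made for illustration and are not part of the model's pipeline.

```python
from typing import List

Image = List[List[float]]  # grayscale image, row-major, values in [0, 255]


def sobel_edge_map(image: Image, threshold: float = 100.0) -> List[List[int]]:
    """Return a binary edge map (255 = edge, 0 = background).

    Simplified stand-in for Canny: Sobel gradient magnitude plus a single
    threshold. Border pixels stay 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# Hypothetical 6x6 test image: dark left half, bright right half.
sample = [[0.0] * 3 + [255.0] * 3 for _ in range(6)]
edge_map = sobel_edge_map(sample)
```

For the sample image, the interior rows of `edge_map` mark the two columns on either side of the dark-to-bright boundary. A production pipeline would more likely call an established Canny implementation than this hand-rolled filter.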


Technical Details

Input: Text prompt and input image as a reference

Conditioning Input: Canny-Edge

Text Encoder Number of parameters: 340M

UNet Number of parameters: 865M

VAE Decoder Number of parameters: 83M

ControlNet Number of parameters: 361M

Model size: 1.4GB
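As a quick sanity check on the figures above, the per-component parameter counts can be summed and compared with the listed model size. The sketch assumes 1 GB means 10^9 bytes and that the listed size covers all four components. Neither assumption is stated on this card, so the resulting bytes-per-parameter figure is only a rough indication.

```python
# Parameter counts in millions, copied from the technical details above.
PARAMS_M = {
    "text_encoder": 340,
    "unet": 865,
    "vae_decoder": 83,
    "controlnet": 361,
}
MODEL_SIZE_GB = 1.4  # listed model size

# Total parameters across all four components, in millions.
total_params_m = sum(PARAMS_M.values())

# Assumption: 1 GB = 1e9 bytes, and the size covers every component.
bytes_per_param = (MODEL_SIZE_GB * 1e9) / (total_params_m * 1e6)

print(f"total parameters: {total_params_m}M")
print(f"approx. bytes per parameter: {bytes_per_param:.2f}")
```

The total comes to 1649M parameters, and the ratio works out to a little under one byte per parameter. That would be consistent with weights stored at reduced precision, although the card does not state the precision used.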

Applicable Scenarios

Image Generation
Image Editing
Content Creation

Supported Form Factors

Phone
Tablet

Licenses

Source Model: APACHE-2.0

Deployable Model: APACHE-2.0

Supported Devices

QCS8550 (Proxy)
Samsung Galaxy S23
Samsung Galaxy S23 Ultra
Samsung Galaxy S23+
Samsung Galaxy S24
Samsung Galaxy S24 Ultra

Supported Chipsets

Qualcomm® QCS8550 (Proxy)
Snapdragon® 8 Gen 2 Mobile
Snapdragon® 8 Gen 3 Mobile

Related Models

Stable-Diffusion-v1.5

State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions.
