ControlNet
Generates visual art from a text prompt and a guiding input image.
On-device, high-resolution image synthesis from text and image prompts. ControlNet guides Stable Diffusion with the provided input image to generate accurate images from the given text prompt.
TorchScript → Qualcomm® AI Engine Direct
Inference Time: 8.08 ms
Memory Usage: 0-137 MB
Layers: 570 (NPU)
TorchScript → Qualcomm® AI Engine Direct
Inference Time: 193 ms
Memory Usage: 0-1 GB
Layers: 5,434 (NPU)
TorchScript → Qualcomm® AI Engine Direct
Inference Time: 294 ms
Memory Usage: 0-88 MB
Layers: 409 (NPU)
TorchScript → Qualcomm® AI Engine Direct
Inference Time: 76.9 ms
Memory Usage: 0-533 MB
Layers: 2,406 (NPU)
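The four inference times above are per-component figures, so the time to produce one image depends on how often each component runs. The sketch below is a rough estimate only, not a measured result. It assumes that the four blocks appear in the same order as the parameter counts under Technical Details (text encoder, UNet, VAE decoder, ControlNet), which this page does not state. It also assumes that the UNet and ControlNet each run once per denoising step, ignoring classifier-free guidance and scheduler overhead. The step count of 20 is hypothetical.

```python
def pipeline_latency_ms(once_ms, per_step_ms, num_steps):
    """Estimate the latency of generating one image.

    once_ms:     latencies of components that run once per image
    per_step_ms: latencies of components that run on every denoising step
    num_steps:   number of denoising steps
    """
    return sum(once_ms) + num_steps * sum(per_step_ms)


# ASSUMED mapping of the listed inference times to components.
# The page does not label the blocks; this order is a guess.
TEXT_ENCODER_MS = 8.08
UNET_MS = 193.0
VAE_DECODER_MS = 294.0
CONTROLNET_MS = 76.9

NUM_STEPS = 20  # hypothetical step count, not taken from the page

estimate_ms = pipeline_latency_ms(
    once_ms=[TEXT_ENCODER_MS, VAE_DECODER_MS],
    per_step_ms=[UNET_MS, CONTROLNET_MS],
    num_steps=NUM_STEPS,
)
print(f"Estimated latency for one image: {estimate_ms / 1000:.2f} s")
```

Under these assumptions the per-step components dominate the total, so the number of denoising steps is the main lever on end-to-end latency.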
Technical Details
Input: Text prompt and input image as a reference
Conditioning Input: Canny-Edge
QNN-SDK: 2.19
Text Encoder Number of parameters: 340M
UNet Number of parameters: 865M
VAE Decoder Number of parameters: 83M
ControlNet Number of parameters: 361M
Model size: 1.4GB
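As a quick sanity check on the figures above, the component parameter counts can be summed and compared with the listed model size. The sketch assumes that the 1.4GB figure covers all four components and that 1 GB means 10^9 bytes; neither is stated on this page.

```python
# Parameter counts as listed under Technical Details.
PARAMS = {
    "text_encoder": 340_000_000,
    "unet": 865_000_000,
    "vae_decoder": 83_000_000,
    "controlnet": 361_000_000,
}

MODEL_SIZE_BYTES = 1.4e9  # assumes 1 GB = 10**9 bytes

total_params = sum(PARAMS.values())
bytes_per_param = MODEL_SIZE_BYTES / total_params

print(f"Total parameters: {total_params / 1e6:.0f}M")
print(f"Average storage: {bytes_per_param:.2f} bytes per parameter")
```

An average below one byte per parameter is in line with the "quantized" tag, since unquantized 32-bit floating-point weights would need four bytes each.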
Applicable Scenarios
- Image Generation
- Image Editing
- Content Creation
Supported Form Factors
- Phone
- Tablet
Licenses
Source Model: APACHE-2.0
Deployable Model: APACHE-2.0
Terms of Use: Qualcomm® Generative AI usage and limitations
Tags
- generative-ai: Models capable of generating text, images, or other data using generative models, often in response to prompts.
- quantized: A “quantized” model can run in low or mixed precision, which can substantially reduce inference latency.
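To illustrate what running in low precision means, the sketch below applies generic symmetric 8-bit quantization to a few example weights. This page does not say which quantization scheme or bit width this model uses, so the function names, the int8 range, and the sample values are illustrative assumptions only.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 0.31]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as a single small integer plus one shared scale, which is the trade of a little accuracy for less memory traffic and cheaper arithmetic that the tag describes.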
Supported Devices
- Samsung Galaxy S23
- Samsung Galaxy S23 Ultra
- Samsung Galaxy S23+
- Samsung Galaxy S24
- Samsung Galaxy S24 Ultra
Supported Chipsets
- Snapdragon® 8 Gen 2 Mobile
- Snapdragon® 8 Gen 3 Mobile