Qualcomm® AI Hub

Stable-Diffusion-v2.1

State‑of‑the‑art generative AI model used to generate detailed images conditioned on text descriptions.

Generates high‑resolution images from text prompts using a latent diffusion model. The model uses an OpenCLIP ViT‑H/14 text encoder, a U‑Net for latent denoising, and a VAE decoder to produce the final image.
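
The three stages described above (text encoding, iterative latent denoising, decoding) can be sketched as a simple data flow. This is a toy illustration only: encode_text, predict_noise, decode_latent, and generate are hypothetical stand-ins written for this example, not the model's real components or any Qualcomm AI Hub API, and the update rule is a placeholder rather than a real diffusion scheduler.

```python
import random

def encode_text(prompt, dim=8):
    """Stand-in text encoder: maps a prompt to a fixed-length conditioning vector."""
    rng = random.Random(prompt)  # string seed, so the same prompt gives the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def predict_noise(latent, cond, t):
    """Stand-in for the U-Net: returns a 'noise' estimate for the current latent."""
    target = sum(cond) / len(cond)
    return [(x - target) * t for x in latent]

def decode_latent(latent):
    """Stand-in for the VAE decoder: maps latent values to pixel values in [0, 255]."""
    return [max(0, min(255, round((x + 1.0) * 127.5))) for x in latent]

def generate(prompt, steps=20, latent_size=16, seed=0):
    cond = encode_text(prompt)                     # 1. encode the prompt once
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(latent_size)]  # 2. start from noise
    for i in range(steps):                         # 3. denoise step by step
        t = 1.0 - i / steps                        # placeholder schedule, 1.0 down to 1/steps
        noise = predict_noise(latent, cond, t)
        latent = [x - 0.5 * n for x, n in zip(latent, noise)]
    return decode_latent(latent)                   # 4. decode the final latent to pixels

image = generate("a lighthouse at sunset")
print(len(image), "pixel values")
```

The point of the sketch is the ordering: the prompt is encoded once, the denoiser runs many times on a small latent, and the decoder runs once at the end to produce the image.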

TorchScript to Qualcomm® AI Engine Direct

Inference time: 6.64 ms
Memory usage: 0 to 3 MB
Layers: 787 (NPU)

Technical Details

Input: Text prompt to generate image
Number of parameters (text encoder): 340M
Number of parameters (UNet): 865M
Number of parameters (VAE decoder): 83M
Model size: 1 GB
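
As a rough sanity check on these figures, the sketch below sums the listed parameter counts and estimates a weights-only size at a few bit widths. The bit widths are assumptions chosen for illustration; the page does not state the precision actually used, and the helper names are hypothetical.

```python
# Parameter counts in millions, taken from the list above.
PARAMS_MILLIONS = {"text_encoder": 340, "unet": 865, "vae_decoder": 83}

def total_params_millions(parts=PARAMS_MILLIONS):
    """Total parameter count across all three components, in millions."""
    return sum(parts.values())

def approx_size_gb(params_millions, bits_per_weight):
    """Weights-only size in GB (10^9 bytes), ignoring metadata and activations."""
    return params_millions * 1e6 * bits_per_weight / 8 / 1e9

total = total_params_millions()
print(f"total parameters: {total}M")
for bits in (32, 16, 8):  # assumed bit widths, for comparison only
    print(f"{bits}-bit weights: ~{approx_size_gb(total, bits):.2f} GB")
```

Of the three assumed widths, 8 bits per weight (about 1.29 GB) comes closest to the listed 1 GB model size, which is consistent with the "quantized" tag.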

Applicable Scenarios

  • Image Generation
  • Image Editing
  • Content Creation

Tags

  • generative-ai
  • quantized

Supported IoT Devices

  • QCS8275 (Proxy)
  • QCS8550 (Proxy)
  • QCS9075 (Proxy)

Supported IoT Chipsets

  • Qualcomm® QCS8275 (Proxy)
  • Qualcomm® QCS8550 (Proxy)
  • Qualcomm® QCS9075 (Proxy)
