Stable-Diffusion-v2.1
State-of-the-art generative AI model that produces detailed images conditioned on text descriptions.
Generates high-resolution images from text prompts using latent diffusion. The pipeline pairs an OpenCLIP ViT-H text encoder, a U-Net-based latent denoiser, and a VAE-based decoder that produces the final image.
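The three-stage flow described above (encode prompt, iteratively denoise a latent, decode to pixels) can be sketched with toy stand-ins. This is an illustration of the data flow only; `encode_text`, `predict_noise`, and `decode` are hypothetical placeholders, not the real networks, though the tensor shapes mirror SD 2.1's 77-token prompt embedding, 4×64×64 latent, and 512×512 output.

```python
import numpy as np

# Toy sketch of the latent diffusion pipeline described above.
# All three components are random/identity stand-ins, NOT the real
# networks; only the data flow and tensor shapes are illustrative.

def encode_text(prompt: str) -> np.ndarray:
    """Stand-in for the text encoder: 77 tokens x 1024-dim embeddings."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal((77, 1024))

def predict_noise(latent, t, text_emb):
    """Stand-in for the U-Net noise predictor (same shape as the latent)."""
    return 0.1 * latent

def decode(latent):
    """Stand-in for the VAE decoder: 4x64x64 latent -> 3x512x512 image."""
    return np.zeros((3, 512, 512))

def generate(prompt, steps=20):
    text_emb = encode_text(prompt)              # 1. encode the prompt once
    latent = np.random.default_rng(0).standard_normal((4, 64, 64))
    for t in range(steps, 0, -1):               # 2. iterative denoising loop
        latent = latent - predict_noise(latent, t, text_emb)
    return decode(latent)                       # 3. decode latent to pixels

image = generate("an astronaut riding a horse")
print(image.shape)  # (3, 512, 512)
```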
Performance

| Runtime | Inference Time | Memory Usage | NPU Layers |
|---|---|---|---|
| TorchScript to Qualcomm® AI Engine Direct | 6.64 ms | 0–3 MB | 787 |
| TorchScript to Qualcomm® AI Engine Direct | 97.5 ms | 0–3 MB | 5,891 |
| TorchScript to Qualcomm® AI Engine Direct | 271 ms | 0–3 MB | 189 |
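Assuming the three profiles above correspond to the text encoder (6.64 ms), U-Net (97.5 ms), and VAE decoder (271 ms) — an inference from the relative layer and parameter counts, not stated on this page — end-to-end latency for a hypothetical 20-step run can be estimated:

```python
# Back-of-envelope end-to-end latency. Assumptions: the 6.64 ms / 97.5 ms /
# 271 ms profiles are the text encoder, U-Net, and VAE decoder respectively,
# and the U-Net runs once per denoising step.
text_encoder_ms = 6.64   # runs once per prompt
unet_ms = 97.5           # runs once per denoising step
vae_decoder_ms = 271.0   # runs once to decode the final latent
steps = 20               # hypothetical step count

total_ms = text_encoder_ms + steps * unet_ms + vae_decoder_ms
print(round(total_ms, 2))  # 2227.64 -> roughly 2.2 s for 20 steps
```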
Technical Details
Input: Text prompt to generate image
Text Encoder Number of parameters: 340M
UNet Number of parameters: 865M
VAE Decoder Number of parameters: 83M
Model size: 1GB
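As a sanity check, the component parameter counts above sum to about 1.29 billion; at one byte per weight (consistent with the "quantized" tag, assuming INT8) that lands near the stated 1GB model size:

```python
# Parameter counts from the Technical Details section above (in millions).
params_m = {"text_encoder": 340, "unet": 865, "vae_decoder": 83}
total_params_m = sum(params_m.values())        # 1288 million parameters

bytes_per_param = 1                            # assumption: INT8 quantization
size_gb = total_params_m * 1e6 * bytes_per_param / 1e9
print(total_params_m, round(size_gb, 2))       # ~1.29 GB, close to the 1GB listed
```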
Applicable Scenarios
- Image Generation
- Image Editing
- Content Creation
Licenses
Source Model: CreativeML OpenRAIL-M
Deployable Model: CreativeML OpenRAIL-M
Terms of Use: Qualcomm® Generative AI usage and limitations
Tags
- generative-ai
- quantized
Supported IoT Devices
- QCS8275 (Proxy)
- QCS8550 (Proxy)
- QCS9075 (Proxy)
Supported IoT Chipsets
- Qualcomm® QCS8275 (Proxy)
- Qualcomm® QCS8550 (Proxy)
- Qualcomm® QCS9075 (Proxy)