Stable-Diffusion-v2.1

State‑of‑the‑art generative AI model used to generate detailed images conditioned on text descriptions.

Generates high-resolution images from text prompts using a latent diffusion model. The model uses an OpenCLIP ViT‑H/14 text encoder, a U‑Net-based latent denoiser, and a VAE-based decoder to generate the final image.
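The three stages described above can be traced at the level of tensor shapes. The following is a minimal sketch, not the deployed implementation: the function name `trace_shapes` and all default values (77 tokens, 1024-wide text embeddings, 4 latent channels, an 8x VAE downsampling factor, 512x512 output) are assumptions based on figures commonly cited for Stable Diffusion 2.x, not values stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PipelineShapes:
    """Shapes of the main tensors passed between the three stages."""
    text_embedding: Tuple[int, int]   # (tokens, embedding_dim) from the text encoder
    latent: Tuple[int, int, int]      # (channels, h, w) denoised by the U-Net
    image: Tuple[int, int, int]       # (channels, h, w) produced by the VAE decoder


def trace_shapes(
    height: int = 512,          # assumed output height, not stated on the page
    width: int = 512,           # assumed output width, not stated on the page
    max_tokens: int = 77,       # assumed prompt length after tokenization
    embed_dim: int = 1024,      # assumed text-embedding width
    latent_channels: int = 4,   # assumed latent channel count
    vae_scale: int = 8,         # assumed VAE spatial downsampling factor
) -> PipelineShapes:
    """Return the tensor shapes for one text-to-image pass (hypothetical helper)."""
    if height % vae_scale or width % vae_scale:
        raise ValueError("height and width must be multiples of vae_scale")
    return PipelineShapes(
        text_embedding=(max_tokens, embed_dim),
        latent=(latent_channels, height // vae_scale, width // vae_scale),
        image=(3, height, width),
    )


if __name__ == "__main__":
    shapes = trace_shapes()
    print("text encoder output:", shapes.text_embedding)
    print("U-Net latent:       ", shapes.latent)
    print("VAE decoder output: ", shapes.image)
```

Under these assumptions the U-Net works on a latent with 64 times fewer spatial positions than the final image, which is why the denoising loop runs in latent space and the VAE decoder is applied only once at the end.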


Technical Details

Input: Text prompt to generate image

Text Encoder Number of parameters: 340M

UNet Number of parameters: 865M

VAE Decoder Number of parameters: 83M

Model size: 1 GB
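The per-component parameter counts above can be summed and converted into a rough storage estimate. This is a back-of-the-envelope sketch only: the page does not state the numeric precision of the deployed model, so the bytes-per-parameter values below are assumptions, and the helper names are hypothetical.

```python
# Parameter counts in millions, taken from the Technical Details listed above.
PARAMS_MILLIONS = {
    "text_encoder": 340,
    "unet": 865,
    "vae_decoder": 83,
}


def total_params_millions(params=PARAMS_MILLIONS) -> int:
    """Sum the per-component parameter counts (in millions)."""
    return sum(params.values())


def estimated_size_gb(params_millions: float, bytes_per_param: float) -> float:
    """Estimate weight storage in decimal gigabytes, ignoring file overhead."""
    return params_millions * 1e6 * bytes_per_param / 1e9


if __name__ == "__main__":
    total = total_params_millions()
    print(f"total parameters: {total}M")
    # Assumed precisions; the page does not say which one is deployed.
    for label, nbytes in [("8-bit", 1), ("16-bit", 2), ("32-bit", 4)]:
        print(f"{label}: about {estimated_size_gb(total, nbytes):.2f} GB")
```

The three components total 1288M parameters. The listed model size of about 1 GB is closest to the one-byte-per-parameter estimate, which would suggest a quantized deployment, though that is an inference rather than something the page states.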

Applicable Scenarios

Image Generation
Image Editing
Content Creation

Licenses

Source Model: CREATIVEML-OPENRAIL-M

Deployable Model: CREATIVEML-OPENRAIL-M

Supported IoT Devices

QCS8275 (Proxy)
QCS8550 (Proxy)
QCS9075 (Proxy)

Supported IoT Chipsets

Qualcomm® QCS8275 (Proxy)
Qualcomm® QCS8550 (Proxy)
Qualcomm® QCS9075 (Proxy)
