Gemma-4-E4B-it
Multimodal model from Google DeepMind handling text and image input.
Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open‑weights models in both pre‑trained and instruction‑tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.
Technical Details
Model architecture: Mixture-of-Experts (MoE) Transformer with Per-Layer Expert Selection and selective routing.
Supported languages: Multilingual (trained on 140+ languages)
TTFT: Time To First Token is the time it takes to generate the first response token. It is expressed as a range because it varies with the length of the prompt.
Response Rate: Rate of response generation after the first response token.
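The two metrics above can be illustrated with a minimal sketch. It assumes you have recorded the time the request was sent and the arrival time of each generated token; the function name and its inputs are hypothetical and not part of any Qualcomm or Gemma API.

```python
from typing import Sequence, Tuple


def ttft_and_response_rate(
    request_time: float, token_times: Sequence[float]
) -> Tuple[float, float]:
    """Return (TTFT in seconds, response rate in tokens per second).

    request_time: timestamp when the prompt was submitted (hypothetical input).
    token_times: arrival timestamps of each generated token, in order.
    """
    if not token_times:
        raise ValueError("at least one generated token is required")

    # TTFT: delay between submitting the prompt and receiving the first token.
    ttft = token_times[0] - request_time

    # Response rate: tokens produced after the first one, divided by the
    # time elapsed since the first token arrived.
    tokens_after_first = len(token_times) - 1
    elapsed = token_times[-1] - token_times[0]
    rate = tokens_after_first / elapsed if elapsed > 0 else 0.0

    return ttft, rate


if __name__ == "__main__":
    ttft, rate = ttft_and_response_rate(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
    print(f"TTFT: {ttft:.2f} s, response rate: {rate:.1f} tokens/s")
```

Because TTFT includes prompt processing, measuring it with short and long prompts gives the two ends of the reported range, while the response rate is largely independent of prompt length under this definition.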
Applicable Scenarios
- Dialogue
- Content Generation
License
Model: Apache-2.0
Terms of Use: Qualcomm® Generative AI usage and limitations
Tags
- llm
- generative-ai
Supported Compute Devices
- Snapdragon X Elite CRD
- Snapdragon X2 Elite CRD
Supported Compute Chipsets
- Snapdragon® X Elite
- Snapdragon® X2 Elite