Zipformer

Transformer-based automatic speech recognition (ASR) model for English and Chinese.

The Zipformer streaming automatic speech recognition (ASR) model is a state-of-the-art system that transcribes spoken language into written text incrementally, as the audio arrives. It is based on the transformer architecture and has been optimized for edge inference by replacing linear layers with convolutional (conv) layers. It performs robustly in realistic, noisy environments, which makes it reliable for real-world applications, and it is particularly suited to long-form transcription of extended audio recordings. For the reported metrics, time to first token is the encoder's latency and the time for each additional token is the joiner's latency, assuming the max decoded sequence length given under Technical Details.
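The latency accounting described above can be sketched as a short calculation. This is only an illustration: the function name `estimate_decode_latency_ms` and the millisecond values in the usage line are hypothetical placeholders, not measurements. Only the 200-token cap is taken from this page.

```python
MAX_DECODED_TOKENS = 200  # max decoded sequence length listed on this page


def estimate_decode_latency_ms(encoder_ms: float, joiner_ms: float, num_tokens: int) -> float:
    """Estimate the time to emit `num_tokens` tokens (hypothetical helper).

    Time to first token is taken to be the encoder latency, and each
    additional token is taken to cost one joiner latency.
    """
    if not 1 <= num_tokens <= MAX_DECODED_TOKENS:
        raise ValueError(f"num_tokens must be between 1 and {MAX_DECODED_TOKENS}")
    return encoder_ms + (num_tokens - 1) * joiner_ms


# Placeholder latencies, NOT measured values for any chipset.
print(estimate_decode_latency_ms(encoder_ms=20.0, joiner_ms=0.5, num_tokens=200))  # 119.5
```

Under this accounting the worst case grows linearly with the number of decoded tokens, which is why the page fixes a maximum decoded length when reporting metrics.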

Not supported

This model is currently not supported on any Compute chipset.

Technical Details

Model checkpoint: pfluo/k2fsa-zipformer-chinese-english-mixed

Input resolution: 80x71 (0.71 seconds of audio)

Max decoded sequence length: 200 tokens

Number of parameters (encoder): 63.2M

Model size (encoder) (float): 242 MB

Number of parameters (decoder): 3.47M

Model size (decoder) (float): 13.2 MB

Number of parameters (joiner): 3.21M

Model size (joiner) (float): 12.2 MB
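The figures above can be cross-checked with some simple arithmetic. The sketch below rests on two assumptions that this page does not state: that "80x71" means 80 features per frame across 71 frames, and that the "float" sizes are 32-bit weights reported in units of 2^20 bytes. The helper names are hypothetical.

```python
def frame_shift_ms(num_frames: int, chunk_seconds: float) -> float:
    """Frame spacing implied by a chunk of `num_frames` frames (assumed axis reading)."""
    return round(chunk_seconds * 1000 / num_frames, 3)


def float32_size_mib(num_params_millions: float) -> float:
    """Approximate weight size, assuming 4 bytes per parameter and 2**20 bytes per unit."""
    return num_params_millions * 1e6 * 4 / 2**20


# 71 frames covering 0.71 seconds implies one frame every 10 ms.
print(frame_shift_ms(71, 0.71))

# Parameter counts (in millions) and listed sizes, both copied from this page.
listed = {"encoder": (63.2, 242.0), "decoder": (3.47, 13.2), "joiner": (3.21, 12.2)}
for name, (params_m, listed_mb) in listed.items():
    print(f"{name}: estimated {float32_size_mib(params_m):.1f}, listed {listed_mb}")
```

With these assumptions the decoder and joiner estimates round to the listed values, and the encoder estimate comes out about 1 MB below the listed 242 MB. The page does not explain that difference.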

Applicable Scenarios

Smart Home
Accessibility

License

Model: Apache-2.0

Supported Compute Devices

Snapdragon X Elite CRD
Snapdragon X Plus 8-Core CRD
Snapdragon X2 Elite CRD

Supported Compute Chipsets

Snapdragon® X Elite
Snapdragon® X Plus 8-Core
Snapdragon® X2 Elite

Related Models

Whisper-Small

Transformer-based automatic speech recognition (ASR) model for multilingual transcription and translation available on HuggingFace.
