HuggingFace-WavLM-Base-Plus

Real-time speech processing.

HuggingFaceWavLMBasePlus is a real-time speech processing backbone based on Microsoft's WavLM model.

TorchScript to TFLite
Inference time: 852 ms
Memory usage: 141-175 MB
Layers: 811 (CPU)
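
To put the "real-time" claim in numbers, the inference time can be compared with the length of audio consumed per call. The sketch below is only a rough illustration: it assumes the input is sampled at 16 kHz (the rate WavLM is normally used with, but not stated on this page) and that the 852 ms figure covers one full 1x320000 input. The name `real_time_factor` is hypothetical.

```python
# Assumption: 16 kHz sample rate; this page does not state it.
SAMPLE_RATE_HZ = 16_000
INPUT_SAMPLES = 320_000   # from "Input resolution: 1x320000"
LATENCY_S = 0.852         # from "Inference time: 852 ms"


def real_time_factor(latency_s: float, num_samples: int, sample_rate_hz: int) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    audio_s = num_samples / sample_rate_hz
    return latency_s / audio_s


audio_window_s = INPUT_SAMPLES / SAMPLE_RATE_HZ
rtf = real_time_factor(LATENCY_S, INPUT_SAMPLES, SAMPLE_RATE_HZ)
print(f"audio window: {audio_window_s:.1f} s, real-time factor: {rtf:.4f}")
```

Under these assumptions one call covers 20 seconds of audio, so the model processes audio well above real-time speed on the measured device.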

Technical Details

Model checkpoint: wavlm-libri-clean-100h-base-plus
Input resolution: 1x320000
Number of parameters: 95.1M
Model size: 363 MB
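
Because the input resolution is fixed at 1x320000, audio of any other length has to be trimmed or padded before inference. The following is a minimal sketch of one way to do that with NumPy; the helper names `fit_to_model_input` and `float32_size_mib` are hypothetical, and zero-padding at the end is an assumed choice rather than a documented requirement. The second helper checks that the listed model size is consistent with 95.1M parameters stored as 32-bit floats, assuming "MB" here means 2^20 bytes.

```python
import numpy as np

INPUT_SAMPLES = 320_000  # from "Input resolution: 1x320000"


def fit_to_model_input(waveform, num_samples: int = INPUT_SAMPLES) -> np.ndarray:
    """Trim or zero-pad a mono waveform to shape (1, num_samples), float32."""
    mono = np.asarray(waveform, dtype=np.float32).reshape(-1)
    if mono.shape[0] >= num_samples:
        fitted = mono[:num_samples]                               # keep the first samples
    else:
        fitted = np.pad(mono, (0, num_samples - mono.shape[0]))   # zero-pad the tail
    return fitted[np.newaxis, :]                                  # add the batch dimension


def float32_size_mib(num_params: float) -> float:
    """Approximate weight size in MiB, assuming 4 bytes per parameter."""
    return num_params * 4 / 2**20


short_clip = np.zeros(48_000, dtype=np.float32)   # placeholder audio, not real speech
model_input = fit_to_model_input(short_clip)
print(model_input.shape, model_input.dtype)
print(round(float32_size_mib(95.1e6)))
```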

Applicable Scenarios

  • Smart Home
  • Accessibility

Supported Form Factors

  • Phone
  • Tablet
  • IoT

Licenses

Source Model: MIT
Deployable Model: AI Model Hub License

Tags

  • backbone
    A “backbone” model is designed to extract task-agnostic representations from specific data modalities (e.g., images, text, speech). This representation can then be fine-tuned for specialized tasks.
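
As a rough illustration of how a backbone's output might be used downstream, the sketch below mean-pools a sequence of frame-level features and passes the result through a small linear head. Everything in it is an assumption for illustration: the feature array is random placeholder data rather than real model output, the shapes (999 frames, 768 features, 4 classes) are illustrative, and `mean_pool` and `linear_head` are hypothetical helpers. In practice the head's weights would be learned during fine-tuning.

```python
import numpy as np


def mean_pool(features: np.ndarray) -> np.ndarray:
    """Average frame-level features over time: (frames, hidden) -> (hidden,)."""
    return features.mean(axis=0)


def linear_head(pooled: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Task-specific classifier on top of the pooled representation."""
    return pooled @ weights + bias


rng = np.random.default_rng(0)

# Placeholder for backbone output; real features would come from the model.
features = rng.standard_normal((999, 768)).astype(np.float32)

# Hypothetical 4-class head with untrained weights.
weights = (rng.standard_normal((768, 4)) * 0.01).astype(np.float32)
bias = np.zeros(4, dtype=np.float32)

logits = linear_head(mean_pool(features), weights, bias)
print(logits.shape)
```

Keeping the backbone fixed and training only a small head like this is one common option; updating the backbone weights as well is another.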

Supported Devices

  • Google Pixel 3
  • Google Pixel 3a
  • Google Pixel 3a XL
  • Google Pixel 4
  • Google Pixel 4a
  • Google Pixel 5a 5G
  • QCS8550 (Proxy)
  • Samsung Galaxy S21
  • Samsung Galaxy S21 Ultra
  • Samsung Galaxy S21+
  • Samsung Galaxy S22 5G
  • Samsung Galaxy S22 Ultra 5G
  • Samsung Galaxy S22+ 5G
  • Samsung Galaxy S23
  • Samsung Galaxy S23 Ultra
  • Samsung Galaxy S23+
  • Samsung Galaxy S24
  • Samsung Galaxy S24 Ultra
  • Samsung Galaxy S24+
  • Samsung Galaxy Tab S8
  • Xiaomi 12
  • Xiaomi 12 Pro

Supported Chipsets

  • Qualcomm® QCS8550
  • Snapdragon® 8 Gen 1 Mobile
  • Snapdragon® 8 Gen 2 Mobile
  • Snapdragon® 8 Gen 3 Mobile
  • Snapdragon® 888 Mobile
  • Snapdragon® X Elite