
    HuggingFace-WavLM-Base-Plus

    Real-time speech processing.

    HuggingFace-WavLM-Base-Plus is a real-time speech processing backbone based on Microsoft's WavLM model.

    TorchScript / TFLite
    Inference Time: 789 ms
    Memory Usage: 142-166 MB
    Layers: 811 (CPU)
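
To put the 789 ms figure in context, a real-time factor (processing time divided by audio duration) can be estimated. This is only a rough sketch under two assumptions that this page does not state: that the audio is sampled at 16 kHz (the rate WavLM was pre-trained on), and that the reported inference time covers one full input of the length listed under Technical Details (1x320000). The function name is hypothetical.

```python
# Assumptions (not stated on this page): 16 kHz sample rate, and the
# reported latency covers one full 1x320000 input.
SAMPLE_RATE_HZ = 16_000
INPUT_SAMPLES = 320_000
INFERENCE_TIME_S = 0.789  # 789 ms, from the figures above


def real_time_factor(inference_s, num_samples, sample_rate_hz):
    """Return processing time divided by audio duration (below 1 is faster than real time)."""
    audio_s = num_samples / sample_rate_hz
    return inference_s / audio_s


audio_seconds = INPUT_SAMPLES / SAMPLE_RATE_HZ
rtf = real_time_factor(INFERENCE_TIME_S, INPUT_SAMPLES, SAMPLE_RATE_HZ)
print(f"audio duration: {audio_seconds:.1f} s")
print(f"real-time factor: {rtf:.3f}")
```

If those assumptions hold, a factor well below 1 is consistent with the "real-time" description, although actual latency will vary by device.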

    Technical Details

    Model checkpoint: wavlm-libri-clean-100h-base-plus
    Input resolution: 1x320000
    Number of parameters: 95.1M
    Model size: 363 MB
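
The parameter count and model size above can be cross-checked with simple arithmetic. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each), which this page does not state. Under that assumption the listed size matches the parameter count when "MB" is read as mebibytes (2^20 bytes).

```python
# Assumption (not stated on this page): weights are stored as fp32.
NUM_PARAMS = 95.1e6   # "Number of parameters: 95.1M"
BYTES_PER_FP32 = 4

size_bytes = NUM_PARAMS * BYTES_PER_FP32
size_mb = size_bytes / 1e6       # decimal megabytes
size_mib = size_bytes / 2**20    # binary mebibytes

print(f"{size_mb:.1f} MB (decimal)")
print(f"{size_mib:.1f} MiB (binary)")
```

The binary figure rounds to the 363 listed above; container overhead and any non-fp32 tensors are ignored here.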

    Applicable Scenarios

    • Smart Home
    • Accessibility

    Supported Form Factors

    • Phone
    • Tablet
    • IoT

    Licenses

    Source Model: MIT
    Deployable Model: AI Model Hub License

    Tags

    • backbone
      A “backbone” model is designed to extract task-agnostic representations from specific data modalities (e.g., images, text, speech). This representation can then be fine-tuned for specialized tasks.
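
As an illustration of what such a representation looks like for this model's input, the sketch below estimates the shape of the feature sequence a WavLM Base-style backbone would produce for 320000 samples. The convolution kernel sizes, strides, and hidden size are the defaults of the Hugging Face `WavLMConfig`. That the deployed model uses exactly these values is an assumption, and the helper name is hypothetical.

```python
# Assumed feature-encoder settings (Hugging Face WavLMConfig defaults);
# the deployed export may differ.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
HIDDEN_SIZE = 768


def num_output_frames(num_samples):
    """Apply the unpadded 1-D convolution length formula layer by layer."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        length = (length - kernel) // stride + 1
    return length


frames = num_output_frames(320_000)
feature_shape = (1, frames, HIDDEN_SIZE)  # (batch, time, features)
print(feature_shape)
```

A downstream task head (for example a classifier for keyword spotting) would typically consume this (batch, time, features) tensor, either frame by frame or after pooling over time.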

    Supported Devices

    • Google Pixel 3
    • Google Pixel 3a
    • Google Pixel 3a XL
    • Google Pixel 4
    • Google Pixel 4a
    • Google Pixel 5a 5G
    • QCS8550 (Proxy)
    • Samsung Galaxy S21
    • Samsung Galaxy S21 Ultra
    • Samsung Galaxy S21+
    • Samsung Galaxy S22 5G
    • Samsung Galaxy S22 Ultra 5G
    • Samsung Galaxy S22+ 5G
    • Samsung Galaxy S23
    • Samsung Galaxy S23 Ultra
    • Samsung Galaxy S23+
    • Samsung Galaxy S24
    • Samsung Galaxy S24 Ultra
    • Samsung Galaxy S24+
    • Samsung Galaxy Tab S8
    • Xiaomi 12
    • Xiaomi 12 Pro

    Supported Chipsets

    • Qualcomm® QCS8550
    • Snapdragon® 8 Gen 1 Mobile
    • Snapdragon® 8 Gen 2 Mobile
    • Snapdragon® 8 Gen 3 Mobile
    • Snapdragon® 888 Mobile