Qualcomm® AI Hub

Video-MAE-Quantized

Sports and human action recognition in videos.

Video MAE (Masked Autoencoder) is a video classification network built on a ViT (Vision Transformer) backbone.

Technical Details

Model checkpoint: Kinetics-400
Input resolution: 224x224
Number of parameters: 87.7M
Model size: 87.7 MB
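
The parameter count and model size listed above line up if each weight occupies one byte, which is what 8-bit quantization would give. The page does not state the bit width, so the sketch below treats it as an assumption; `weight_size_mb` is a hypothetical helper used only for this arithmetic, and the 32-bit figure is an illustrative baseline rather than a published number.

```python
# Rough weight-storage estimate. The bit widths are assumptions,
# not values published on this page.
PARAMS = 87.7e6  # "Number of parameters" from the details above


def weight_size_mb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in MB (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


int8_mb = weight_size_mb(PARAMS, 8)   # assumed 8-bit quantized weights
fp32_mb = weight_size_mb(PARAMS, 32)  # assumed 32-bit float baseline

print(f"int8: {int8_mb:.1f} MB, fp32: {fp32_mb:.1f} MB")
```

Under these assumptions the 8-bit estimate matches the listed 87.7 MB, while an unquantized 32-bit copy of the same weights would be roughly four times larger. Real model files also carry metadata and possibly some higher-precision tensors, so this is only an approximation.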

Applicable Scenarios

  • Camera
  • Action Recognition

Supported Mobile Form Factors

  • Phone
  • Tablet

Licenses

Source Model: CC-BY-4.0
Deployable Model: AI Model Hub License

Tags

  • backbone
  • quantized

Supported Mobile Devices

  • Samsung Galaxy S21
  • Samsung Galaxy S21 Ultra
  • Samsung Galaxy S21+
  • Samsung Galaxy S22 5G
  • Samsung Galaxy S22 Ultra 5G
  • Samsung Galaxy S22+ 5G
  • Samsung Galaxy S23
  • Samsung Galaxy S23 Ultra
  • Samsung Galaxy S23+
  • Samsung Galaxy S24
  • Samsung Galaxy S24 Ultra
  • Samsung Galaxy S24+
  • Samsung Galaxy Tab S8
  • Snapdragon 8 Elite QRD
  • Xiaomi 12
  • Xiaomi 12 Pro

Supported Mobile Chipsets

  • Snapdragon® 8 Elite Mobile
  • Snapdragon® 8 Gen 1 Mobile
  • Snapdragon® 8 Gen 2 Mobile
  • Snapdragon® 8 Gen 3 Mobile
  • Snapdragon® 888 Mobile
