Qualcomm® AI Hub

Video-MAE

Sports and human action recognition in videos.

Video-MAE (Masked Autoencoder) is a video classification network built on a ViT (Vision Transformer) backbone.
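As a rough illustration of how a ViT-style backbone sees a video clip, the sketch below counts the tokens produced when a clip is cut into space-time patches. Only the 224x224 resolution comes from this page. The 16-frame clip length and the 2x16x16 patch ("tubelet") size are assumptions taken from the commonly used Video-MAE base configuration, and the function name is hypothetical.

```python
def num_tokens(frames: int, height: int, width: int,
               tubelet_t: int, patch_h: int, patch_w: int) -> int:
    """Count the space-time patches (tokens) a clip is split into.

    Each token covers tubelet_t frames and a patch_h x patch_w pixel area.
    """
    if frames % tubelet_t or height % patch_h or width % patch_w:
        raise ValueError("clip dimensions must be divisible by the patch size")
    return (frames // tubelet_t) * (height // patch_h) * (width // patch_w)


# 224x224 is the input resolution listed on this page.
# Assumed (not stated here): 16 frames per clip, 2x16x16 tubelets.
tokens = num_tokens(frames=16, height=224, width=224,
                    tubelet_t=2, patch_h=16, patch_w=16)
print(f"Tokens per clip: {tokens}")
```

Under these assumptions the clip becomes 8 x 14 x 14 = 1568 tokens, which is the sequence length the transformer would attend over.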

Technical Details

Model checkpoint: Kinetics-400
Input resolution: 224x224
Number of parameters: 87.7M
Model size: 335 MB
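The parameter count and model size listed above can be cross-checked with a little arithmetic. The sketch below is only a sanity check under two assumptions not stated on this page: that weights are stored as 32-bit floats (4 bytes each), and that "MB" is read as MiB (1024 x 1024 bytes).

```python
NUM_PARAMS = 87.7e6      # parameter count listed on this page
BYTES_PER_PARAM = 4      # assumption: float32 weights


def model_size_mib(num_params: float, bytes_per_param: int) -> float:
    """Return the approximate weight storage in MiB."""
    return num_params * bytes_per_param / (1024 * 1024)


size = model_size_mib(NUM_PARAMS, BYTES_PER_PARAM)
print(f"Estimated size: {size:.1f} MiB")
```

The estimate lands close to the listed 335 MB, which suggests the figure refers to a float32 checkpoint. A quantized deployment would be proportionally smaller.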

Applicable Scenarios

  • Camera
  • Action Recognition

Licenses

Source Model: CC-BY-4.0
Deployable Model: AI Model Hub License

Tags

  • backbone

Supported Automotive Devices

  • SA7255P ADP
  • SA8255 (Proxy)
  • SA8295P ADP
  • SA8650 (Proxy)
  • SA8775P ADP

Supported Automotive Chipsets

  • Qualcomm® SA7255P
  • Qualcomm® SA8255P (Proxy)
  • Qualcomm® SA8295P
  • Qualcomm® SA8650P (Proxy)
  • Qualcomm® SA8775P
