Video-MAE
Sports and human action recognition in videos.
Video-MAE (Masked Autoencoder) is a video classification network built on the ViT (Vision Transformer) backbone.
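As a rough illustration of what a ViT backbone means for video input, the sketch below counts the tokens the transformer would see for one clip. The 224x224 frame size comes from the input resolution listed on this page; the clip length (16 frames), tubelet length (2 frames), and patch size (16 pixels) are assumptions based on the commonly published ViT-Base VideoMAE configuration and are not stated here. The function name `num_tokens` is hypothetical.

```python
def num_tokens(frames: int, height: int, width: int,
               tubelet: int, patch: int) -> int:
    """Count the space-time patch tokens produced from one video clip.

    Each token covers `tubelet` consecutive frames and a
    `patch` x `patch` pixel region.
    """
    if frames % tubelet or height % patch or width % patch:
        raise ValueError("clip dimensions must be divisible by tubelet/patch size")
    return (frames // tubelet) * (height // patch) * (width // patch)


# Assumed values: 16 frames, tubelet length 2, patch size 16.
# Only the 224x224 resolution is taken from this page.
tokens = num_tokens(frames=16, height=224, width=224, tubelet=2, patch=16)
print(tokens)
```

Under these assumptions the clip is split into 8 temporal slices of 14 x 14 patches each. A deployed model with a different clip length or patch size would give a different count.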
Technical Details
Model checkpoint: Kinetics-400
Input resolution: 224x224
Number of parameters: 87.7M
Model size: 335 MB
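The parameter count and model size above can be cross-checked with simple arithmetic. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each), which this page does not state; the helper name `weights_size_bytes` is hypothetical.

```python
def weights_size_bytes(num_params: float, bytes_per_param: int = 4) -> float:
    """Estimate the storage needed for the weights alone.

    bytes_per_param=4 assumes float32 weights (an assumption,
    not something this page confirms).
    """
    return num_params * bytes_per_param


NUM_PARAMS = 87.7e6  # "Number of parameters: 87.7M" from this page

size_bytes = weights_size_bytes(NUM_PARAMS)
size_mib = size_bytes / 2**20  # binary megabytes (1 MiB = 1,048,576 bytes)
size_mb = size_bytes / 1e6     # decimal megabytes

print(f"{size_mib:.1f} MiB, {size_mb:.1f} MB")
```

Under the float32 assumption, the estimate in binary megabytes rounds to the listed 335 MB, which suggests the listed size is for an unquantized model. A quantized variant would be proportionally smaller.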
Applicable Scenarios
- Camera
- Action Recognition
Licenses
Source Model: CC-BY-4.0
Deployable Model: AI Model Hub License
Tags
- backbone
Supported Automotive Devices
- SA7255P ADP
- SA8255 (Proxy)
- SA8295P ADP
- SA8650 (Proxy)
- SA8775P ADP
Supported Automotive Chipsets
- Qualcomm® SA7255P
- Qualcomm® SA8255P (Proxy)
- Qualcomm® SA8295P
- Qualcomm® SA8650P (Proxy)
- Qualcomm® SA8775P