Video-MAE
Sports and human action recognition in videos.
Video-MAE (Masked Autoencoder) is a video classification network built on the ViT (Vision Transformer) backbone.
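To make the ViT backbone more concrete, here is a minimal sketch of how a video clip could be turned into transformer tokens by cutting it into non-overlapping space-time blocks ("tubelets"). Only the 224x224 resolution comes from this card; the clip length, tubelet depth, and patch size are assumptions chosen for illustration, and `num_tokens` is a hypothetical helper, not part of any library.

```python
# Token count for a ViT-style video backbone that splits a clip into tubelets.
# All values except the 224x224 resolution are assumptions for illustration.
FRAMES = 16            # assumed clip length in frames
HEIGHT = WIDTH = 224   # input resolution listed in this card
TUBELET_FRAMES = 2     # assumed temporal extent of one tubelet
PATCH = 16             # assumed spatial patch size in pixels


def num_tokens(frames, height, width, tubelet_frames, patch):
    """Return the number of non-overlapping tubelets (transformer input tokens)."""
    assert frames % tubelet_frames == 0, "clip length must divide evenly"
    assert height % patch == 0 and width % patch == 0, "frame must divide evenly"
    return (frames // tubelet_frames) * (height // patch) * (width // patch)


tokens = num_tokens(FRAMES, HEIGHT, WIDTH, TUBELET_FRAMES, PATCH)
print(tokens)  # 8 * 14 * 14 = 1568
```

Under these assumptions, each clip becomes a sequence of 1568 tokens; a different clip length or patch size changes the count proportionally.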
Technical Details
Model checkpoint: Kinetics-400
Input resolution: 224x224
Number of parameters: 87.7M
Model size (float): 335 MB
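As a rough sanity check, the listed float size can be related to the parameter count. The sketch below assumes that "float" means 32-bit weights (4 bytes per parameter) and ignores any file-format overhead; both are assumptions, not statements from this card.

```python
# Rough check of the listed float model size from the parameter count.
PARAMS = 87.7e6          # "Number of parameters" from the list above
BYTES_PER_FLOAT32 = 4    # assumption: "float" means 32-bit weights

size_bytes = PARAMS * BYTES_PER_FLOAT32
size_mb = size_bytes / 1e6       # decimal megabytes (10^6 bytes)
size_mib = size_bytes / 2**20    # binary mebibytes (2^20 bytes)

print(f"{size_mb:.1f} MB (decimal), {size_mib:.1f} MiB (binary)")
```

Under that assumption the weights come to about 350.8 decimal megabytes, or about 334.5 MiB, so the listed 335 MB figure is consistent with 32-bit weights if "MB" is read as binary megabytes.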
Applicable Scenarios
- Camera
- Action Recognition
Supported Form Factors
- Phone
- Tablet
License
Model: CC-BY-4.0
Tags
- backbone
Supported Devices
- Dragonwing IQ-9075 EVK
- Dragonwing Q-6690 MTP
- Dragonwing RB3 Gen 2 Vision Kit
- QCS8275 (Proxy)
- QCS8450 (Proxy)
- QCS8550 (Proxy)
- SA7255P ADP
- SA8295P ADP
- SA8775P ADP
- Samsung Galaxy S21
- Samsung Galaxy S21 Ultra
- Samsung Galaxy S22 5G
- Samsung Galaxy S22 Ultra 5G
- Samsung Galaxy S22+ 5G
- Samsung Galaxy S23
- Samsung Galaxy S23 Ultra
- Samsung Galaxy S23+
- Samsung Galaxy S24
- Samsung Galaxy S24 Ultra
- Samsung Galaxy S24+
- Samsung Galaxy S25
- Samsung Galaxy S25 Ultra
- Samsung Galaxy S25+
- Samsung Galaxy Tab S8
- Snapdragon 7 Gen 4 QRD
- Snapdragon 8 Elite Gen 5 QRD
- Snapdragon X Elite CRD
- Snapdragon X Plus 8-Core CRD
- Snapdragon X2 Elite CRD
- Xiaomi 12
- Xiaomi 12 Pro
- XR2 Gen 2 (Proxy)
Supported Chipsets
- Qualcomm® QCM6690
- Qualcomm® QCS6490
- Qualcomm® QCS8275 (Proxy)
- Qualcomm® QCS8550 (Proxy)
- Qualcomm® QCS9075
- Qualcomm® SA7255P
- Qualcomm® SA8295P
- Qualcomm® SA8775P
- Snapdragon® 7 Gen 4 Mobile
- Snapdragon® 8 Elite Mobile
- Snapdragon® 8 Elite Gen 5 Mobile
- Snapdragon® 8 Gen 1 Mobile
- Snapdragon® 8 Gen 2 Mobile
- Snapdragon® 8 Gen 3 Mobile
- Snapdragon® 888 Mobile
- Snapdragon® X Elite
- Snapdragon® X Plus 8-Core
- Snapdragon® X2 Elite