TrOCR

Transformer-based model for state-of-the-art optical character recognition (OCR) on both printed and handwritten text.

TrOCR is an end-to-end text recognition approach that pairs a pre-trained image Transformer for image understanding with a pre-trained text Transformer for wordpiece-level text generation.

Inference Time: 111 ms
Memory Usage: 6-333 MB
Layers: 592 (NPU)

Technical Details

Model checkpoint: trocr-small-stage1
Input resolution: 320x320
Number of parameters (TrOCREncoder): 23.0M
Model size (TrOCREncoder): 87.8 MB
Number of parameters (TrOCRDecoder): 38.3M
Model size (TrOCRDecoder): 146 MB
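
The parameter counts and model sizes listed above can be cross-checked with a little arithmetic. The sketch below assumes that each weight is stored as a 32-bit float (4 bytes) and that "MB" means 2^20 bytes. Neither assumption is stated on this page, and the helper name `estimated_size_mib` is purely illustrative.

```python
# Assumption (not stated in the source): weights are stored as 32-bit floats
# and "MB" on this page means 2**20 bytes.
BYTES_PER_FLOAT32 = 4
BYTES_PER_MIB = 2 ** 20


def estimated_size_mib(num_parameters: float,
                       bytes_per_parameter: int = BYTES_PER_FLOAT32) -> float:
    """Estimate raw weight storage, in MiB, for a given parameter count."""
    return num_parameters * bytes_per_parameter / BYTES_PER_MIB


# Parameter counts as listed in Technical Details (already rounded).
components = {
    "TrOCREncoder": 23.0e6,
    "TrOCRDecoder": 38.3e6,
}

for name, params in components.items():
    print(f"{name}: ~{estimated_size_mib(params):.1f} MiB")
```

Under these assumptions the estimates come out at roughly 87.7 and 146.1, close to the listed 87.8 MB and 146 MB. The small gap is consistent with the parameter counts being rounded, which suggests (but does not prove) that the listed sizes describe unquantized float32 weights.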

Applicable Scenarios

  • Publishing
  • Healthcare
  • Document Management

Supported Form Factors

  • Phone
  • Tablet

Licenses

Source Model: MIT
Deployable Model: AI Model Hub License

Supported Devices

  • Google Pixel 3
  • Google Pixel 3a
  • Google Pixel 3a XL
  • Google Pixel 4
  • Google Pixel 4a
  • Google Pixel 5a 5G
  • QCS8550 (Proxy)
  • Samsung Galaxy S21
  • Samsung Galaxy S21 Ultra
  • Samsung Galaxy S21+
  • Samsung Galaxy S22 5G
  • Samsung Galaxy S22 Ultra 5G
  • Samsung Galaxy S22+ 5G
  • Samsung Galaxy S23
  • Samsung Galaxy S23 Ultra
  • Samsung Galaxy S23+
  • Samsung Galaxy S24
  • Samsung Galaxy S24 Ultra
  • Samsung Galaxy S24+
  • Samsung Galaxy Tab S8
  • Xiaomi 12
  • Xiaomi 12 Pro

Supported Chipsets

  • Qualcomm® QCS8550
  • Snapdragon® 8 Gen 1 Mobile
  • Snapdragon® 8 Gen 2 Mobile
  • Snapdragon® 8 Gen 3 Mobile
  • Snapdragon® 888 Mobile
  • Snapdragon® X Elite