TrOCR

    Transformer-based model for state-of-the-art optical character recognition (OCR) on both printed and handwritten text.

    An end-to-end text recognition approach that pairs a pre-trained image transformer for image understanding with a pre-trained text transformer for wordpiece-level text generation.
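
The encoder-decoder flow described above can be illustrated with a toy greedy decoding loop. This is a minimal sketch, not the model's actual code: the four-entry vocabulary, the `toy_step` stand-in for the text decoder, the placeholder encoder output, and the `##` continuation convention are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical vocabulary; "##" marks a wordpiece that continues the previous one.
VOCAB = ["<s>", "</s>", "he", "##llo"]
BOS_ID, EOS_ID = 0, 1

# A step function maps (encoder output, tokens so far) to one score per vocabulary entry.
StepFn = Callable[[Sequence[float], List[int]], List[float]]


def toy_step(encoder_states: Sequence[float], tokens: List[int]) -> List[float]:
    """Stand-in for the text decoder: always spells out "he", "##llo", then end-of-sequence."""
    target = [2, 3, EOS_ID]
    next_id = target[len(tokens) - 1]
    return [1.0 if i == next_id else 0.0 for i in range(len(VOCAB))]


def greedy_decode(encoder_states: Sequence[float], step: StepFn, max_len: int = 32) -> List[int]:
    """Pick the highest-scoring wordpiece at each step until end-of-sequence or max_len."""
    tokens = [BOS_ID]
    for _ in range(max_len):
        scores = step(encoder_states, tokens)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start-of-sequence token


def join_wordpieces(ids: List[int]) -> str:
    """Merge wordpieces into text, attaching "##" pieces to the preceding piece."""
    text = ""
    for i in ids:
        piece = VOCAB[i]
        if piece.startswith("##"):
            text += piece[2:]
        else:
            text += (" " if text else "") + piece
    return text


encoder_states = [0.0]  # placeholder for the image encoder's output
token_ids = greedy_decode(encoder_states, toy_step)
recognized = join_wordpieces(token_ids)
print(recognized)
```

In a real deployment the step function would be the TrOCRDecoder attending over the TrOCREncoder's image features. The loop structure stays the same.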

    TorchScript → TFLite
    Inference Time: 163 ms
    Memory Usage: 6-312 MB
    Layers: 592 (NPU)

    Technical Details

    Model checkpoint: trocr-small-stage1
    Input resolution: 320x320
    Number of parameters (TrOCREncoder): 23.0M
    Model size (TrOCREncoder): 87.8 MB
    Number of parameters (TrOCRDecoder): 38.3M
    Model size (TrOCRDecoder): 146 MB
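
The listed model sizes line up with the parameter counts if the weights are stored as 32-bit floats. The short check below works through that arithmetic. It assumes 4 bytes per parameter and that "MB" here means mebibytes (1024 x 1024 bytes); the card states neither, so treat both as assumptions.

```python
MIB = 1024 * 1024  # bytes per mebibyte (assumed meaning of "MB" on this card)
BYTES_PER_FP32 = 4  # assumed weight precision: 32-bit float


def fp32_size_mib(num_params: float) -> float:
    """Approximate storage for float32 weights, in MiB."""
    return num_params * BYTES_PER_FP32 / MIB


encoder_mib = fp32_size_mib(23.0e6)  # TrOCREncoder: 23.0M parameters
decoder_mib = fp32_size_mib(38.3e6)  # TrOCRDecoder: 38.3M parameters

print(f"TrOCREncoder: {encoder_mib:.1f} MiB (listed: 87.8 MB)")
print(f"TrOCRDecoder: {decoder_mib:.1f} MiB (listed: 146 MB)")
```

The small gap on the encoder figure is what one would expect from the parameter count being rounded to three significant digits.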

    Applicable Scenarios

    • Publishing
    • Healthcare
    • Document Management

    Supported Form Factors

    • Phone
    • Tablet

    Licenses

    Source Model: MIT
    Deployable Model: AI Model Hub License

    Supported Devices

    • Google Pixel 3
    • Google Pixel 3a
    • Google Pixel 3a XL
    • Google Pixel 4
    • Google Pixel 4a
    • Google Pixel 5a 5G
    • QCS8550 (Proxy)
    • Samsung Galaxy S21
    • Samsung Galaxy S21 Ultra
    • Samsung Galaxy S21+
    • Samsung Galaxy S22 5G
    • Samsung Galaxy S22 Ultra 5G
    • Samsung Galaxy S22+ 5G
    • Samsung Galaxy S23
    • Samsung Galaxy S23 Ultra
    • Samsung Galaxy S23+
    • Samsung Galaxy S24
    • Samsung Galaxy S24 Ultra
    • Samsung Galaxy S24+
    • Samsung Galaxy Tab S8
    • Xiaomi 12
    • Xiaomi 12 Pro

    Supported Chipsets

    • Qualcomm® QCS8550
    • Snapdragon® 8 Gen 1 Mobile
    • Snapdragon® 8 Gen 2 Mobile
    • Snapdragon® 8 Gen 3 Mobile
    • Snapdragon® 888 Mobile