About this API

The Invoice OCR API extracts structured data from invoice documents. It uses Optical Character Recognition (OCR) to convert invoice images or PDF files into machine-readable text, then identifies and extracts key fields such as invoice numbers, dates, amounts, and supplier details. The result is a structured JSON document that can drive automated data entry into accounting systems, databases, or other business software.

Key Features

Key Field Extraction: Identifies and extracts common invoice fields, including invoice_id, invoice_date, due_date, supplier_name, supplier_address, customer_name, and customer_address.
Amount Recognition: Extracts financial figures and separates them into total_amount, subtotal_amount, and tax_amount.
Full Text Transcription: Provides the complete raw text extracted from the document, allowing for manual verification or searching of the entire document's contents.
Multi-Format Input: Processes invoices in several file formats, including PDF, JPG, and PNG, by accepting the file's raw data.
Structured JSON Output: Delivers all extracted information in a clearly structured JSON format, where each piece of data is paired with a type identifier for easy parsing and integration by software applications.
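
As a rough illustration of how type-tagged output could be consumed, the sketch below collapses a list of type/value pairs into a dictionary keyed by field type. The response shape used here (the "entities", "type", "value", and "text" keys) is an assumption made for the example and is not confirmed by this page; adjust the key names to match the actual response.

```python
import json

# Hypothetical response body: the "entities", "type", "value", and "text" keys
# are assumptions for illustration, not the API's documented schema.
SAMPLE_RESPONSE = """
{
  "entities": [
    {"type": "invoice_id", "value": "INV-1001"},
    {"type": "invoice_date", "value": "2024-03-01"},
    {"type": "supplier_name", "value": "Acme Supplies"},
    {"type": "total_amount", "value": "118.00"}
  ],
  "text": "Acme Supplies Invoice INV-1001 ..."
}
"""


def fields_by_type(response_body: str) -> dict[str, str]:
    """Collapse a list of type/value pairs into a dict keyed by type.

    If a type appears more than once, the first occurrence is kept.
    """
    payload = json.loads(response_body)
    fields: dict[str, str] = {}
    for entity in payload.get("entities", []):
        fields.setdefault(entity["type"], entity["value"])
    return fields


fields = fields_by_type(SAMPLE_RESPONSE)
print(fields["invoice_id"], fields["total_amount"])
```

Keeping the first occurrence of a repeated type is a simplification; a document with several candidate values for one field may need a different rule.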

Use Cases

Scenario 1: Automate Accounts Payable Data Entry

Situation: An accounting department needs to process hundreds of PDF invoices received from suppliers via email each day.
Implementation: An automated script monitors the company's accounts payable inbox. When a new email with an invoice attachment arrives, the script sends the PDF file to the Invoice OCR API. The API returns a JSON object. The script then reads the extracted supplier_name, invoice_id, and total_amount from the JSON to automatically create a new bill record in the company's accounting software, reducing manual data entry.
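
The record-creation step in this scenario might look roughly like the following. It assumes the extracted fields have already been collected into a plain dictionary of strings; BillRecord and build_bill_record are hypothetical names, and the call into the accounting software is left out because it depends on the product in use.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

REQUIRED_FIELDS = ("supplier_name", "invoice_id", "total_amount")


@dataclass(frozen=True)
class BillRecord:
    """Hypothetical bill record handed to the accounting software."""
    supplier_name: str
    invoice_id: str
    total_amount: Decimal


def build_bill_record(fields: dict[str, str]) -> BillRecord:
    """Turn extracted invoice fields into a bill record, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Strip thousands separators; assumes '.' is the decimal mark.
    raw_amount = fields["total_amount"].replace(",", "")
    try:
        amount = Decimal(raw_amount)
    except InvalidOperation as exc:
        raise ValueError(f"unparseable total_amount: {raw_amount!r}") from exc
    return BillRecord(
        supplier_name=fields["supplier_name"].strip(),
        invoice_id=fields["invoice_id"].strip(),
        total_amount=amount,
    )


bill = build_bill_record(
    {"supplier_name": "Acme Supplies", "invoice_id": "INV-1001", "total_amount": "1,180.50"}
)
print(bill)
```

Raising on missing or unparseable fields lets the script route such invoices to manual review instead of creating an incomplete bill.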

Scenario 2: Develop an Expense Management Application

Situation: A company is building a mobile app that allows employees to submit expense reports by taking photos of their invoices and receipts.
Implementation: When an employee captures an image of an invoice using the app, the image file is sent to the API. The API processes the image and returns the structured data. The mobile app then uses the extracted supplier_name, invoice_date, and total_amount to pre-fill the fields on the expense claim form, requiring the employee only to confirm the data and select an expense category.
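
The pre-fill step can be sketched as a mapping from extracted fields to form fields, together with a list of the fields the employee still has to complete. The form field names (merchant, date, amount, category) are hypothetical, and the input is assumed to be a plain dictionary of extracted strings.

```python
# Hypothetical mapping: expense form field -> extracted invoice field.
EXPENSE_FORM_FIELDS = {
    "merchant": "supplier_name",
    "date": "invoice_date",
    "amount": "total_amount",
}


def prefill_expense_form(fields: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Return (prefilled form values, form fields the employee must fill in)."""
    form: dict[str, str] = {"category": ""}  # always chosen by the employee
    needs_input = ["category"]
    for form_field, source_field in EXPENSE_FORM_FIELDS.items():
        value = (fields.get(source_field) or "").strip()
        form[form_field] = value
        if not value:
            # Extraction found nothing usable, so ask the employee.
            needs_input.append(form_field)
    return form, needs_input


form, needs_input = prefill_expense_form(
    {"supplier_name": "Cafe Uno", "total_amount": "12.40"}
)
print(form)
print(needs_input)  # ['category', 'date']
```

Returning the list of empty fields separately lets the app highlight exactly what was not recognized, rather than silently showing a blank form field.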

Scenario 3: Integrate Invoice Data with a Procurement System

Situation: A manufacturing company wants to automatically verify that incoming supplier invoices match the purchase orders (POs) issued by its procurement system.
Implementation: As invoices are received, they are digitized and sent to the API. The system extracts the invoice_id and total_amount from the API's JSON response, then searches its own database for a PO whose number and amount match those values. If the key details correspond, the invoice is automatically flagged as "verified" and queued for payment.
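
A minimal sketch of the matching step follows. It assumes the extracted identifier can be used directly as the PO lookup key and stands in an in-memory dictionary for the procurement database; verify_invoice, PURCHASE_ORDERS, and the "needs_review" status are hypothetical names introduced for the example.

```python
from decimal import Decimal

# Hypothetical stand-in for the procurement database: PO number -> PO amount.
PURCHASE_ORDERS = {
    "PO-7781": Decimal("2500.00"),
    "PO-7782": Decimal("149.99"),
}


def verify_invoice(
    reference: str,
    total_amount: str,
    purchase_orders: dict[str, Decimal],
    tolerance: Decimal = Decimal("0.00"),
) -> str:
    """Return "verified" when a PO with the same number and amount exists."""
    po_amount = purchase_orders.get(reference.strip())
    if po_amount is None:
        return "needs_review"  # no PO with that number
    if abs(Decimal(total_amount) - po_amount) > tolerance:
        return "needs_review"  # PO found, but the amounts differ
    return "verified"


print(verify_invoice("PO-7781", "2500.00", PURCHASE_ORDERS))  # verified
print(verify_invoice("PO-7782", "150.00", PURCHASE_ORDERS))   # needs_review
```

Amounts are compared as Decimal values rather than floats so that small rounding errors do not cause false mismatches; the optional tolerance allows a deliberate margin where one is acceptable.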

How it Works: Endpoints & Response

The API operates by receiving a document file at a specific endpoint and returning a detailed JSON object containing the extracted data.
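
The request side of that exchange could be prepared as sketched below. The endpoint URL and the API key header are placeholders, since neither is given in this section, and sending the file's raw bytes as the request body is an assumption based on the feature list. The example only builds the request and does not send it.

```python
import urllib.request
from pathlib import Path

# Placeholder values: the real endpoint URL and authentication header are not
# stated in this section, so both are hypothetical.
API_URL = "https://api.example.com/invoice-ocr"
API_KEY_HEADER = "X-Api-Key"

# Content types for the input formats listed under Key Features.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def build_ocr_request(filename: str, file_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build a POST request carrying the document's raw bytes as the body."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"unsupported file type: {suffix or filename!r}")
    return urllib.request.Request(
        API_URL,
        data=file_bytes,
        headers={"Content-Type": CONTENT_TYPES[suffix], API_KEY_HEADER: api_key},
        method="POST",
    )


request = build_ocr_request("invoice.pdf", b"%PDF-1.4 ...", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.loads() on the body.
```

Rejecting unsupported extensions before any network call keeps obviously invalid uploads from counting against the service.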
