Cloud OCR SDK Documentation

How to Recognize Receipts with ABBYY Cloud OCR SDK

Receipt recognition is a specific kind of document processing. Not every receipt can be carefully scanned; more often you need to recognize a photo taken with a mobile phone. The resulting picture is usually distorted and must be preprocessed. The location of the data fields is not fixed and depends on the country where the receipt was printed. All these conditions make data capture and recognition more complicated.

Using Cloud OCR SDK, you can recognize an image of a receipt and then extract data from the necessary fields, e.g. the total amount, the purchase type, the payment type, and the name of the organization that issued the receipt. With this API, you do not need to know the exact location of the fields: Cloud OCR SDK finds them for you and returns the values in XML format.


Before you start working with ABBYY Cloud OCR SDK, you should register on the site. Follow the link for registration.

During registration, the login and password for the Cloud OCR SDK site will be sent to your email. You will also create an Application ID and an Application Password. This information is required to access the processing server; see Authentication.
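
As an illustration of how the Application ID and Application Password might be passed to the server, here is a minimal Python sketch. It assumes the processing server accepts HTTP Basic authentication with the Application ID as the user name and the Application Password as the password; this page does not state that, so check the Authentication article before relying on it. The helper name basic_auth_header and the credential values are hypothetical.

```python
import base64


def basic_auth_header(application_id: str, application_password: str) -> dict:
    """Build an HTTP Basic Authorization header (assumed auth scheme)."""
    raw = f"{application_id}:{application_password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder credentials; substitute the values created during registration.
headers = basic_auth_header("my-app-id", "my-app-password")
```

The resulting dictionary can be attached to every request sent to the processing server.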

After registration, you can use Cloud OCR SDK for receipt recognition.

Recognizing Receipts

To recognize receipts, use the processReceipt method with recognition parameters suitable for your image:

  1. Specify one or more countries where the receipt was printed via the country parameter of the method. Separate multiple country names with commas, for example "taiwan,china".
  2. Specify whether the image is a photograph or a scanned image via the imageSource parameter. This determines which preprocessing operations can be applied to the image, such as automatic correction of distorted text lines, poor focus, and poor lighting in photos. You can also set the parameter to ‘auto’, in which case the image source is detected automatically.
  3. Specify whether the skew or the orientation of the image should be automatically detected and corrected.
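
The three settings above can be collected into a request URL. The following Python sketch shows one way to do it. The country and imageSource parameter names come from this page; the base URL, the 'photo' value, and the correctOrientation and correctSkew parameter names are assumptions made for illustration and should be checked against the processReceipt method reference.

```python
from urllib.parse import urlencode

# Assumption: base address of the processing server.
BASE_URL = "https://cloud.ocrsdk.com"


def build_process_receipt_url(countries, image_source="auto",
                              correct_orientation=True, correct_skew=True,
                              base_url=BASE_URL):
    """Compose a processReceipt URL from the recognition settings."""
    params = {
        # Several countries are joined with commas, e.g. "taiwan,china".
        "country": ",".join(countries),
        # 'auto' lets the service detect the image source itself.
        "imageSource": image_source,
        # Assumed parameter names for step 3.
        "correctOrientation": str(correct_orientation).lower(),
        "correctSkew": str(correct_skew).lower(),
    }
    # safe="," keeps the comma between country names unescaped.
    return f"{base_url}/processReceipt?{urlencode(params, safe=',')}"


url = build_process_receipt_url(["taiwan", "china"], image_source="photo")
```

The image itself would then be sent to this URL in the body of the request, together with the credentials created during registration.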

Call the processReceipt method with the specified parameters. A new processing task will be created on the server. Monitor the task status in a loop using the getTaskStatus method until the task is processed. You can find details on the main processing steps in How to Work with Cloud OCR SDK.
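
The monitoring loop could look like the Python sketch below. It makes several assumptions not stated on this page: that getTaskStatus returns XML with a task element carrying a status attribute, and that the status values are named as in the code. The function fetch_status is a hypothetical callable that performs the actual getTaskStatus request and returns the response body; here it is replaced with canned responses so that the sketch runs without network access.

```python
import time
import xml.etree.ElementTree as ET

# Assumed status names; verify them against the getTaskStatus reference.
DONE = "Completed"
FAILED = {"ProcessingFailed", "Deleted", "NotEnoughCredits"}


def parse_task(response_xml):
    """Return the attributes of the first <task> element as a dict."""
    task = ET.fromstring(response_xml).find("task")
    if task is None:
        raise ValueError("no <task> element in the response")
    return dict(task.attrib)


def wait_for_task(task_id, fetch_status, poll_interval=2.0, max_attempts=60):
    """Poll until the task completes, fails, or the attempts run out."""
    for _ in range(max_attempts):
        task = parse_task(fetch_status(task_id))
        status = task.get("status")
        if status == DONE:
            return task
        if status in FAILED:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish in time")


# Canned responses standing in for real getTaskStatus calls.
_responses = iter([
    '<response><task id="t1" status="Queued"/></response>',
    '<response><task id="t1" status="InProgress"/></response>',
    '<response><task id="t1" status="Completed"/></response>',
])
task = wait_for_task("t1", lambda task_id: next(_responses), poll_interval=0)
```

Bounding the number of attempts keeps a client from polling forever if a task never leaves the queue.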

The output XML file has the following format (the sample is abridged: closing tags and some nested elements are omitted):

<?xml version="1.0" encoding="UTF-8"?>
<receipts count="1" xmlns="">
 <receipt currency="USD">
  <vendor confidence="73.71695592" isSuspicious="false">
    <text><![CDATA[175 RANCH DR]]></text>
   <phone confidence="100" isSuspicious="false">
   <purchaseType>General Retail</purchaseType>
   <city confidence="20" isSuspicious="true">
   <zip confidence="63" isSuspicious="true">
     <text>CA 95035</text>
   <administrativeRegion confidence="100" isSuspicious="false">
  <total confidence="67" isSuspicious="true">
    <text>PA 93</text>
  <tax total="false" rate="8.75">
    <text>8.750% 2 01</text>
  <payment type="Undefined" confidence="0" isSuspicious="true">
     <text>PA 93</text>
  <recognizedItems count="3">
   <item index="1">
    <name confidence="0" isSuspicious="true">
     <text>TOY BRD 4LB</text>
    <total confidence="43" isSuspicious="true">
    <recognizedText><![CDATA[0073052151457 TOY BRD 4LB 11.89
F&F Savings 2.10-
(RETURN PRICE 11.89 EA)]]></recognizedText>
    <sku confidence="51" isSuspicious="true">
    <amountUnits confidence="0" isSuspicious="true">Unknown</amountUnits>

The elements and attributes are described in detail in Output XML with Receipt Data. See also the XSD schema of the output XML file.
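
To show how the extracted values might be read from the result, here is a Python sketch that parses a small hand-written fragment. The fragment reuses element and attribute names from the sample above, but it is a simplified, well-formed illustration, not actual service output, and its nesting is inferred from the indentation of the sample. The helper names read_field and summarize_receipt are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment modeled on the sample above.
SAMPLE = """<receipts count="1">
 <receipt currency="USD">
  <total confidence="67" isSuspicious="true">
   <text>PA 93</text>
  </total>
  <recognizedItems count="1">
   <item index="1">
    <name confidence="0" isSuspicious="true">
     <text>TOY BRD 4LB</text>
    </name>
   </item>
  </recognizedItems>
 </receipt>
</receipts>"""


def read_field(element):
    """Return the text, confidence, and suspicious flag of a field."""
    if element is None:
        return None
    text = element.findtext("text")
    return {
        "text": text.strip() if text else None,
        "confidence": float(element.get("confidence", "0")),
        "suspicious": element.get("isSuspicious") == "true",
    }


def summarize_receipt(receipt):
    """Collect the currency, the total, and the item names of one receipt."""
    return {
        "currency": receipt.get("currency"),
        "total": read_field(receipt.find("total")),
        "items": [read_field(item.find("name"))
                  for item in receipt.findall("recognizedItems/item")],
    }


root = ET.fromstring(SAMPLE)
summaries = [summarize_receipt(r) for r in root.findall("receipt")]
```

Keeping the isSuspicious flag next to each value makes it easy to route doubtful fields, such as the total in this fragment, to manual review.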

See a sample implementation of this procedure in C#.