
Cloud OCR SDK Documentation

How to Recognize Receipts with ABBYY Cloud OCR SDK

Receipt recognition is a specific kind of document processing. Not every receipt can be carefully scanned; more often, you need to recognize a photo taken with a mobile phone. The resulting picture is usually distorted and must be preprocessed. The location of the data fields is not fixed and depends on the country where the receipt was printed. All these conditions make data capture and recognition more complicated.

Using Cloud OCR SDK, you can recognize an image of a receipt and then extract data from the necessary fields, e.g. the total amount, the type of purchase, the payment type, and the name of the organization that issued the receipt. With this API, there is no need to know the exact location of the fields: Cloud OCR SDK will find them for you and return the values in XML format.

Preconditions

Before you start working with ABBYY Cloud OCR SDK, you should register on the site. Follow the link for registration.

During registration, the login and password for the Cloud OCR SDK site will be sent to your email. You will also create an Application ID and an Application Password. This information is necessary to access the processing server; see Authentication.

After registration you can use Cloud OCR SDK for receipt recognition.

Recognizing Receipts

To recognize receipts, use the processReceipt method with recognition parameters suitable for your image:

  1. Specify one or more countries where the receipt was printed via the country parameter of the method. Separate several country names with commas, for example "taiwan,china".
  2. Specify whether the image is a photograph or a scanned image via the imageSource parameter. This affects the preprocessing operations that can be performed on the image, such as automatic correction of distorted text lines, poor focus, and poor lighting in photos. You may set the parameter to the ‘auto’ value; in this case the image source will be detected automatically.
  3. Specify whether the skew or the orientation of the image should be automatically detected and corrected.
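
The three choices above can be captured in a small helper that assembles the query string for a processReceipt call. This is a minimal sketch in Python, not the SDK's own client. The host in BASE_URL and the parameter names correctOrientation and correctSkew are assumptions to be checked against the processReceipt method description; only country and imageSource are named in this article.

```python
from urllib.parse import urlencode

# Assumption: placeholder host; use the processing server URL for your application.
BASE_URL = "https://cloud.ocrsdk.com"


def build_process_receipt_url(countries, image_source="auto",
                              correct_orientation=True, correct_skew=True,
                              base_url=BASE_URL):
    """Assemble a processReceipt request URL from the recognition settings."""
    params = {
        # Several countries are passed as one comma-separated value.
        "country": ",".join(countries),
        # "auto" asks the service to detect photo vs. scan itself.
        "imageSource": image_source,
        # Assumed parameter names for step 3 (not given in this article).
        "correctOrientation": "true" if correct_orientation else "false",
        "correctSkew": "true" if correct_skew else "false",
    }
    # safe="," keeps the comma in "taiwan,china" unescaped.
    return f"{base_url}/processReceipt?{urlencode(params, safe=',')}"


print(build_process_receipt_url(["taiwan", "china"]))
```

The image itself and the Application ID and Application Password would still have to be sent with the request; that part is omitted here because it depends on the HTTP client you use.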

Call the processReceipt method with the specified parameters. A new processing task will be created on the server. Monitor the task status in a loop using the getTaskStatus method until the task is processed. You can find details on the main processing steps in How to Work with Cloud OCR SDK.
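
The status-monitoring loop described above might look like the following Python sketch. It is deliberately independent of any HTTP library: fetch_status is a hypothetical callable that you supply, which calls getTaskStatus for a task ID and returns the status string. The status names used here (Submitted, Queued, InProgress, Completed) are assumptions and should be confirmed in the getTaskStatus reference.

```python
import time

# Assumed names of the states in which a task is still waiting or running.
PENDING_STATUSES = {"Submitted", "Queued", "InProgress"}


def wait_for_task(fetch_status, task_id, poll_interval=2.0, timeout=120.0,
                  sleep=time.sleep):
    """Poll fetch_status(task_id) until the task leaves the pending states.

    Returns the final status string, or raises TimeoutError if the task is
    still pending after roughly `timeout` seconds of waiting.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status not in PENDING_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {status} after {waited:.0f}s")
        sleep(poll_interval)
        waited += poll_interval


# Demonstration with a fake status source instead of a real server call.
fake_statuses = iter(["Queued", "InProgress", "Completed"])
final = wait_for_task(lambda _task_id: next(fake_statuses), "demo-task",
                      sleep=lambda _seconds: None)
print(final)
```

A status outside the pending set does not necessarily mean success, so the caller should check the returned value before downloading the result.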

The output XML file has the following format:

<?xml version="1.0" encoding="UTF-8"?>
<receipts count="1" xmlns="http://www.abbyy.com/ReceiptCaptureSDK_xml/ReceiptCapture-1.0.xsd">
 <receipt currency="USD">
  <date>
   <normalizedValue>2011-04-30</normalizedValue>
   <recognizedValue>
    <text>04/30/2011</text>
   </recognizedValue>
  </date>
  <total>
   <normalizedValue>54.26</normalizedValue>
   <recognizedValue>
    <text>$54.26</text>
   </recognizedValue>
  </total>
  <payment type="Card" cardType="Undefined">
   <value>
    <normalizedValue>54.26</normalizedValue>
    <recognizedValue>
     <text>$54.26</text>
    </recognizedValue>
   </value>
  </payment>
  <recognizedItems count="24">
   <item index="1">
    <name>
     <text/>
    </name>
    <price>
     <normalizedValue>0.00</normalizedValue>
     <recognizedValue>
      <text/>
     </recognizedValue>
    </price>
    <total>
     <normalizedValue>0.00</normalizedValue>
     <recognizedValue>
      <text/>
     </recognizedValue>
    </total>
    <recognizedText><![CDATA[CLEANING SUPPLIES]]></recognizedText>
   </item>
   ...
  </recognizedItems>
 </receipt>
</receipts>

See the XSD schema of the output XML file.
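
Because the output declares a default XML namespace, element lookups have to be namespace-qualified. The sketch below, which uses only Python's standard library, lists the line items of each receipt. The SAMPLE string is a shortened copy of the output shown above, and the helper name list_items is illustrative rather than part of the SDK.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the sample output.
NS = {"r": "http://www.abbyy.com/ReceiptCaptureSDK_xml/ReceiptCapture-1.0.xsd"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<receipts count="1" xmlns="http://www.abbyy.com/ReceiptCaptureSDK_xml/ReceiptCapture-1.0.xsd">
 <receipt currency="USD">
  <recognizedItems count="1">
   <item index="1">
    <name><text/></name>
    <recognizedText><![CDATA[CLEANING SUPPLIES]]></recognizedText>
   </item>
  </recognizedItems>
 </receipt>
</receipts>
"""


def list_items(xml_text):
    """Return (index, recognized text) pairs for every line item."""
    root = ET.fromstring(xml_text)
    items = []
    for receipt in root.findall("r:receipt", NS):
        for item in receipt.findall("r:recognizedItems/r:item", NS):
            # The parser unwraps the CDATA section into plain text.
            text = item.findtext("r:recognizedText", default="", namespaces=NS)
            items.append((int(item.get("index")), text))
    return items


for index, text in list_items(SAMPLE):
    print(index, text)
```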

Every recognized field in the XML file is available in two forms:

  • normalized value – the data is brought to a standard form and can be used for further calculations;
  • recognized value – the data is exported in a readable form for working with the text, which can be convenient, for example, for manual verification.
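
To illustrate the difference between the two forms, the following Python sketch reads both values of the date and total fields and converts the normalized ones into typed objects. It assumes that normalized dates are ISO formatted (YYYY-MM-DD) and that amounts are plain decimals, as in the sample output; read_field is a hypothetical helper, not an SDK function.

```python
from datetime import date
from decimal import Decimal
import xml.etree.ElementTree as ET

NS = {"r": "http://www.abbyy.com/ReceiptCaptureSDK_xml/ReceiptCapture-1.0.xsd"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<receipts count="1" xmlns="http://www.abbyy.com/ReceiptCaptureSDK_xml/ReceiptCapture-1.0.xsd">
 <receipt currency="USD">
  <date>
   <normalizedValue>2011-04-30</normalizedValue>
   <recognizedValue><text>04/30/2011</text></recognizedValue>
  </date>
  <total>
   <normalizedValue>54.26</normalizedValue>
   <recognizedValue><text>$54.26</text></recognizedValue>
  </total>
 </receipt>
</receipts>
"""


def read_field(receipt, name):
    """Return (normalized, recognized) strings for a top-level receipt field."""
    field = receipt.find(f"r:{name}", NS)
    if field is None:
        return None, None
    normalized = field.findtext("r:normalizedValue", namespaces=NS)
    recognized = field.findtext("r:recognizedValue/r:text", namespaces=NS)
    return normalized, recognized


receipt = ET.fromstring(SAMPLE).find("r:receipt", NS)
total_norm, total_raw = read_field(receipt, "total")
date_norm, date_raw = read_field(receipt, "date")

# Normalized values are suitable for calculations...
total = Decimal(total_norm)
purchased = date.fromisoformat(date_norm)
# ...while recognized values keep the text as printed, e.g. for a review screen.
print(f"{purchased} {total} {receipt.get('currency')} (printed as {date_raw}, {total_raw})")
```

Decimal is used instead of float so that monetary amounts are not affected by binary rounding.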

See the description of the receipt fields, tag attributes and the components of a line item in the processReceipt method description.

See a sample implementation of this procedure in C#.