Cloud OCR SDK Documentation

Export Formats

ABBYY Cloud OCR SDK allows you to export recognized text in the following formats:

| Format | exportFormat parameter of the processing method | Comments |
|---|---|---|
| TXT | txt | |
| RTF | rtf | |
| DOCX | docx | |
| XLSX | xlsx | |
| PPTX | pptx | |
| PDF | pdfSearchable | The entire image is saved as a picture, with recognized text put under the image. |
| PDF | pdfTextAndImages | The recognized text is saved as text, and the pictures are embedded as images. |
| PDF/A-1b | pdfa | The file is saved in PDF/A-1b-compliant format, with the entire image saved as a picture and recognized text put under it. |
| XML | xml | All coordinates are saved relative to the original image. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
| XML | xmlForCorrectedImage | All coordinates are saved relative to the image after geometry correction. See Output XML Document for the description of tags. If you select this export format, barcodes are recognized on the image and saved to output XML no matter which profile is used for recognition. |
| ALTO | alto | |
| vCard | vCard | This format is only available with the processBusinessCard method. |
| CSV | csv | This format is only available with the processBusinessCard method. |

You can use any of these formats except vCard and CSV to export recognized text with the processImage and processDocument methods. These methods also let you specify up to three export formats in one task at no additional cost.
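The rules above (at most three formats per task, with vCard and CSV excluded for processImage and processDocument) can be checked on the client before a task is submitted. The following Python sketch is illustrative only and is not part of the SDK: the helper names `export_format_param` and `query_string` are hypothetical, and it assumes that several formats are passed as a single comma-separated exportFormat value, which this page does not state. It builds the parameter string only and sends no request.

```python
from urllib.parse import urlencode

# exportFormat values listed above that processImage and processDocument accept.
GENERAL_FORMATS = frozenset({
    "txt", "rtf", "docx", "xlsx", "pptx",
    "pdfSearchable", "pdfTextAndImages", "pdfa",
    "xml", "xmlForCorrectedImage", "alto",
})
# Values listed above as available only with processBusinessCard.
BUSINESS_CARD_ONLY = frozenset({"vCard", "csv"})
# Limit stated above for processImage and processDocument.
MAX_FORMATS_PER_TASK = 3


def export_format_param(formats):
    """Validate formats for processImage/processDocument and join them.

    Assumption: several formats are sent as one comma-separated value.
    Names are compared exactly as spelled in the list above.
    """
    formats = list(formats)
    if not 1 <= len(formats) <= MAX_FORMATS_PER_TASK:
        raise ValueError(
            f"expected 1 to {MAX_FORMATS_PER_TASK} formats, got {len(formats)}"
        )
    for name in formats:
        if name in BUSINESS_CARD_ONLY:
            raise ValueError(f"{name} is only available with processBusinessCard")
        if name not in GENERAL_FORMATS:
            raise ValueError(f"unknown export format: {name}")
    return ",".join(formats)


def query_string(formats):
    """Return a URL query string carrying the validated exportFormat value."""
    # safe="," keeps the separator readable instead of encoding it as %2C.
    return urlencode({"exportFormat": export_format_param(formats)}, safe=",")


if __name__ == "__main__":
    print(query_string(["txt", "pdfSearchable", "xml"]))
```

Rejecting an invalid combination locally avoids submitting a task that the service would refuse; the resulting query string would be appended to the URL of the processing method in use.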

Note that the various field processing methods always return recognition results in XML format, which contains extended information about the recognized characters.