Cloud OCR SDK Documentation

Output XML Document

The processImage and processDocument methods can return recognized text in XML format if the exportFormat parameter is set to xml or xmlForCorrectedImage. The XML describes the recognized text together with its structure and parameters.

The table below describes the main tags of this XML file. See also the XML schema of the output document.

Name Description
document The root tag. Represents a recognized document. Contains a sequence of page elements. The tag has the following attributes:
  • version — XML version
  • producer — the producer of the XML file
  • languages — (optional) all languages of the document
page Recognized page. It is a sequence of block tags. The tag can have the following attributes:
  • width — the image width in pixels
  • height — the image height in pixels
  • resolution — the image resolution in pixels per inch
  • originalCoords — (optional) if the value is true, all coordinates are relative to the original image before opening (this is the case if you set exportFormat to xml); if it is false, they are relative to the opened (deskewed) image (this is the case if you set exportFormat to xmlForCorrectedImage)
  • rotation — (optional) the type of rotation applied to the original page image. Can have one of the following values: Normal, RotatedClockwise, RotatedUpsidedown, RotatedCounterclockwise (the default value is Normal)
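The page attributes above can be combined to get the physical page size. The following is a minimal sketch, assuming the XML has been saved to a string; the sample document, the version and producer values, and the helper names local and page_info are hypothetical. Tag names are matched without their namespace because real output may declare a default XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real output contains block elements inside each page.
SAMPLE = """<document version="1.0" producer="example">
  <page width="2400" height="3300" resolution="300" originalCoords="true"/>
</document>"""


def local(tag):
    """Drop a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def page_info(xml_text):
    """Return one dict per page: size in inches plus the rotation value."""
    pages = []
    for el in ET.fromstring(xml_text).iter():
        if local(el.tag) != "page":
            continue
        dpi = int(el.get("resolution"))  # pixels per inch
        pages.append({
            "width_in": int(el.get("width")) / dpi,
            "height_in": int(el.get("height")) / dpi,
            # rotation is optional; the table gives Normal as the default.
            "rotation": el.get("rotation", "Normal"),
        })
    return pages


print(page_info(SAMPLE))
```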
block
(BlockType)

Recognized block. Each such tag includes the region element, which specifies the region of the block on an image.

The tag has the blockType attribute, which denotes the type of the block: Text, Table, Picture, Barcode, Separator, SeparatorsBox. The value of this attribute defines which elements the tag includes:

  • text — available only if blockType attribute is Text
  • row — available only if blockType attribute is Table
  • separatorsBox — available only if blockType attribute is SeparatorsBox
  • separator — available only if blockType attribute is Separator
region Block region, a set of rectangles. Includes one or several rect elements.
rect

Rectangle of a block region.

The tag has the following attributes:

  • l — the coordinate of the left border of the rectangle
  • t — the coordinate of the top border of the rectangle
  • r — the coordinate of the right border of the rectangle
  • b — the coordinate of the bottom border of the rectangle
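Because a region may consist of several rectangles, a single bounding box for a block has to be computed from all of its rect elements. The sketch below shows one way to do this; the sample block and the helper names local and region_bounds are hypothetical, and coordinates are assumed to be integers.

```python
import xml.etree.ElementTree as ET

# Hypothetical L-shaped text block made of two rectangles.
SAMPLE = """<block blockType="Text">
  <region>
    <rect l="100" t="50" r="400" b="120"/>
    <rect l="100" t="120" r="250" b="300"/>
  </region>
</block>"""


def local(tag):
    """Drop a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def region_bounds(block):
    """Return (l, t, r, b) enclosing every rect of the block's region."""
    rects = [
        rect
        for child in block if local(child.tag) == "region"
        for rect in child if local(rect.tag) == "rect"
    ]
    if not rects:
        return None
    return (
        min(int(r.get("l")) for r in rects),
        min(int(r.get("t")) for r in rects),
        max(int(r.get("r")) for r in rects),
        max(int(r.get("b")) for r in rects),
    )


print(region_bounds(ET.fromstring(SAMPLE)))
```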
text
(TextType)
Text of a recognized text block or text of a table cell. Contains par elements.

The tag can have the following attributes:

  • orientation — (optional) the text orientation. Can have one of the following values: Normal, RotatedClockwise, RotatedUpsidedown, RotatedCounterclockwise (the default value is Normal)
  • mirrored — (optional) specifies if the text is mirrored (the default value is false)
  • inverted — (optional) specifies if the text is inverted (the default value is false)
par
(ParagraphType)
Paragraph of a recognized text. Contains line elements.

The tag can have the following attributes:

  • dropCapCharsCount — (optional) the number of drop caps in the paragraph (the default value is 0)
  • dropCap-l — (optional) the left coordinate of the drop cap rectangle
  • dropCap-t — (optional) the top coordinate of the drop cap rectangle
  • dropCap-r — (optional) the right coordinate of the drop cap rectangle
  • dropCap-b — (optional) the bottom coordinate of the drop cap rectangle
  • align — (optional) the paragraph alignment. Can have one of the following values: Left, Center, Right, Justified (the default value is Left)
  • leftIndent — (optional) the left paragraph indent (the default value is 0)
  • rightIndent — (optional) the right paragraph indent (the default value is 0)
  • startIndent — (optional) the indent of the first line of the paragraph (the default value is 0)
  • lineSpacing — (optional) the spacing between lines (the default value is 0)
line
(LineType)
Line of a paragraph. Contains formatting elements.

The tag has the following attributes:

  • baseline — the distance from the baseline to the top edge of the page
  • l — the coordinate of the left border of the surrounding rectangle
  • t — the coordinate of the top border of the surrounding rectangle
  • r — the coordinate of the right border of the surrounding rectangle
  • b — the coordinate of the bottom border of the surrounding rectangle
formatting
(FormattingType)
Group of characters with uniform formatting. It is a group of charParams elements.

It has the lang attribute, which specifies the name of the language used for recognition.

charParams
(CharParamsType)
Attributes of a single character. The tag can include a charRecVariants element (if the xml:writeRecognitionVariants parameter of a processing method has been set to true).

The tag can have the following attributes:

  • l — the coordinate of the left border of the character rectangle
  • t — the coordinate of the top border of the character rectangle
  • r — the coordinate of the right border of the character rectangle
  • b — the coordinate of the bottom border of the character rectangle
  • suspicious — (optional) true if the character was recognized uncertainly
  • isTab — (optional) this property set to true means that the character is a tab
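The line, formatting, and charParams tags nest, so the text of a line can be rebuilt by walking its charParams elements in order. The sketch below rests on two assumptions that the table does not state: the recognized character is the text content of charParams, and no charRecVariants children are present. The sample line, the lang value, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical line; the lang value is illustrative only.
SAMPLE = """<line baseline="118" l="100" t="90" r="145" b="120">
  <formatting lang="English">
    <charParams l="100" t="90" r="115" b="118">C</charParams>
    <charParams l="117" t="98" r="130" b="118" suspicious="true">a</charParams>
    <charParams l="132" t="98" r="145" b="118">t</charParams>
  </formatting>
</line>"""

TRUE_VALUES = ("true", "1")  # both are valid XML Schema boolean spellings


def local(tag):
    """Drop a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def line_text(line):
    """Return (text, indexes of characters flagged as suspicious)."""
    chars, suspicious = [], []
    for el in line.iter():
        if local(el.tag) != "charParams":
            continue
        if el.get("suspicious") in TRUE_VALUES:
            suspicious.append(len(chars))
        # isTab marks a tab character; otherwise use the element text.
        chars.append("\t" if el.get("isTab") in TRUE_VALUES else (el.text or ""))
    return "".join(chars), suspicious


print(line_text(ET.fromstring(SAMPLE)))
```

The suspicious indexes can be used to highlight characters that may need manual verification.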
charRecVariants

Recognition variants of a character (available only if the xml:writeRecognitionVariants parameter of a processing method has been set to true). Contains charRecVariant elements. Has no attributes.

charRecVariant
(CharRecognitionVariant)

A single recognition variant of a character (available only if the xml:writeRecognitionVariants parameter of a processing method has been set to true).

The tag can have the following attributes:

  • charConfidence — the estimate of probability that this recognition variant is correct
  • serifProbability — the estimate of probability that this character is set in a serif font
row
(TableRowType)
Table row (available if blockType attribute is Table). Includes cell elements. Has no attributes.
cell Table cell (available if blockType attribute is Table). It is a sequence of text tags.

The tag can have the following attributes:

  • colSpan — (optional) column span
  • rowSpan — (optional) row span
  • align — (optional) the vertical alignment of the cell contents. Can have one of the following values: Top, Center, Bottom (the default value is Top)
  • picture — (optional) specifies if the cell contains only a picture (the default value is false)
  • leftBorder — (optional) the table cell left border type. Can have one of the following values: Absent, Unknown, White, Black (the default value is Black)
  • topBorder — (optional) the table cell top border type. Can have one of the following values: Absent, Unknown, White, Black (the default value is Black)
  • rightBorder — (optional) the table cell right border type. Can have one of the following values: Absent, Unknown, White, Black (the default value is Black)
  • bottomBorder — (optional) the table cell bottom border type. Can have one of the following values: Absent, Unknown, White, Black (the default value is Black)
  • width — the width of the cell
  • height — the height of the cell
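Since most cell attributes are optional, code that reads them has to apply the documented defaults itself. The following is a minimal sketch of that step; the sample cell and the names CELL_DEFAULTS and cell_attributes are hypothetical. No defaults are assumed for colSpan and rowSpan because the table does not give them.

```python
import xml.etree.ElementTree as ET

# Hypothetical cell that overrides only one border and sets a column span.
SAMPLE = '<cell width="500" height="120" colSpan="2" leftBorder="Absent"/>'

# Defaults taken from the attribute list above.
CELL_DEFAULTS = {
    "align": "Top",
    "picture": "false",
    "leftBorder": "Black",
    "topBorder": "Black",
    "rightBorder": "Black",
    "bottomBorder": "Black",
}


def cell_attributes(cell):
    """Return the cell attributes with documented defaults filled in."""
    attrs = dict(CELL_DEFAULTS)
    attrs.update(cell.attrib)  # explicit values win over defaults
    return attrs


attrs = cell_attributes(ET.fromstring(SAMPLE))
print(attrs["leftBorder"], attrs["topBorder"], attrs["align"])
```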
separatorsBox Group of separators (available if blockType attribute is SeparatorsBox). It is a sequence of separator tags. Has no attributes.
separator
(SeparatorBlockType)
Single separator (available if blockType attribute is Separator) or separator in a group of separators. Includes start and end elements. Has the following attributes:
  • thickness — specifies the precise width of the separator in pixels
  • type — specifies the type of the separator. Can have one of the following values: Unknown, Black, Dotted
start
(Point type)
Start point of a separator. Has the following attributes:
  • x — the horizontal coordinate of the start point of the separator
  • y — the vertical coordinate of the start point of the separator
end
(Point type)
End point of a separator. Has the following attributes:
  • x — the horizontal coordinate of the end point of the separator
  • y — the vertical coordinate of the end point of the separator
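The start and end points give the geometry of a separator, so its length follows from the distance between them. The sketch below illustrates this; the sample separator and the helper names local and separator_length are hypothetical, and coordinates are assumed to be integers in the same units as the other coordinates in the file.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical separator running from (100, 200) to (400, 600).
SAMPLE = """<separator thickness="2" type="Black">
  <start x="100" y="200"/>
  <end x="400" y="600"/>
</separator>"""


def local(tag):
    """Drop a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def separator_length(separator):
    """Return the distance between the start and end points."""
    points = {
        local(child.tag): (int(child.get("x")), int(child.get("y")))
        for child in separator
        if local(child.tag) in ("start", "end")
    }
    (x0, y0), (x1, y1) = points["start"], points["end"]
    return math.hypot(x1 - x0, y1 - y0)


print(separator_length(ET.fromstring(SAMPLE)))
```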