Optical Character Recognition, or OCR, is a technology that converts different types of documents, such as scanned paper documents or PDF files, into editable and searchable data.
When a document, say a PDF, is loaded into the software, it analyses the document and divides it into elements such as text blocks, tables and images. It then divides the lines of text into words, and the words into characters.
From there, it applies pattern recognition to work out what each character is and presents you with the recognised text.
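The breakdown into lines, words and characters described above can be pictured with a toy example. This is only a sketch of the nesting: it works on text that is already digital, whereas a real OCR engine segments pixels, and `segment` is a hypothetical helper name rather than part of any OCR product.

```python
def segment(text_block: str) -> list[list[list[str]]]:
    """Split a text block into lines, each line into words,
    and each word into its individual characters."""
    return [
        [list(word) for word in line.split()]
        for line in text_block.splitlines()
        if line.strip()  # skip empty lines
    ]


# Hypothetical text block, as if it had been located on a page.
block = "Invoice 42\nTotal due"
lines = segment(block)

for line_number, words in enumerate(lines, start=1):
    print(line_number, ["".join(chars) for chars in words])
```

Each character in the innermost lists is the unit that pattern recognition would then classify.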
STP stands for Straight Through Processing.
STP Rate = the proportion of documents that are processed end to end without human intervention, usually expressed as a percentage.
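Expressed as a rate, this is usually the count of straight-through documents divided by the total number of documents processed. Here is a minimal sketch of that arithmetic, assuming a hypothetical `ProcessedDocument` record with a `needed_human_review` flag; neither name comes from any particular OCR or RPA product, and the file names are made up.

```python
from dataclasses import dataclass


@dataclass
class ProcessedDocument:
    name: str
    needed_human_review: bool  # hypothetical flag set by the workflow


def stp_rate(documents: list[ProcessedDocument]) -> float:
    """Return the fraction of documents processed without human intervention."""
    if not documents:
        return 0.0  # assumption: an empty batch counts as a rate of zero
    straight_through = sum(1 for d in documents if not d.needed_human_review)
    return straight_through / len(documents)


# Illustrative batch: three of four documents went straight through.
batch = [
    ProcessedDocument("invoice_001.pdf", needed_human_review=False),
    ProcessedDocument("invoice_002.pdf", needed_human_review=True),
    ProcessedDocument("po_003.pdf", needed_human_review=False),
    ProcessedDocument("claim_004.pdf", needed_human_review=False),
]
print(f"STP rate: {stp_rate(batch):.0%}")  # prints "STP rate: 75%"
```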
Most OCR software focuses only on producing accurate results. Combining it with RPA (Robotic Process Automation) goes further: documents are digitised and the extracted information is then used automatically.
What sorts of documents could you use OCR/RPA on?
There is a wide range of documents you could process with the combination of OCR and RPA. Think about documents such as Purchase Orders, Billing Statements, Contracts, Claims, Automobile Insurance Claims, Health Insurance Claims and Invoices.
What type of files can be read by OCR software?
JPG or JPEG, PDF (Vector PDF, Raster PDF or Hybrid PDF), PNG, and TIF or TIFF.
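If you want to screen files before sending them to OCR, a simple extension check is one option. The sketch below assumes the supported formats are exactly those listed above; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the check looks only at the file name, not at the file's actual contents.

```python
from pathlib import Path

# Extensions taken from the list above (assumed to be complete).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".pdf", ".png", ".tif", ".tiff"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["invoice.pdf", "scan.TIFF", "notes.docx"]:
    print(name, is_supported(name))
```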
What if my scanned document is not correctly oriented?
Using its processing logic, the Learning Instance automatically rotates the document to the correct upright orientation.
Does OCR support handwritten documents?
It's possible, but as a general rule of thumb we advise that handwritten documents are not suitable for OCR extraction, and the same can be said for cursive fonts.
Can OCR be used to extract data in table format?
Yes, OCR can extract a virtually unlimited number of tables. Even when the same document contains several different table types, they can be extracted easily.