Ok Chris, you talk the talk, but what is it?

by Ilya Evdokimov | Dec 28, 2009 | OCR

The readers of this blog are varied. Some know what OCR and Data Capture are, some do not. Some know they need it but not necessarily how to use it. Others know how powerful it is and have a good understanding of what is out there, but not the best practices. So, taking a step back, let me tell you what it’s all about. It’s about saving money and reducing the costs associated with paper-based operations.

OCR is commonly used to encompass all of the recognition technologies out there. It specifically stands for Optical Character Recognition. This is simply the process of taking an image, whether scanned or received digitally, and converting it to text. While OCR can be used to mean ICR, OCR, Data Capture, OMR, and barcode processing, it is really the process of extracting ALL of the typographic text from an image document and converting it to a digital format. ICR is hand-print extraction, OMR is filled-in bubble extraction, and barcode is, well, barcode extraction. Together with OCR, these latter recognition technologies make up Data Capture.

Data Capture is the process of extracting field/data pairs to be exported in a structured format. It does not necessarily have to get all the information on a document, and what it extracts is largely dictated by business processes. Data Capture incorporates ICR, OMR, barcode, and OCR to extract the data from fields. Fixed-form Data Capture deals with forms that don’t change from page to page and are usually hand-printed. Semi-structured forms are 80% of the documents someone sees. Data Capture is usually a more complex technology than plain full-page OCR.
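To make the difference between full-page OCR and Data Capture a little more concrete, here is a minimal Python sketch. It assumes the OCR step has already turned the page into plain text, and it only pulls out a few named fields and exports them as a structured record. The field names, the regular expressions, and the sample invoice text are all made up for illustration; a real Data Capture product locates fields in far more robust ways.

```python
import json
import re

# Hypothetical field definitions: field name -> regex with one capture group.
# These patterns are illustrative assumptions, not part of any product.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*(\S+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
    "total": re.compile(
        r"Total\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def capture_fields(ocr_text):
    """Return a dict of field/data pairs found in already-recognized text.

    Fields that cannot be located are returned as None rather than guessed.
    """
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up full-page OCR output for a simple invoice.
sample_text = """ACME Supply Co.
Invoice No: INV-1042
Date: 12/28/2009
Thank you for your business.
Total Due: $1,250.00
"""

record = capture_fields(sample_text)

# Export only the captured fields, not the whole page, in a structured format.
print(json.dumps(record, indent=2))
```

Notice that the line "Thank you for your business." never makes it into the output. Full-page OCR would keep it; Data Capture only keeps what the business process asks for.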

So there you have it. This is why you are reading this blog: to learn about the specifics, nuances, and best practices of these technologies.

Chris Riley – Sr. Solutions Architect