Turning off the latest technology

by Ilya Evdokimov | Feb 17, 2010 | OCR

Our culture is built on the belief that newer and more means better. For advanced technology this is mostly true, but people are always surprised when I tell them that disabling some of the newer technology will actually produce a better result. Here are three examples where the technology demands a step back to older approaches for higher accuracy.

In data capture and OCR, there is a component of the technology called document analysis. Before any data is collected, document analysis works out the structure of a page: columns, rows, tables, pictures, paragraphs, lines, and so on. It is the biggest contributor to modern OCR accuracy. Document analysis is really designed for traditional documents such as an article, a book page, or a letter. With the exception of some specialized versions, it does not excel at form-type documents. One of the most difficult documents in the world is the Explanation of Benefits (EOB), which typically has its own structure in every variant. Surprisingly, the best way to process such a document is to turn off document analysis and default to a basic full-page read of the text. The reason is that document analysis is heavily biased toward finding tables, and no EOB matches the tables it expects.

The same is true when reading text from photographs. Reading license plates and product plates (serial number plates welded or stuck to many products) during assembly is best done with engines that do not have document analysis. In this case, document analysis tries too hard to find structure. Because of the nature of these images, what ends up happening is that the text in the photo is split into multiple lines and fragmented characters. Without document analysis, the engine sees the whole image as one text block and just reads it, which produces better results. The license-plate readers that snap pictures of your plate at toll booths all use older, antiquated OCR technology. By turning off document analysis, they could use the newer engines instead.
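Both cases above come down to the same switch, so here is a minimal sketch of what flipping it can look like. The article does not name an engine; the open-source Tesseract command-line tool is used here purely as an assumption, because its `--psm` (page segmentation mode) option is a well-known way to skip automatic layout detection and read the image as one block or one line. The helper names are hypothetical, and mode 6 is the closest Tesseract equivalent to "turning off document analysis", not necessarily what a commercial engine does.

```python
import subprocess

# Tesseract page segmentation modes (values of its --psm option).
PSM_AUTO = 3          # fully automatic page segmentation (Tesseract's default)
PSM_SINGLE_BLOCK = 6  # assume a single uniform block of text
PSM_SINGLE_LINE = 7   # treat the image as a single text line


def tesseract_command(image_path, psm=PSM_AUTO):
    """Build the tesseract CLI call that prints recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "--psm", str(psm)]


def read_text(image_path, psm=PSM_AUTO):
    """Run the command and return the text; needs the tesseract binary on PATH."""
    result = subprocess.run(
        tesseract_command(image_path, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Under these assumptions, an EOB page would be read with `read_text(path, psm=PSM_SINGLE_BLOCK)` and a one-line serial plate with `read_text(path, psm=PSM_SINGLE_LINE)`. Comparing that output against the default mode on your own samples is the only real way to tell which setting wins.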

Finally, there is mobility. This one makes a lot of people uncomfortable. Our society wants to believe a cell phone can do anything. Just today I was wondering why my cell phone did not brush my teeth for me. You can have your cell phone do OCR, sure, but it requires older, smaller, and more limited OCR engines to do so. I prefer to send an image to a server and use more advanced OCR. Many people demand OCR on the phone itself, though in practice it is usually slower. The reason is that OCR requires a certain amount of processing power, and specific types of processing. Chips in phones today, and likely for a very long time to come, will not compete with the power of a computer. More importantly, they will not include the math operations needed for efficient, math-heavy modern OCR. Cell phones cannot adopt such chips because we demand long-lasting batteries, small size, and low cost. Intense math is simply not important for 99.9% of mobile applications.
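For readers who want to picture the send-it-to-a-server approach, here is a rough sketch using only Python's standard library. Everything specific in it is an assumption made for illustration: the endpoint URL is a placeholder, the function names are hypothetical, and the service is imagined to accept raw image bytes and reply with plain UTF-8 text.

```python
import urllib.request

# Hypothetical endpoint; substitute the URL of your own OCR service.
OCR_URL = "https://ocr.example.com/recognize"


def build_ocr_request(image_bytes, url=OCR_URL):
    """Wrap raw image bytes in an HTTP POST request for the server."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def recognize_remotely(image_bytes, url=OCR_URL, timeout=30):
    """Send the image and return the response body as text.

    Assumes the (hypothetical) service replies with UTF-8 plain text.
    """
    request = build_ocr_request(image_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The phone's only job in this arrangement is to capture the picture and upload it, so the heavy math runs on hardware built for it.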

There you have it: modern OCR taken down a few notches to solve current-day problems. The best engines that exist today allow you to turn their various functions on and off as needed, which makes it possible to purchase the latest OCR technology and limit it however you need. Most organizations don’t understand why anyone would want to turn off the new, but today I’ve shown that new is not always better!

Chris Riley – Sr. Solutions Architect