Set it and forget it OCR

by Ilya Evdokimov | Dec 08, 2009 | OCR

The value of set it and forget it OCR is tremendous

My office is a paper monster: paper comes in and never leaves intact. The scary part is how fast this happens. Paper in hand, I review its contents and assess its value, scan it, and shred it, usually within minutes of its existence. The value of set it and forget it OCR is tremendous, but you have to be comfortable with the approach.

Set it and forget it OCR is where you take your OCR product and configure it to automatically process any images that appear in a certain folder. For my office, I scan to an “input” folder, and all the resulting compressed and OCR’ed PDF files end up in the “File Cabinet” folder. My strategy will not work for the timid, as I’m relying solely on the power of OCR text and search to retrieve documents when I need them. Most people would rather configure their ADF scanner to have a setting or folder for each particular class of documents. Most document scanners these days let you program as few as 9 and as many as 99 destinations. You can set up each destination as its own input folder with its own OCR settings and its own output folder.
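To make the folder setup concrete, here is a minimal Python sketch of the idea, not how any particular OCR product works. Each Route stands for one scanner destination with its own input folder, output folder, and OCR settings. The names Route, process_route, watch, and placeholder_ocr are all made up for illustration, and placeholder_ocr only copies the file; in practice you would swap in a call to whatever OCR product you use.

```python
import shutil
import time
from dataclasses import dataclass, field
from pathlib import Path

# File types the watcher picks up; adjust to whatever your scanner produces.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".pdf"}


@dataclass
class Route:
    """One scanner destination: input folder, output folder, OCR settings."""
    input_dir: Path
    output_dir: Path
    ocr_settings: dict = field(default_factory=dict)


def placeholder_ocr(source: Path, target: Path, settings: dict) -> None:
    """Hypothetical stand-in for a real OCR product: it only copies the file."""
    shutil.copy2(source, target)


def process_route(route: Route, ocr=placeholder_ocr) -> list:
    """Process every scanned image waiting in the route's input folder once."""
    route.output_dir.mkdir(parents=True, exist_ok=True)
    produced = []
    for source in sorted(route.input_dir.iterdir()):
        if not source.is_file() or source.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # leave anything that is not a scanned image alone
        target = route.output_dir / (source.stem + ".pdf")
        ocr(source, target, route.ocr_settings)
        source.unlink()  # the input folder only holds unprocessed scans
        produced.append(target)
    return produced


def watch(routes, poll_seconds: float = 5.0) -> None:
    """Set it and forget it: poll every route forever."""
    while True:
        for route in routes:
            process_route(route)
        time.sleep(poll_seconds)
```

My own setup would be a single route, something like watch([Route(Path("input"), Path("File Cabinet"))]), while the one-folder-per-document-class approach is just a longer list of routes with different settings. A real watcher would also need to skip files the scanner is still writing and avoid overwriting an output file with the same name; this sketch does neither.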

I know I can do this because I know what settings it takes to get OCR good enough to leave at least one usable keyword on each document for search. And after all, I’m an expert in OCR, so not using it every day would be crazy in its own right. I’ve yet to be proven wrong: my “File Cabinet” abyss has always given me the information I required at the time I required it, and sometimes new information I did not realize I had.

Now, for you records management folks shaking your heads, I understand your complaint. It should not be about my approach but about what I do with the final paper product. Items that your taxonomy deems records for legal or business reasons should be filed as such, perhaps scanned again as a record, and for heaven’s sake, if you are not supposed to destroy it, don’t!

The purpose of my madness is to touch paper as little as possible and to get information only when I need it. I am an extremist, but I assure you there is serious value, and a little fun, in the set it and forget it OCR technique.

Chris Riley – Sr. Solutions Architect