Let’s say you have a document such as a legal contract or a newspaper article, and you’d like to save it in a digital format. You could spend valuable time typing out all the content and then correcting the inevitable typos. Or… you could scan the document and convert it with OCR (Optical Character Recognition) software.

Optical Character Recognition, or OCR, is a technology that converts different types of documents, such as scanned paper files, Adobe PDFs, or even images, into editable and searchable data. A scanner alone is not enough to do this, because a scanner only creates an image of a document. To extract the data from a scanned document, an image, or a PDF, you need OCR software that identifies the alphanumeric characters in the image and assembles them into words.

OCR software recognizes text by first analyzing the structure of the image. It divides the page into elements, then the elements into words, and finally the words into characters. Typically, the recognized documents can be saved in a variety of formats such as DOC, RTF, XLS, PDF, HTML, and TXT. They can also be exported to other software applications such as word processing programs or spreadsheet applications.
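
To make the page-to-words-to-characters breakdown described above more concrete, here is a minimal Python sketch. It assumes a toy "image" written as rows of `#` (ink) and `.` (background), and it splits the page using blank rows and blank columns, a simple technique usually called projection-profile segmentation. The names `ink_spans`, `segment`, `PAGE`, and the `word_gap` threshold are made up for this illustration. Real OCR products use far more sophisticated layout analysis, and they go on to recognize each character, which this sketch does not attempt.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open (start, end) range of rows or columns

# Hypothetical toy page: two text lines separated by one blank row.
PAGE = [
    "##.#....##",
    "#..#....#.",
    "##.#....##",
    "..........",
    "#.........",
    "#.........",
]


def ink_spans(profile: List[bool], min_gap: int = 1) -> List[Span]:
    """Find runs of True values; merge runs separated by fewer than min_gap blanks."""
    spans: List[Span] = []
    start = None
    for i, has_ink in enumerate(profile):
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Span] = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same unit
        else:
            merged.append((s, e))
    return merged


def segment(page: List[str], word_gap: int = 3) -> List[List[List[Span]]]:
    """Split a page into lines, lines into words, and words into character spans."""
    row_profile = ["#" in row for row in page]
    lines = []
    for top, bottom in ink_spans(row_profile):
        rows = page[top:bottom]
        width = max(len(r) for r in rows)
        # A column contains ink if any row of this text line has '#' there.
        col_profile = [
            any(c < len(r) and r[c] == "#" for r in rows) for c in range(width)
        ]
        words = []
        # Gaps of word_gap or more blank columns separate words.
        for w_start, w_end in ink_spans(col_profile, min_gap=word_gap):
            # Inside a word, any blank column separates characters.
            chars = [
                (w_start + s, w_start + e)
                for s, e in ink_spans(col_profile[w_start:w_end])
            ]
            words.append(chars)
        lines.append(words)
    return lines


if __name__ == "__main__":
    for line_no, words in enumerate(segment(PAGE), start=1):
        print(f"line {line_no}: {words}")
```

For the toy page, the first line comes out as two words, the first containing two characters (columns 0 to 1 and column 3) and the second containing one, while the second line is a single one-character word. The `word_gap` value of 3 is an arbitrary choice for this example; in practice the threshold would depend on font size and scan resolution.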
