What is OCR?

Optical Character Recognition (OCR) is a technology that functions much like a printer in reverse. An OCR system reads printed text and converts it to an electronic format for use in document processing applications. There is a wide variety of OCR systems in use today, from the massive document-handling computers used by post offices to the desktop systems that employ scanners for reading text into word processing and spreadsheet applications.


While they often differ in the combination of technologies employed, all OCR systems have several things in common. They take some form of bitmapped image as input, whether drawn from a printed document, magnetic tape, or an image file. They also employ one or more algorithms (rules or procedures used to solve problems) to translate combinations of dots in a bitmap into a recognized character. Finally, all OCR systems output recognized characters in some kind of computer-usable medium, including but not limited to punch cards, electronic data (e.g., from point-of-sale scanners in grocery stores), and formatted text.
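
The three common stages described above (bitmap in, algorithm, characters out) can be illustrated with a very small sketch. It uses simple template matching, which is only one of many possible recognition algorithms and is not necessarily what any particular OCR product uses. The 3x5 glyph templates, the three-letter alphabet, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical 3x5 glyph templates: '#' is an inked dot, '.' is blank.
# A real system would work on scanned bitmaps and a far larger alphabet.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def to_bits(rows):
    """Flatten a glyph (tuple of strings) into a list of booleans."""
    return [ch == "#" for row in rows for ch in row]


def recognize(glyph):
    """Return the template character whose dots agree with the glyph most often."""
    bits = to_bits(glyph)

    def score(name):
        # Count the positions where the glyph and the template agree.
        return sum(a == b for a, b in zip(bits, to_bits(TEMPLATES[name])))

    return max(TEMPLATES, key=score)


def read_line(glyphs):
    """Recognize a sequence of glyph bitmaps and output them as plain text."""
    return "".join(recognize(g) for g in glyphs)


# An 'L' with one stray dot of scanner noise is still closest to the 'L' template.
noisy_l = ("#..", "#..", "#..", "#.#", "###")
print(recognize(noisy_l))
print(read_line([TEMPLATES["I"], TEMPLATES["T"]]))
```

Choosing the closest template rather than demanding an exact match is what lets this kind of approach tolerate a few wrong dots in the input.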

OCR Advantage

While recognition accuracy is an important part of an OCR product, it is not the only concern. Recognition products are productivity tools: their objective is to make people more productive by reducing the time it takes to translate printed text or image files into editable text. The measure of a truly useful OCR product is therefore not just its recognition ability, but whether, and to what extent, it improves your productivity.

The essential tasks for an OCR product are those that allow you to work most efficiently, i.e., to maximize your throughput. Users of current OCR products know that a considerable amount of time can be wasted before an electronic document is ready to use. Among the most common time sinks are manually defining page layout; assigning text, graphic, and/or numeric zones; proofing recognition errors; and reformatting documents after export. A product that minimizes or eliminates the additional time these tasks take is the product that maximizes throughput.
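
The throughput argument above can be made concrete with a little arithmetic. The sketch below is an illustration only: the function name and every timing figure are hypothetical, not measurements of any real product. It simply adds the per-page time spent on each task and converts the total to pages per hour.

```python
def pages_per_hour(recognition_s, zoning_s, proofing_s, reformatting_s):
    """Throughput when every page costs the sum of the given times (in seconds)."""
    total_seconds_per_page = recognition_s + zoning_s + proofing_s + reformatting_s
    return 3600 / total_seconds_per_page


# Hypothetical product A: fast recognition, but heavy manual work around it.
product_a = pages_per_hour(recognition_s=10, zoning_s=60, proofing_s=120, reformatting_s=110)

# Hypothetical product B: slower recognition, but little manual work.
product_b = pages_per_hour(recognition_s=20, zoning_s=0, proofing_s=90, reformatting_s=10)

print(product_a, product_b)
```

Under these assumed figures, the product with the slower recognition step still finishes more pages per hour, because the manual tasks dominate the total time per page.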


