OCR

Optical character recognition, abbreviated as OCR, is the electronic conversion of the text in a printed document into a form that a computer can read and process. OCR uses both optical and digital techniques. Optical techniques involve mirrors and lenses, whilst digital techniques rely on scanners and computer algorithms. Nowadays, OCR makes use of digital image processing (DIP) as well.

All OCR systems include a scanner for reading the text and software for analysing the scanned pictures. They also contain a special software package to recognize and read the characters. The scanned image (a bitmap) is checked for light and dark areas to identify each alphabetic letter and numeric digit, and the recognized characters are converted into ASCII as they are identified. OCR produces the best results with clear documents and standard font types. No OCR software is 100% accurate. The quality of the original plays a very important role in determining the accuracy, and so does the scanner used: the quality of its light arrays affects the results. OCR results also vary depending on the quality of the text.
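To make the light-and-dark idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular OCR product works: the 5x3 glyph templates, the threshold of 128 and the function names are all made up for this example. It turns a tiny grayscale bitmap into dark and light pixels, picks the template with the fewest mismatching pixels, and prints the matching ASCII code.

```python
# Hypothetical 5x3 glyph templates: '#' marks a dark pixel, '.' a light one.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
    "L": ["#..", "#..", "#..", "#..", "###"],
}


def binarize(gray, threshold=128):
    """Map grayscale values (0 = black, 255 = white) to 1 for dark, 0 for light."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def template_bits(rows):
    """Convert a '#'/'.' template into the same 1/0 grid that binarize() returns."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def mismatches(a, b):
    """Count the pixels where two equally sized 1/0 grids disagree."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(gray):
    """Return the template character whose pattern is closest to the scan."""
    bits = binarize(gray)
    return min(
        TEMPLATES,
        key=lambda ch: mismatches(bits, template_bits(TEMPLATES[ch])),
    )


# An assumed, slightly noisy scan of the digit 7 (the bottom-left pixel is a smudge).
scan = [
    [20, 35, 10],
    [240, 250, 30],
    [230, 40, 245],
    [235, 25, 250],
    [100, 15, 240],
]

char = recognize(scan)
print(char, ord(char))  # recognized character and its ASCII code
```

Even in this toy version, the smudged pixel shows why a clean original matters: every extra dark or light pixel moves the scan further from the right template and closer to a wrong one.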

Performing OCR in Microsoft Office 2007 is very simple. First, save the image containing the text in any of the standard formats. Then insert the image into a new OneNote page, right-click it and select ‘Copy Text from Picture’. Finally, paste the text into another new document. It’s as simple as that!

Today, OCR can understand a wide variety of fonts but still has trouble with fonts that resemble human handwriting. Most ancient scripts are cursive in nature and hence are really hard for a machine to decipher. Languages with joined letters are even more difficult to recognize. To make this job less tedious, various specialized kinds of OCR have been developed, such as typewritten OCR, music OCR, cursive OCR, hand print OCR and MICR (magnetic ink character recognition). However, very small handwritten text, unusual fonts and mathematical formulas still don’t work well with OCR.

Thus one of the major advantages of OCR is that it can scan text from paper and transform it into soft copies, which can be stored and used whenever needed. This leads to reduced paper management costs and easy access to valuable information.
