How Does It Work?
First, the image is loaded into the device as a bitmap and its most important features, such as resolution and inversion, are detected. Because many factors can affect OCR results, some images need noise removal and skew correction. Others must be rescaled or inverted before processing so that they match specific OCR requirements, such as a predefined range of fonts or colors.
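To make the preprocessing step more concrete, here is a minimal sketch of two of the corrections mentioned above: inversion and rescaling. It assumes a grayscale image stored as a plain list of pixel rows (0 is black, 255 is white). The function names and the brightness threshold of 128 are illustrative assumptions, not part of any particular OCR engine.

```python
from typing import List

# Assumption: a grayscale image is a list of rows, each pixel 0 (black) to 255 (white).
Image = List[List[int]]


def mean_brightness(img: Image) -> float:
    """Average pixel value over the whole image."""
    total = sum(sum(row) for row in img)
    count = sum(len(row) for row in img)
    return total / count


def invert(img: Image) -> Image:
    """Swap dark and light pixels."""
    return [[255 - p for p in row] for row in img]


def normalize_polarity(img: Image) -> Image:
    """Return dark text on a light background.

    Heuristic (an assumption for this sketch): a page that is mostly
    dark is treated as light-on-dark and gets inverted.
    """
    return invert(img) if mean_brightness(img) < 128 else img


def upscale(img: Image, factor: int) -> Image:
    """Nearest-neighbour upscaling by an integer factor."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out
```

A real pipeline would typically add noise removal and skew correction as well, and would work on image arrays from an imaging library rather than nested lists.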
The next step is page layout analysis, also called “zoning”. The algorithm breaks the page down into elements such as blocks of text, tables, and images, and then divides the text into lines, words, and finally characters.
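One simple way to picture this splitting is a projection profile: count the ink in every row to find text lines, then count the ink in every column of a line to find individual glyphs. The sketch below assumes a binarized page where 1 means ink and 0 means background. The helper names are hypothetical, and real zoning algorithms are considerably more elaborate (they also have to deal with tables, images, and touching characters).

```python
from typing import List, Tuple

# Assumption: a binarized page, 1 = ink, 0 = background.
Binary = List[List[int]]
Span = Tuple[int, int]  # half-open range [start, end)


def ink_runs(profile: List[int]) -> List[Span]:
    """Return the spans where the profile is non-zero."""
    runs: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_lines(page: Binary) -> List[Span]:
    """Find text lines as runs of rows that contain ink."""
    return ink_runs([sum(row) for row in page])


def split_glyphs(page: Binary, top: int, bottom: int) -> List[Span]:
    """Within one line (rows top..bottom), find runs of columns that contain ink."""
    band = page[top:bottom]
    columns = [sum(row[x] for row in band) for x in range(len(band[0]))]
    return ink_runs(columns)
```

For example, calling `split_lines(page)` and then `split_glyphs(page, top, bottom)` for each returned span yields a bounding range for every separated glyph, which is what the recognition step works on.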
The actual recognition of the characters is the last step. The algorithm creates numerous hypotheses about each character, taking into account factors such as language, font, and print type. Because some characters, like “1” and “I” or “C” and “G”, can look very similar, a dictionary has the final say in doubtful cases.
Finally, after weighing a large number of hypotheses, the algorithm makes its decision and presents you with the recognized text.
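The role of the dictionary can be illustrated with a small sketch. It assumes the recognizer has already produced, for each character position, a list of candidate characters with confidence scores. The function name `best_word`, the scoring by multiplying confidences, and the sample data are all assumptions made for illustration, not a description of how any specific OCR engine works.

```python
from itertools import product
from typing import List, Set, Tuple

# Assumption: one list of (candidate character, confidence) per position.
Hypotheses = List[List[Tuple[str, float]]]


def best_word(hypotheses: Hypotheses, dictionary: Set[str]) -> str:
    """Pick the most likely word, preferring candidates found in the dictionary.

    Every combination of per-character candidates is scored by the product
    of its confidences. The best-scoring dictionary word wins; if no
    combination is in the dictionary, the best-scoring raw combination is
    returned instead.
    """
    best_known, best_known_score = None, 0.0
    best_raw, best_raw_score = "", 0.0
    for combo in product(*hypotheses):
        word = "".join(char for char, _ in combo)
        score = 1.0
        for _, confidence in combo:
            score *= confidence
        if score > best_raw_score:
            best_raw, best_raw_score = word, score
        if word.lower() in dictionary and score > best_known_score:
            best_known, best_known_score = word, score
    return best_known if best_known is not None else best_raw


# The recognizer slightly prefers "1" over "I" for the first glyph.
glyphs = [[("1", 0.6), ("I", 0.4)], [("N", 0.9)], [("K", 0.9)]]
word = best_word(glyphs, {"ink", "oak"})  # the dictionary settles it as "INK"
```

Enumerating every combination is only practical for short words with a few candidates per position, so treat this as an illustration of the idea rather than an efficient approach.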
OCR and Mobile
For computers, OCR has been around for quite some time. In fact, one of the earliest developments in OCR was the invention of the optophone in 1913. This instrument, created for blind people by Dr. Edmund Fournier d’Albe, scanned text and generated time-varying chords of tones to identify letters.
For mobile devices, on the other hand, OCR hasn’t been around for that long. That shouldn’t be surprising if we consider that just 15 years ago, all that mobile devices could do was handle a simple conversation (or not).
But with technology moving forward at a ridiculous pace, somewhere along the way smartphones became a basic lifestyle necessity for millions of people around the world. That’s when the breakthrough for mobile OCR happened.
Still, text recognition on mobile isn’t something that ships with your brand new smartphone yet. You always need an app with an OCR API to make it work. Luckily, there are many solutions on the market that will help you turn your app into a portable scanner.
Who Is Already Taking Advantage of OCR?
- Logistics and Transportation
The technology is already widely implemented across the entire industrial spectrum, but probably the most obvious application of OCR is in logistics and transportation. The algorithms are used to speed up processing, tracking, and shipping packages, and to greatly reduce their costs. Instead of typing long tracking numbers, addresses, and postal codes, distribution employees can simply scan the text and extract the needed information straight from the label, in real time. With far less manual labor involved, deliveries and shipments can be sorted quickly and conveniently.
OCR is also becoming more and more popular in non-industrial businesses. A growing number of mobile banking apps use camera-based features to simplify banking processes and make them considerably more efficient. OCR-powered apps also allow customers to scan different kinds of documents, receipts, and checks, and have every piece of data, from the account number to the signature, read, processed, and stored directly at the bank.
Another industry that is already using OCR (and could easily do far more with it in the future) is healthcare. With OCR, data from handwritten and printed documents such as medical reports, prescription forms, and patient records can be automatically extracted and imported into a unified database.
This not only helps ensure data accuracy but also improves patient care. It’s worth noting that the use of OCR in the healthcare sector could go much further than that. Just imagine having your entire medical history in a searchable, digital form, available whenever and wherever you need it. Wouldn’t that be something?