Handwriting Recognition using AI Network and Image Processing

4/29/2021

Handwritten character recognition has become a popular subject of research as digital technologies spread across all sectors. It refers to a computer's ability to detect and interpret intelligible handwritten input from touch screens, images, paper documents, and other sources. The problem remains difficult because different people have different handwriting styles. Neural networks generally recognise handwritten characters more efficiently and robustly than other computing techniques.

There are two basic types of handwriting recognition system: online and offline. Online systems recognise writing as it is produced, for example from pen strokes on a touch screen, while offline systems work on a static image of text that has already been written. Some systems learn progressively from the user's feedback while performing offline learning on stored data in parallel. Many strategies are used in both fields, including statistical, structural, syntactic, and neural network approaches. Some recognition systems identify strokes, and others recognise single characters or entire words.

Figure: Neural network-based handwritten character recognition system with feature extraction.

Character Recognition Algorithms

The algorithms used in character recognition can be divided into three categories: image pre-processing, feature extraction, and classification. They are normally applied in sequence: image pre-processing makes feature extraction easier, and good features are necessary for correct classification.

Image pre-processing

Image pre-processing is crucial for correct character prediction later in the recognition pipeline. Typical methods include noise removal, image segmentation, cropping, and scaling. The recognition system first accepts a scanned image as input, for example in JPG or BMP format.
Digital capture and conversion of an image often introduce noise, which makes it harder to pick out the object of interest. For character recognition, we want to remove as much noise as possible while preserving the characters' strokes, since they are essential for correct classification.
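
As an illustration of noise removal that tries to preserve strokes, here is a minimal sketch using a 3×3 median filter followed by a fixed threshold. The article does not name a specific filter, so the median filter, the threshold of 128, and the function names are all assumptions made for this example.

```python
import numpy as np

def median_filter_3x3(image: np.ndarray) -> np.ndarray:
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted views of the image and take the median across them.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    return np.median(windows, axis=0).astype(image.dtype)

def binarize(image: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return 1 for ink (dark pixels) and 0 for background (assumed threshold)."""
    return (image < threshold).astype(np.uint8)

# Toy greyscale page: white background, one thick stroke, one isolated speck.
page = np.full((9, 9), 255, dtype=np.uint8)
page[1:8, 4:7] = 0   # a stroke three pixels wide
page[4, 1] = 0       # a single-pixel speck of noise

cleaned = median_filter_3x3(page)
ink = binarize(cleaned)
```

The isolated speck disappears because it is outnumbered by white neighbours, while the body of the stroke survives. A median filter also rounds off stroke corners and can erase strokes that are only one pixel wide, so the window size has to suit the scan resolution.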

Segmentation

In the segmentation stage, an image containing a sequence of characters is split into sub-images, each containing an individual character. Each character image is then resized to 30×20 pixels.
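
The segmentation step can be sketched as follows, assuming a binarised text line in which characters are separated by at least one empty column. The splitting rule, the nearest-neighbour resize, and the reading of 30×20 as 30 rows by 20 columns are assumptions for illustration, not details given in the article.

```python
import numpy as np

def split_characters(binary: np.ndarray) -> list[np.ndarray]:
    """Split a binary line image (1 = ink) at columns that contain no ink."""
    has_ink = binary.sum(axis=0) > 0
    pieces, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                           # a character begins here
        elif not ink and start is not None:
            pieces.append(binary[:, start:x])   # the character ended at x - 1
            start = None
    if start is not None:                       # character touches the right edge
        pieces.append(binary[:, start:])
    return pieces

def resize_nearest(image: np.ndarray, height: int = 30, width: int = 20) -> np.ndarray:
    """Resize to a fixed size by nearest-neighbour sampling."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[np.ix_(rows, cols)]

# Toy line image with two separate blobs standing in for characters.
line = np.zeros((10, 12), dtype=np.uint8)
line[2:8, 1:4] = 1
line[3:9, 6:11] = 1

pieces = split_characters(line)
characters = [resize_nearest(p) for p in pieces]
```

Splitting on empty columns only works when characters do not touch. Cursive writing needs a different approach, such as the overlapping segments described later in this article.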

Classification and Recognition

This is the decision-making stage of the recognition system. The classifier is a neural network with two hidden layers that use a log-sigmoid activation function.
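
A forward pass through such a classifier might look like the sketch below. Only the two hidden layers and the log-sigmoid activation come from the description above. The layer sizes, the 26 output classes, the log-sigmoid output layer, and the random untrained weights are assumptions chosen to keep the example small.

```python
import numpy as np

def logsig(x: np.ndarray) -> np.ndarray:
    """Log-sigmoid activation: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# 600 inputs = one flattened 30x20 character image.
# Hidden sizes (64, 32) and 26 output classes are assumed, not from the article.
layer_sizes = [600, 64, 32, 26]
weights = [
    rng.normal(0.0, 0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(pixels: np.ndarray) -> np.ndarray:
    """Propagate a flattened image through both hidden layers and the output layer."""
    activation = pixels
    for w, b in zip(weights, biases):
        activation = logsig(activation @ w + b)
    return activation

def predict(pixels: np.ndarray) -> int:
    """Return the index of the class with the highest output score."""
    return int(np.argmax(forward(pixels)))

scores = forward(np.zeros(600))
label = predict(np.zeros(600))
```

The weights here are random, so the prediction is meaningless until the network is trained, for example with backpropagation on labelled character images.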

Feature extraction

The features of input data are the measurable properties of observations that are used to analyse or classify them. The task of feature extraction is to identify relevant features that discriminate between instances and are independent of each other.
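
One simple family of features for character images is zoning, where the image is divided into a grid and the ink density of each cell becomes one feature. The article does not say which features its system uses, so zoning, the 6×4 grid, and the function name are assumptions made for this sketch.

```python
import numpy as np

def zoning_features(char: np.ndarray, zone_rows: int = 6, zone_cols: int = 4) -> np.ndarray:
    """Return the fraction of ink pixels in each zone of a binary character image."""
    h, w = char.shape
    zone_h, zone_w = h // zone_rows, w // zone_cols
    # Drop any leftover rows/columns so the image divides evenly into zones.
    trimmed = char[:zone_h * zone_rows, :zone_w * zone_cols]
    zones = trimmed.reshape(zone_rows, zone_h, zone_cols, zone_w)
    return zones.mean(axis=(1, 3)).ravel()

# Toy 30x20 character: a vertical bar that fills the second column of zones.
char = np.zeros((30, 20), dtype=np.uint8)
char[:, 5:10] = 1

features = zoning_features(char)
grid = features.reshape(6, 4)
```

For a 30×20 image this reduces 600 pixel values to 24 numbers, each of which describes a different region of the character.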

Neural Network System for Continuous Handwritten Word Recognition

Recognising a continuously written word is the biggest challenge for recognition systems. One method segments each word into triplets, each containing three letters. Two neighbouring triplets always share two letters, so consecutive triplets overlap. This overlap results in a higher recognition rate.
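
The overlap structure of the triplets can be shown on a plain string. A real system would apply the same sliding window to image segments rather than to known letters, and the function names here are hypothetical.

```python
def overlapping_triplets(word: str) -> list[str]:
    """Return every window of three consecutive letters in the word."""
    return [word[i:i + 3] for i in range(len(word) - 2)]

def merge_triplets(triplets: list[str]) -> str:
    """Rebuild the word: the first triplet plus the last letter of each later one."""
    if not triplets:
        return ""
    return triplets[0] + "".join(t[-1] for t in triplets[1:])

triplets = overlapping_triplets("writing")
# triplets == ['wri', 'rit', 'iti', 'tin', 'ing']
rebuilt = merge_triplets(triplets)
# rebuilt == 'writing'
```

Each triplet shares its last two letters with the first two letters of the next one, so every interior letter is seen in up to three different contexts.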

Before intelligent data capture software was available, the only way to digitize printed paper documents was to re-type the text manually. Not only was this massively time-consuming, it also introduced typing errors.
Intelligent document processing software is often used as a hidden or silent technology, powering many well-known systems and services in our daily life. It’s used in data entry automation, indexing documents for search engines, automatic number plate recognition, and assisting blind and visually impaired people.

Challenges in Handwriting Recognition

Huge variability and ambiguity of strokes from person to person
The handwriting style of a person also varies from time to time and is inconsistent
Poor quality of the source document/image due to degradation over time
Text in printed documents sits in a straight line, whereas people do not necessarily write in a straight line on white paper
Cursive handwriting makes separation and recognition of characters challenging
Handwritten text can be slanted to the right by a variable amount, in contrast to printed text, which sits upright
Collecting a well-labeled dataset for training is expensive compared with generating synthetic data

Use Cases of Handwriting Recognition

Healthcare and pharmaceuticals

Patient prescription digitization is a major pain point in the healthcare and pharmaceutical industry. Another area where handwritten text detection has a key impact is patient enrollment and form digitization. By adding handwriting recognition to their toolkit of services, hospitals and pharmaceutical companies can significantly improve user experience.

Insurance

A large insurance company receives more than 20 million documents a day, and a delay in processing claims can hurt the company badly. Claims documents can contain many different handwriting styles, and purely manual processing of claims would slow the pipeline down considerably.

Banking

People write cheques regularly, and cheques play a major role in most non-cash transactions. In many developing countries, the present cheque processing procedure requires a bank employee to read and manually enter the information present on a cheque and also verify the entries like signature and date. As a large number of cheques have to be processed every day in a bank, a handwriting text recognition system can save costs and hours of human work.

Online Libraries

Huge amounts of historical knowledge are being digitized by uploading image scans so that the entire world can access them. Handwriting recognition plays a key role in bringing medieval and 20th-century documents, postcards, research studies, and similar material to life.

Although significant technological developments have improved the recognition of handwritten text, Handwritten Text Recognition (HTR) is much further from being a solved problem than OCR. It is therefore not yet extensively employed in industry. Nevertheless, with the pace of technology evolution and the introduction of models like transformers, we can expect intelligent data capture software models to become commonplace soon.

Christine Wright
