Application of Machine Learning in Text Recognition : Part II

Abhishek Ghosh

By Abhishek Ghosh August 24, 2018 7:27 am Updated on August 24, 2018

In the first part of Application of Machine Learning in Text Recognition, we clarified the basic terms from the fields of machine learning and text recognition and briefly explained the various types of machine learning. In this second part, we will discuss the types of text recognition and its fields of application. The third part will complete the series.

Application of Machine Learning in Text Recognition : Text Recognition

Types of Text Recognition

Optical Character Recognition (OCR)

Optical Character Recognition (OCR) is a technology that transforms human-readable characters into machine-readable code, such as ASCII or Unicode. Most OCR systems specialize in recognizing machine-generated characters or very clean handwritten characters. The biggest difficulty, however, lies in the recognition of everyday handwriting. In most cases, documents are captured with an optical input device and then read out by an OCR system. If the text is typewritten, the hit rate is well over 90%.

A text recognition system usually consists of three components: preprocessing, property recognition and classification. In preprocessing, for example when scanning documents, raster graphics are created and then edited so that the bare characters stand out as clearly as possible for further processing. Various techniques are used to achieve this. Binarization maximizes the contrast between the foreground and the background to increase the readability of each pixel. The result is a binary image, whose advantage is that each pixel can only be black or white and is therefore easier to recognize. The choice of threshold can be pictured on the histogram of the original grayscale image: the goal of the software is to find the highest possible threshold at which all irrelevant pixels are hidden while no important details disappear.

Binarization
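The thresholding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name a specific thresholding method, so Otsu's method is swapped in here as one common way to pick a threshold from the grayscale histogram. The function names `otsu_threshold` and `binarize` and the sample image are hypothetical, and the convention that 1 means ink and 0 means background is chosen for this example.

```python
def otsu_threshold(pixels):
    """Pick a grey level (0-255) that separates dark and bright pixels.

    Assumption: Otsu's method, which maximizes the variance between the
    two classes. The article itself does not prescribe a method.
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        sum_dark += level * histogram[level]
        if weight_dark == 0:
            continue  # no dark pixels yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image):
    """Turn a grayscale image (rows of 0-255 values) into 1 = ink, 0 = background."""
    threshold = otsu_threshold([value for row in image for value in row])
    return [[1 if value <= threshold else 0 for value in row] for row in image]


# Hypothetical 3x4 scan: dark strokes (about 20-35) on bright paper (about 240-252).
sample = [
    [250, 245, 30, 240],
    [248, 20, 25, 251],
    [252, 35, 247, 249],
]
for row in binarize(sample):
    print(row)
```

A global threshold like this works best on evenly lit scans; unevenly lit pages usually call for a locally adaptive threshold instead.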

Readability is then further increased by noise reduction, in which isolated single pixels are removed so that the typeface becomes easier to recognize. Next comes the two-stage segmentation: first, text and images are separated; then text blocks, columns, paragraphs, words and characters are marked.
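The removal of single pixels can be illustrated with a minimal sketch. It assumes a binary image stored as a list of rows in which 1 is ink and 0 is background, and it treats a pixel as noise when none of its eight neighbours is ink. Real systems typically use more robust filters; the function name `remove_isolated_pixels` is hypothetical.

```python
def remove_isolated_pixels(image):
    """Clear every ink pixel (1) that has no ink pixel among its 8 neighbours."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]  # work on a copy, keep the input intact
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical example: one speck in the top-left corner next to a small stroke.
noisy = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in remove_isolated_pixels(noisy):
    print(row)
```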

In the subsequent skeletonization, the individual characters are made as thin as possible, so that their basic shapes remain recognizable while the stroke width is reduced to a minimum. The goal of skeletonization is to reduce the characters to their essential information and thus to simplify recognition.

Skeletonization
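One classic way to thin characters is the Zhang-Suen algorithm, sketched below. The article does not say which thinning method is meant, so this choice is an assumption made for illustration. The image is again a list of rows with 1 for ink and 0 for background, and the names `ring` and `thin` are hypothetical.

```python
# Neighbour offsets clockwise, starting with the pixel directly above (P2..P9).
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def ring(image, y, x):
    """Return the 8 neighbours of (y, x); pixels outside the image count as 0."""
    height, width = len(image), len(image[0])
    return [
        image[y + dy][x + dx] if 0 <= y + dy < height and 0 <= x + dx < width else 0
        for dy, dx in OFFSETS
    ]


def thin(image):
    """Zhang-Suen thinning: peel boundary pixels until nothing more can be removed."""
    skeleton = [row[:] for row in image]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # the two alternating sub-iterations
            to_clear = []
            for y in range(len(skeleton)):
                for x in range(len(skeleton[0])):
                    if skeleton[y][x] != 1:
                        continue
                    p = ring(skeleton, y, x)  # p[0] = P2 ... p[7] = P9
                    ink = sum(p)
                    transitions = sum(
                        p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8)
                    )
                    if step == 0:
                        side_ok = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:
                        side_ok = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if 2 <= ink <= 6 and transitions == 1 and side_ok:
                        to_clear.append((y, x))
            # Delete only after the full scan so all decisions use the same image.
            for y, x in to_clear:
                skeleton[y][x] = 0
            changed = changed or bool(to_clear)
    return skeleton


# Hypothetical example: a horizontal stroke that is three pixels thick.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
for row in thin(bar):
    print(row)
```

The pixels to delete are collected first and cleared afterwards, which keeps each sub-iteration independent of the scan order.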

Once the image has been prepared as well as possible, the characters are analyzed. As many properties as possible are recorded for each character, and the characters are then subdivided into classes. Among other things, the intersection points, the recesses and the endpoints of each character are recorded.
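Two of the properties mentioned above, endpoints and intersection points, can be read off a one-pixel-wide skeleton. The sketch below assumes such a skeleton (1 = ink, 0 = background) and uses the crossing number, i.e. the number of background-to-ink transitions when walking once around a pixel. This is one common heuristic, not necessarily the one a given OCR product uses, and the names `crossing_number` and `extract_features` are hypothetical.

```python
# Neighbour offsets clockwise, starting with the pixel directly above.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skeleton, y, x):
    """Count 0 -> 1 transitions around (y, x), i.e. the strokes leaving the pixel."""
    height, width = len(skeleton), len(skeleton[0])
    around = [
        skeleton[y + dy][x + dx] if 0 <= y + dy < height and 0 <= x + dx < width else 0
        for dy, dx in OFFSETS
    ]
    return sum(around[i] == 0 and around[(i + 1) % 8] == 1 for i in range(8))


def extract_features(skeleton):
    """Collect endpoints (one stroke leaves) and intersections (three or more)."""
    endpoints, intersections = [], []
    for y, row in enumerate(skeleton):
        for x, value in enumerate(row):
            if value != 1:
                continue
            branches = crossing_number(skeleton, y, x)
            if branches == 1:
                endpoints.append((y, x))
            elif branches >= 3:
                intersections.append((y, x))
    return {"endpoints": endpoints, "intersections": intersections}


# Hypothetical skeleton of the letter "T".
letter_t = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(extract_features(letter_t))
```

For this "T", the two ends of the bar and the foot of the stem are reported as endpoints, and the point where bar and stem meet is reported as the only intersection.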

The characters are then compared with previous findings and templates in an attempt to find the best possible match between the analyzed character and a template.
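The comparison with templates can be sketched as a simple pixel-agreement score. The tiny 3x5 templates, the three labels and the function names `similarity` and `classify` are hypothetical and chosen only for illustration. Production systems compare richer feature vectors rather than raw pixels.

```python
# Hypothetical 3x5 templates: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010", "010", "010"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(bitmap, template):
    """Fraction of pixels on which bitmap and template agree (0.0 to 1.0)."""
    matches = sum(
        pixel == expected
        for bitmap_row, template_row in zip(bitmap, template)
        for pixel, expected in zip(bitmap_row, template_row)
    )
    return matches / (len(template) * len(template[0]))


def classify(bitmap, templates=TEMPLATES):
    """Return the label of the best-matching template together with its score."""
    best = max(templates, key=lambda label: similarity(bitmap, templates[label]))
    return best, similarity(bitmap, templates[best])


# A "T" with one stray pixel in the bottom-right corner.
noisy_t = ["111", "010", "010", "010", "011"]
print(classify(noisy_t))
```

Returning the score along with the label lets a caller reject weak matches instead of accepting the nearest template unconditionally.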

Handwritten Character Recognition (HCR)

Handwritten Character Recognition is a specialization of OCR that focuses on recognizing handwriting. The biggest difficulty is that human handwriting depends on many factors that cannot be generalized.

One method of analyzing handwritten characters is to look at their structure. The important thing is to divide the characters into their elementary parts and to capture these as precisely as possible, without losing sight of the many different ways of writing a character.

Characters can thus be subdivided into many different arcs and lines. The arcs may differ in shape from one writer to the next and still carry the same meaning within a character.

Getting around these differences requires an algorithm that can decide whether adjusting an arc or line would change the character as a whole, without losing sight of the possible variations of characters, since many characters differ only in the smallest details.

After these arcs and lines have been optimized as far as possible, they are analyzed as accurately as possible. For each arc, the angle, center and radius are determined and compared with previous findings in order to find the best-matching character.
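How angle, center and radius might be obtained for a single arc can be shown with basic geometry. The sketch assumes that an arc is approximated by the circle through three sampled stroke points (start, middle, end); this is an illustrative simplification, and the function name `arc_parameters` is hypothetical.

```python
import math


def arc_parameters(start, middle, end):
    """Return (center, radius, sweep_angle) of the circular arc through three points.

    Returns None when the points are collinear, i.e. the stroke is a straight line.
    The sweep angle is in radians and measured from start to end through middle.
    """
    (ax, ay), (bx, by), (cx, cy) = start, middle, end
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None

    # Circumcenter of the triangle formed by the three points.
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    radius = math.hypot(ax - ux, ay - uy)

    # Angles of middle and end relative to start, counter-clockwise.
    angle_start = math.atan2(ay - uy, ax - ux)
    to_middle = (math.atan2(by - uy, bx - ux) - angle_start) % (2 * math.pi)
    to_end = (math.atan2(cy - uy, cx - ux) - angle_start) % (2 * math.pi)
    # If middle is not passed on the counter-clockwise way, the arc runs clockwise.
    sweep = to_end if to_middle <= to_end else 2 * math.pi - to_end
    return (ux, uy), radius, sweep


# Hypothetical stroke: the upper half of a unit circle.
print(arc_parameters((1, 0), (0, 1), (-1, 0)))
```

The resulting triple can then be compared with stored values for known characters, for example by accepting the stored arc whose parameters differ least.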

Intelligent Character Recognition (ICR)

Intelligent Character Recognition is an extension of OCR and describes the automatic correction of read-out data based on contextual relationships or previous data sets.

One application example is text recognition in the banking industry. Banks store all their customer data in databases, and the OCR system can access these records during readout. For example, various types of queries are sent to the databases, each consisting of a numeric and an alphanumeric value. Since the numeric value is less error-prone, it can be used to find the related data. An account number, for instance, can be used to retrieve the corresponding recipient or bank code directly. This technique reduces the error rate of readout systems, because expected values are available. There are also fields in which only numeric values are expected, which allows conclusions to be drawn about the read-out result.
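The banking example can be sketched as a lookup followed by a plausibility check. Everything here is hypothetical: the in-memory `CUSTOMERS` dictionary stands in for a bank database, the account numbers and names are invented, and the similarity cut-off of 0.6 is an arbitrary assumption. The string comparison uses `difflib.SequenceMatcher` from the Python standard library.

```python
import difflib

# Hypothetical customer records keyed by account number (stand-in for a database).
CUSTOMERS = {
    "1002003001": {"name": "Maria Keller", "bank_code": "50010517"},
    "1002003002": {"name": "Jonas Weber", "bank_code": "50010517"},
}


def correct_with_context(account_number, ocr_name, customers=CUSTOMERS, min_similarity=0.6):
    """Use the reliable numeric field to correct the error-prone alphanumeric field.

    Returns (name, corrected): the stored name if the OCR result is close enough
    to it, otherwise the unchanged OCR result.
    """
    record = customers.get(account_number)
    if record is None:
        return ocr_name, False  # no expected value available
    ratio = difflib.SequenceMatcher(
        None, ocr_name.lower(), record["name"].lower()
    ).ratio()
    if ratio >= min_similarity:
        return record["name"], True
    return ocr_name, False  # too different: better to flag than to overwrite


# The OCR step confused the letters "i" and "l" with the digit "1".
print(correct_with_context("1002003001", "Mar1a Kel1er"))
```

Keeping the OCR result when the stored name is too different avoids silently overwriting a reading that may point to a wrong account number.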

Fields of Application of Text Recognition

The areas of application of text recognition are very diverse, especially in this day and age, in which almost all technologies continue to evolve. The main areas of application are the automation of processes of all kinds, often in connection with digitization. In mail delivery, for example, addresses are read out in order to automate the sorting of letters, which accelerates the entire postal procedure. Where mail was previously sorted by hand, it is now sorted automatically, which shortens the process many times over. Text recognition also allows a letter or parcel to be tracked, since its current location is registered along the way.

Not only outgoing mail, but above all the inbound mail process of companies can be made much more efficient by text recognition. Incoming mail can be scanned completely and forwarded directly to the appropriate employee on the basis of the recognized text. Text recognition software can thus be integrated into a workflow management system. This allows the company to process incoming mail faster and to work more economically and efficiently. In addition, the read-out data can be integrated into a document management system, which makes it possible to identify and find a document based on a wide variety of identifiers.

Text recognition also plays a major role in the digitization of archives and analogue data. Once an archive has been scanned, text recognition makes it searchable, which saves the time it would take to search the archive manually.

Conclusion

In addition to these classic application examples, there are also countless special solutions. For example, text recognition can simplify the everyday life of blind people: texts can be scanned, read out and reproduced acoustically, which enables a blind person to perceive and understand them. In the next and final part of this series, we will discuss machine learning as part of text recognition and draw a conclusion.
