Regional processing for optical character recognition of plain text on artistic background (image processing and pattern recognition)
Keywords:
OCR, regional processing, binarization, Deskew algorithm
Abstract
Optical character recognition (OCR) still struggles to read text from colored or designed images effectively, and despite current technological advancements it has been left behind. Regional processing, on the other hand, makes it easier to distinguish which pixels belong to the background and which belong to the text. Regional processing for OCR singles out letters on the artistic image, assembles them into words and the words into sentences, making it possible to access and edit the content of the original document. The objectives of the study were to take a different approach to OCR, to further enhance its accuracy, and to better analyze text on designed images in order to expand its limits. Extracting text from designed images required a different way of applying the existing algorithms, because the same problems would arise if the same approach were used. The study focused on the preprocessing stage, which has been the most common source of mistakes: recognition fails when images are not sufficiently prepared. The algorithms used to read text on designed images come from machine learning and image processing.
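To illustrate how processing by region can separate text pixels from background pixels, the following is a minimal sketch and not the study's implementation. It assumes a grayscale image stored as a list of rows, dark text on a lighter background, and a threshold set from the mean intensity of each square region; the function name and the `tile` and `offset` parameters are hypothetical.

```python
def regional_binarize(gray, tile=4, offset=10):
    """Return a 0/1 mask where 1 marks pixels assumed to be text.

    gray   : list of rows, each a list of 0-255 intensities
    tile   : side length of each square region (hypothetical parameter)
    offset : how much darker than the regional mean a pixel must be
    """
    height, width = len(gray), len(gray[0])
    mask = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            # The threshold is computed from this region only.
            values = [gray[r][c] for r in rows for c in cols]
            mean = sum(values) / len(values)
            for r in rows:
                for c in cols:
                    if gray[r][c] < mean - offset:
                        mask[r][c] = 1
    return mask


# Two 4x4 regions with different background brightness (200 and 90),
# each containing a short dark stroke.
image = [[200] * 4 + [90] * 4 for _ in range(4)]
image[1][1] = image[2][1] = 50
image[1][5] = image[2][5] = 10
mask = regional_binarize(image)
```

A single global threshold of 128 would mark the entire darker region as text, whereas the per-region threshold marks only the two strokes.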
The project focused on applying the k-nearest neighbors algorithm and the Tesseract engine to the regions of an image. The overall OCR process comprises four steps: scanning, preprocessing, feature extraction, and classification. The input to OCR is any handwritten or printed text, such as books, screenshots, and photos containing text. Such input is first scanned, which digitizes the analog document. Then the text regions within the image are located, symbols are extracted and preprocessed, and their features are extracted and recognized.
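The classification step with k-nearest neighbors can be sketched as follows. This is an illustrative example only: the abstract does not state which features were used, so the two-dimensional feature vectors (imagined here as ink density and width-to-height ratio), the training samples, and the function name are all assumptions.

```python
from collections import Counter
import math


def knn_classify(sample, training, k=3):
    """Label a feature vector by majority vote among its k nearest neighbors.

    training : list of (feature_vector, label) pairs
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical features: ink density, width/height ratio).
training = [
    ((0.20, 0.30), "I"), ((0.25, 0.35), "I"), ((0.22, 0.28), "I"),
    ((0.60, 1.00), "O"), ((0.65, 0.95), "O"), ((0.58, 1.05), "O"),
]

narrow = knn_classify((0.24, 0.31), training)
round_ = knn_classify((0.61, 0.98), training)
```

With these toy values the first sample lies among the "I" examples and the second among the "O" examples, so the votes are unanimous in each case.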
Findings revealed that images and text come in different styles, which require different preprocessing methods. Many factors affect the result of OCR on an image, and a single algorithm is not enough to address them all. The study showed that grouping colors by region is effective for extracting the text within an image. First, the program analyzes the structure of the document image and divides the page into elements such as blocks of text, tables, and images. The lines are divided into words and then into characters. Once the characters have been singled out, the program compares them with a set of pattern images. The program evaluates different ways of breaking lines into words and words into characters, and then presents the recognized text.
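The division of a line into words and characters can be sketched with a column projection over a binarized line. The abstract does not describe the segmentation method, so this is one common approach offered under assumptions: blank columns separate characters, wider blank runs separate words, and the function names and the `word_gap` parameter are hypothetical.

```python
def split_runs(mask):
    """Return (start, end) column spans that contain at least one text pixel."""
    width = len(mask[0])
    ink = [any(row[c] for row in mask) for c in range(width)]
    spans, start = [], None
    for c, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = c                      # a character begins
        elif not has_ink and start is not None:
            spans.append((start, c))       # a character ends
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def group_words(spans, word_gap=3):
    """Group character spans into words; a gap >= word_gap starts a new word."""
    words = []
    for span in spans:
        if words and span[0] - words[-1][-1][1] < word_gap:
            words[-1].append(span)
        else:
            words.append([span])
    return words


# One-row binary line: two characters, a wide gap, then one character.
line = [[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]]
chars = split_runs(line)
words = group_words(chars)
```

Trying several values of `word_gap` is one simple way to explore different variants of breaking a line into words.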
License
Copyright (c) 2015 International Research Journal on Innovations in Engineering, Science and Technology
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.