15 research outputs found

    Handwritten OCR for Indic Scripts: A Comprehensive Overview of Machine Learning and Deep Learning Techniques

    The potential uses of cursive optical character recognition (OCR) in a number of industries, particularly document digitization, archiving, and even language preservation, have attracted a lot of interest lately. The goal of this research is to provide a thorough understanding of both cutting-edge OCR methods and the unique difficulties presented by Indic scripts. A systematic literature search was conducted for this study, covering relevant publications, conference proceedings, and scientific databases up to the year 2023. Applying inclusion criteria that restricted the scope to studies addressing handwritten OCR for Indic scripts, 53 research publications were selected. The review provides a thorough analysis of the methodologies and approaches employed in the selected studies. Deep neural networks, conventional feature-based methods, machine learning techniques, and hybrid systems have all been investigated as viable answers to the problem of effectively deciphering Indic scripts, which are famously challenging to recognize. These systems rely on pre-processing techniques, segmentation schemes, and language models. The outcomes of this systematic examination demonstrate that, although handwritten OCR for Indic scripts has advanced significantly, room still exists for improvement. Future research could focus on developing trustworthy models that can handle a range of writing styles and on improving accuracy for less-studied Indic scripts. The field may advance further with the creation of curated datasets and defined standards.
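
    A minimal skeleton of the pipeline stages this survey refers to (pre-processing, segmentation, recognition, and language-model rescoring) might look as follows; every function body is an illustrative placeholder standing in for real image operations, not a method from any surveyed paper.

```python
# Illustrative skeleton of a handwritten-OCR pipeline; all bodies are
# placeholders (strings stand in for images) and none come from the survey.
def preprocess(page):
    return page.strip()                  # stand-in for binarization/denoising

def segment(page):
    return page.split()                  # stand-in for word segmentation

def recognize(word):
    return [(word, 0.9), (word.upper(), 0.1)]   # stand-in classifier output

def rescore(candidates):
    return max(candidates, key=lambda c: c[1])  # stand-in language model

page = "  namaste duniya  "
print([rescore(recognize(w))[0] for w in segment(preprocess(page))])
```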

    Recognition of compound characters in Kannada language

    Recognition of degraded printed compound Kannada characters is a challenging research problem. It has been verified experimentally that noise removal is an essential preprocessing step. Two methods are proposed for the degraded Kannada character recognition problem. Method 1 uses the conventional histogram of oriented gradients (HOG) feature extraction for character recognition; the extracted features are transformed and reduced using principal component analysis (PCA) before classification, and various classifiers are experimented with. Simple compound character classification is satisfactory with this method (more than 98% accuracy); however, the method does not perform well on the other two compound types. Method 2 is a deep convolutional neural network (CNN) model for classification, which outperforms the HOG-feature-based approach. The highest classification accuracy found is 98.8%, for simple compound character classification. The performance of the deep CNN is far better for the other two compound types, and it also turns out to be better for pooled character classes.
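
    A minimal sketch of the Method 1 pipeline described above (HOG features, PCA reduction, then a classifier) could look like the following; the image size, HOG parameters, SVM choice, and placeholder data are illustrative assumptions, not the paper's exact configuration.

```python
# HOG -> PCA -> SVM sketch for character classification. Hyperparameters
# and the random stand-in data are assumptions for illustration only.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def hog_features(images):
    # images: array of (64, 64) grayscale character crops
    return np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for img in images])

# Placeholder data standing in for preprocessed Kannada character images.
rng = np.random.default_rng(0)
X_train = hog_features(rng.random((100, 64, 64)))
y_train = rng.integers(0, 10, size=100)

# PCA keeps enough components to explain 95% of the variance, then an SVM.
clf = make_pipeline(PCA(n_components=0.95), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```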

    Digit Recognition Using Composite Features With Decision Tree Strategy

    At present, check transactions are one of the most common forms of money transfer in the market. The information for check exchange is printed using magnetic ink character recognition (MICR), widely used in the banking industry, primarily for processing check transactions. However, the magnetic ink card reader is specialized and expensive, so general accounting departments and bookkeepers often resort to manual data registration instead. An organization dealing in parts or corporate services might have to process 300 to 400 checks each day, which would require a considerable amount of labor for the registration process. The cost of a single-sided scanner is only 1/10 that of a MICR reader; hence, using image recognition technology is an economical solution. In this study, we aim to use multiple features for character recognition of the E13B font, comprising ten digits and four symbols. For the numeric part, we used statistical features, such as image density features and geometric features, with simple decision trees for classification. The symbols of E13B are composed of three distinct rectangles and are classified according to their size and relative position. Using the same sample set, an MLP, LeNet-5, AlexNet, and a hybrid CNN-SVM were trained on the numeric part as experimental control groups to verify the accuracy and speed of the proposed method. Our proposed method recognized all test samples correctly, with a recognition rate close to 100%. A prediction time of less than one millisecond per character, with an average of 0.03 ms, was achieved, over 50 times faster than state-of-the-art methods, and the accuracy is also better than all compared state-of-the-art methods. The proposed method was also deployed on an embedded device to verify that a CPU suffices for inference instead of a high-end GPU.
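
    The numeric-part approach described above (simple statistical features fed to a decision tree) might be sketched as follows; the specific features (global and per-quadrant ink densities), image size, binarization threshold, and tree depth are illustrative assumptions rather than the paper's exact feature set.

```python
# Statistical features + decision tree sketch for digit recognition.
# Feature choices and all parameters are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def digit_features(img, thresh=0.5):
    # img: (H, W) grayscale glyph; binarize, then measure ink densities.
    ink = img > thresh
    h, w = ink.shape
    quads = [ink[:h // 2, :w // 2], ink[:h // 2, w // 2:],
             ink[h // 2:, :w // 2], ink[h // 2:, w // 2:]]
    return [ink.mean()] + [q.mean() for q in quads]

rng = np.random.default_rng(1)
imgs = rng.random((200, 32, 32))          # stand-ins for E13B digit crops
labels = rng.integers(0, 10, size=200)

X = np.array([digit_features(im) for im in imgs])
tree = DecisionTreeClassifier(max_depth=8).fit(X, labels)
print(tree.predict(X[:3]))
```

    A shallow tree over a handful of cheap features is consistent with the sub-millisecond, CPU-only prediction times reported above.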

    Ensemble learning using multi-objective optimisation for arabic handwritten words

    Arabic handwriting recognition is a dynamic and stimulating field of study within pattern recognition, and such systems play quite a significant part in today's global environment. Recognition is a widespread and computationally costly task due to cursive writing, a massive vocabulary, and variation in writing style. Based on the literature, existing features lack data-supportive techniques and well-constructed geometric features. Most ensemble learning approaches are based on the assumption of linear combination, which is not valid due to differences in data types. Also, existing approaches to classifier generation do not support decision-making for selecting the most suitable classifier, and handling these differences in data types requires multi-objective optimisation. In this thesis, a new type of handwriting feature is proposed using Segments Interpolation (SI), which finds the best-fitting line in each window, together with a model for finding the best operating-point window size for the SI features. A Multi-Objective Ensemble Oriented (MOEO) method is formulated to control the classifier topology and provide feedback for changing the classifiers' topology and weights. It is based on an extension of the Non-dominated Sorting Genetic Algorithm (NSGA-II), designated Random Subset based Parents Selection (RSPS-NSGA-II), which handles the number of neurons and accuracy. Evaluation metrics are considered from two perspectives: classification and multi-objective optimisation. The experimental design is based on two subsets of the IFN/ENIT database, consisting of 10 classes (C10) and 22 classes (C22) respectively. The features were tested with a Support Vector Machine (SVM) and an Extreme Learning Machine (ELM). Performance improved due to the SI feature: SI shows a significant result with SVM, at 88.53% for C22. RSPS for C10 at k=2 achieved 91% accuracy with fewer neurons than NSGA-II, and for C22 at k=10, accuracy increased to 81%, compared with 78% for NSGA-II. Future work may consider introducing more features into the system, applying it to other languages, and integrating it with sequence learning for higher accuracy.
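
    The Segments Interpolation (SI) feature described above (the best-fitting line in each window) might be sketched as follows; the use of a contour profile as the input signal and the window size are illustrative assumptions, not the thesis's exact formulation.

```python
# Segments Interpolation sketch: slide a window over a 1-D contour signal
# and keep the slope/intercept of the least-squares line in each window.
# Input signal choice and window size are illustrative assumptions.
import numpy as np

def si_features(profile, window=16):
    # profile: 1-D array, e.g. the lower-contour height of each image column.
    feats = []
    for start in range(0, len(profile) - window + 1, window):
        seg = profile[start:start + window]
        x = np.arange(window)
        slope, intercept = np.polyfit(x, seg, deg=1)  # best-fitting line
        feats.extend([slope, intercept])
    return np.array(feats)

profile = np.sin(np.linspace(0, 6, 128)) * 10 + 40    # toy contour profile
print(si_features(profile).shape)                      # (16,) for 8 windows
```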

    Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts

    This dissertation presents a flexible and robust offline handwriting recognition system which is tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging and yet-to-be-solved problems in machine learning. While a few popular scripts (like Latin) have received a lot of attention, many other widely used scripts (like Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably, but also can be used for almost any alphabetic writing system. The framework has been rigorously tested for Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how the framework can be adapted to other scripts. The base of this design is a character spotting network which detects the locations of different script elements (such as characters and diacritics) in an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla, achieving a Character Recognition Accuracy (CRA) of 94.8%, and it is also one of the most flexible architectures yet presented. Recognition of Korean was achieved with a 91.2% CRA. In addition, a powerful technique of autonomous tagging was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character spotting method and autonomous tagging brings the entire offline recognition problem very close to a singular solution. Additionally, a database named the Boise State Bangla Handwriting Dataset was developed. This is one of the richest offline datasets currently available for Bangla, and it has been made publicly accessible to accelerate research progress. Many other tools were developed, and experiments were conducted to validate this framework more rigorously by evaluating the method against external datasets (CMATERdb 1.1.1, the Indic Word Dataset, and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology, and the outcome of this research moves the field significantly ahead.
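
    The transcript-assembly step described above (detected classes ordered by their location information) might be sketched as follows; the detection tuple format and the simple left-to-right ordering rule are illustrative assumptions, and the real system's handling of diacritics is necessarily more involved.

```python
# Transcript assembly from character-spotting detections: order detected
# script elements by position and concatenate their labels. The detection
# format and ordering rule are illustrative assumptions.
from typing import List, Tuple

Detection = Tuple[str, float, float, float, float]  # (label, x1, y1, x2, y2)

def assemble_transcript(detections: List[Detection]) -> str:
    # Order detections by the horizontal center of their bounding box
    # (Bangla is written left to right).
    ordered = sorted(detections, key=lambda d: (d[1] + d[3]) / 2.0)
    return "".join(label for label, *_ in ordered)

# Toy detections standing in for a character-spotting network's output.
dets = [("a", 30.0, 5.0, 50.0, 40.0),
        ("k", 5.0, 5.0, 28.0, 40.0),
        ("r", 55.0, 5.0, 75.0, 40.0)]
print(assemble_transcript(dets))  # -> "kar"
```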

    Automatic intrapersonal variability modeling for offline signature augmentation

    Advisor: Luiz Eduardo Soares de Oliveira. Co-advisors: Robert Sabourin and Alceu de Souza Britto Jr. Doctoral thesis, Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defense: Curitiba, 19/07/2021. Includes references: p. 93-102. Area of concentration: Computer Science.
    Abstract: Normally, in a real-world scenario, only a few signatures are available to train an automatic signature verification system (ASVS). To address this issue, several offline signature duplication approaches have been proposed over the years. These approaches generate new synthetic signature samples by applying transformations to the original signature image. Some of them generate realistic samples, especially the duplicator. This method uses a set of parameters to model the writer's behavior (writer variability) during the signing act. However, these parameters are empirically defined. This kind of approach can be time consuming and can select parameters that do not describe the real writer variability. The main hypothesis of this work is that the writer variability observed in the image space can be transferred to the feature space as well. Therefore, this work proposes a new method to automatically model the writer variability for subsequent signature duplication in the image space (duplicator) and the feature space (Gaussian filter and a variation of Knop's method). This work also proposes a new offline signature duplication method which generates the synthetic samples directly in the feature space using a Gaussian filter. Furthermore, a new approach to assess the quality of the synthetic samples in the feature space is introduced. The limitations and advantages of both signature augmentation approaches are also explored. In addition to using the new approach to assess the quality of the samples, the performance of an ASVS was evaluated using the samples and three well-known offline signature datasets: GPDS-300, MCYT-75, and CEDAR. For the most used one, GPDS-300, when the SVM classifier was trained with only one genuine signature per writer, it achieved an Equal Error Rate (EER) of 5.71%. When the classifier was also trained with the synthetic samples generated in the image space, the EER dropped to 1.08%. When the classifier was trained with the synthetic samples generated by the Gaussian filter, the EER dropped to 1.04%.
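
    The feature-space duplication idea described above (synthetic samples generated around the genuine ones under a Gaussian model of writer variability) might be sketched as follows; the per-feature standard deviation estimate and the noise scale are illustrative assumptions, not the thesis's exact model.

```python
# Feature-space signature duplication sketch: estimate per-dimension writer
# variability from the few genuine feature vectors and sample Gaussian
# duplicates around them. Variability model and scale are assumptions.
import numpy as np

def duplicate_in_feature_space(genuine, n_duplicates=10, scale=1.0, seed=0):
    # genuine: (n_samples, n_features) feature vectors of one writer.
    rng = np.random.default_rng(seed)
    sigma = genuine.std(axis=0) * scale          # writer-variability estimate
    base = genuine[rng.integers(0, len(genuine), size=n_duplicates)]
    return base + rng.normal(0.0, sigma, size=(n_duplicates, genuine.shape[1]))

rng = np.random.default_rng(42)
genuine = rng.normal(size=(3, 2048))             # e.g. 3 genuine signatures
synthetic = duplicate_in_feature_space(genuine, n_duplicates=20)
print(synthetic.shape)                           # (20, 2048)
```

    Sampling around each genuine vector, rather than around the class mean, keeps the duplicates inside the writer's own region of the feature space.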

    Machine learning for ancient languages: a survey

    Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literature. Technological aids have long supported the study of ancient texts, but in recent years advances in artificial intelligence and machine learning have enabled analyses on a scale and in a detail that are reshaping the field of humanities, similarly to how microscopes and telescopes have contributed to the realm of science. This article aims to provide a comprehensive survey of published research using machine learning for the study of ancient texts written in any language, script, and medium, spanning over three and a half millennia of civilizations around the ancient world. To analyze the relevant literature, we introduce a taxonomy of tasks inspired by the steps involved in the study of ancient documents: digitization, restoration, attribution, linguistic analysis, textual criticism, translation, and decipherment. This work offers three major contributions: first, mapping the interdisciplinary field carved out by the synergy between the humanities and machine learning; second, highlighting how active collaboration between specialists from both fields is key to producing impactful and compelling scholarship; third, highlighting promising directions for future work in this field. Thus, this work promotes and supports the continued collaborative impetus between the humanities and machine learning.

    Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Languages

    The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans, and the manual creation of transcriptions is a costly and time-consuming process. Thus, automatic methods could serve as an alternative approach to creating transcriptions of the text in the notebooks. In order to evaluate the use of automatic methods, a corpus of Bushman texts and their associated transcriptions was created. The creation of this corpus involved: the development of a custom method for encoding the Bushman script, which contains complex diacritics; the creation of a tool for creating and transcribing the texts in the notebooks; and the running of a series of workshops in which the tool was used to create the corpus. The corpus was then used to evaluate various techniques for automatic transcription in order to determine which approaches were best suited to the complex Bushman script. These techniques included Support Vector Machines, Artificial Neural Networks and Hidden Markov Models as machine learning algorithms, coupled with different descriptive features. The effect of the texts used for training the machine learning algorithms was also investigated, as was the use of a statistical language model. It was found that, for Bushman word recognition, a Support Vector Machine with Histograms of Oriented Gradients features resulted in the best performance and, for Bushman text line recognition, Marti & Bunke features used with Hidden Markov Models resulted in the best performance. The automatic transcription of the Bushman texts proved to be difficult, and the performance of the different recognition systems was largely affected by the complexities of the Bushman script. It was also found that, besides influencing which techniques are most appropriate for automatic handwriting recognition, the texts used in an automatic handwriting recognition system play a large role in determining whether or not automatic recognition should be attempted at all.
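
    Sliding-window geometric features in the spirit of the Marti & Bunke features mentioned above might be sketched as follows; only three of the nine classic per-column features are shown, the binarization threshold is an illustrative assumption, and passing the resulting observation sequence to an HMM toolkit is omitted.

```python
# Per-column geometric features over a binarized text-line image, in the
# spirit of Marti & Bunke features; three of the nine classic features.
import numpy as np

def column_features(line_img, thresh=0.5):
    ink = line_img > thresh                      # (H, W) boolean ink mask
    h, w = ink.shape
    rows = np.arange(h)
    feats = np.zeros((w, 3))
    for c in range(w):
        col = ink[:, c]
        n = col.sum()
        feats[c, 0] = n / h                                  # ink density
        feats[c, 1] = (rows[col].mean() / h) if n else 0.5   # center of gravity
        feats[c, 2] = (rows[col].min() / h) if n else 1.0    # upper contour
    return feats                                 # (W, 3) observation sequence

line = np.random.default_rng(3).random((48, 300))  # stand-in for a text line
print(column_features(line).shape)                 # (300, 3)
```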