17 research outputs found

    Handwritten OCR for Indic Scripts: A Comprehensive Overview of Machine Learning and Deep Learning Techniques

    The potential uses of cursive optical character recognition (OCR) in a number of industries, particularly document digitization, archiving, and even language preservation, have attracted a lot of interest lately. The goal of this research is to provide a thorough understanding of both cutting-edge OCR methods and the unique difficulties presented by Indic scripts. A thorough literature search was conducted for this study, during which relevant journal publications, conference proceedings, and scientific archives were searched up to the year 2023. Applying inclusion criteria developed to concentrate on studies addressing only handwritten OCR for Indic scripts, 53 research publications were selected. The review provides a thorough analysis of the methodologies and approaches employed in the selected studies. Deep neural networks, conventional feature-based methods, machine learning techniques, and hybrid systems have all been investigated as viable answers to the problem of effectively deciphering Indic scripts, which are notoriously challenging to recognize. To operate, these systems require pre-processing techniques, segmentation schemes, and language models. The outcomes of this systematic examination demonstrate that, although handwritten OCR for Indic scripts has advanced significantly, room still exists for improvement. Future research could focus on developing trustworthy models that can handle a range of writing styles and on enhancing accuracy for less-studied Indic scripts. The field may advance with the creation of curated datasets and defined standards.

    Deep Learning Based Models for Offline Gurmukhi Handwritten Character and Numeral Recognition

    Over the last few years, several researchers have worked on handwritten character recognition and have proposed various techniques to improve recognition performance for Indic and non-Indic scripts. Here, a Deep Convolutional Neural Network is proposed that learns deep features for offline Gurmukhi handwritten character and numeral recognition (HCNR). The proposed network trains and tests efficiently and exhibits good recognition performance. Two primary datasets comprising offline handwritten Gurmukhi characters and Gurmukhi numerals have been employed in the present work. The testing accuracies achieved using the proposed network are 98.5% for characters and 98.6% for numerals.
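    As a framework-free illustration of the convolutional feature extraction such a network performs, the convolution, ReLU, and max-pooling chain at its core can be sketched in plain Python. The 4×4 image, the edge kernel, and the sizes below are invented for illustration, not the paper's architecture.

```python
# Toy sketch of one convolution -> ReLU -> max-pool stage of a CNN.
# Pure Python, no framework; values are illustrative only.

def conv2d_valid(image, kernel):
    """Valid-mode 2D convolution (cross-correlation, as in most DL libraries)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + u][j + v] * kernel[u][v]
                    for u in range(kh) for v in range(kw))
            row.append(s)
        out.append(row)
    return out

def relu(fmap):
    return [[max(0.0, x) for x in row] for row in fmap]

def maxpool2x2(fmap):
    """Non-overlapping 2x2 max pooling (shrinks the feature map)."""
    return [[max(fmap[i][j], fmap[i][j+1], fmap[i+1][j], fmap[i+1][j+1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]          # a crude vertical stroke
kernel = [[1, -1],
          [1, -1]]              # responds to right edges of the stroke

features = maxpool2x2(relu(conv2d_valid(image, kernel)))   # [[2]]
```

    A real recognizer stacks many such stages with learned kernels and ends in fully connected layers; this only shows the mechanics.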

    Symbolic and Deep Learning Based Data Representation Methods for Activity Recognition and Image Understanding at Pixel Level

    Efficient representation of large amounts of data, particularly images and video, helps in the analysis, processing, and overall understanding of the data. In this work, we present two frameworks that encapsulate the information present in such data. First, we present an automated symbolic framework to recognize particular activities in real time from videos. The framework uses regular expressions for symbolically representing (possibly infinite) sets of motion characteristics obtained from a video. It is a uniform framework that handles trajectory-based and periodic articulated activities and provides polynomial-time graph algorithms for fast recognition. The regular expressions representing motion characteristics can either be provided manually or learnt automatically from positive and negative examples of strings (that describe dynamic behavior) using offline automata learning frameworks. Confidence measures are associated with recognitions using the Levenshtein distance between a string representing a motion signature and the regular expression describing an activity. We have used our framework to recognize trajectory-based activities like vehicle turns (U-turns, left and right turns, and K-turns), vehicle starts and stops, persons running and walking, and periodic articulated activities like digging, waving, boxing, and clapping in videos from the VIRAT public dataset, the KTH dataset, and a set of videos obtained from YouTube. Next, we present a core sampling framework that is able to use activation maps from several layers of a Convolutional Neural Network (CNN) as features for another neural network, using transfer learning to provide an understanding of an input image. The intermediate map responses of a CNN contain information about an image that can be used to extract contextual knowledge about it.
Our framework creates a representation that combines features from the test data and the contextual knowledge gained from the responses of a pretrained network, processes it, and feeds it to a separate Deep Belief Network. We use this representation to extract more information from an image at the pixel level, hence gaining an understanding of the whole image. We experimentally demonstrate the usefulness of our framework using a pretrained VGG-16 model to perform segmentation on the BAERI dataset of Synthetic Aperture Radar (SAR) imagery and the CAMVID dataset. Using this framework, we also reconstruct images by removing noise from noisy character images. The reconstructed images are encoded using quadtrees, which can be an efficient representation when learning from sparse features. Handwritten character images are quite susceptible to noise, so preprocessing stages that make the raw data cleaner can improve the efficacy of their use. We improve upon the efficiency of probabilistic quadtrees by using a pixel-level classifier to extract the character pixels and remove noise from the images. The pixel-level denoiser uses a CNN pretrained on a large image dataset and uses transfer learning to aid the reconstruction of characters. In this work, we primarily deal with classification of noisy characters: we create noisy versions of the handwritten Bangla Numeral and Basic Character datasets and use them, along with the Noisy MNIST dataset, to demonstrate the usefulness of our approach.
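    The Levenshtein-based confidence measure from the activity-recognition framework above can be sketched as a normalized edit distance. The motion-signature strings below are invented placeholders, not the framework's actual alphabet.

```python
# Sketch of a confidence score: edit distance between an observed motion
# signature and a canonical activity string, normalized into [0, 1].

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def confidence(observed, canonical):
    """1.0 for an exact match, decreasing with edit distance."""
    d = levenshtein(observed, canonical)
    return 1.0 - d / max(len(observed), len(canonical), 1)

# e.g. a noisy signature vs. a canonical one: 1 - 1/7
score = confidence("FFLLFF", "FFLLLFF")
```

    The framework matches strings against regular expressions rather than a single canonical string, but the distance-to-confidence normalization works the same way.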
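    The quadtree encoding mentioned above can be sketched in a few lines: a binary image is split recursively into quadrants, and uniform quadrants become leaves, so sparse character images compress well. The 4×4 block below is a toy input, not the thesis's probabilistic-quadtree implementation.

```python
# Minimal quadtree encoder for a square binary image (side a power of two).

def quadtree(img, x=0, y=0, size=None):
    """Return a 0/1 leaf for a uniform block, else a 4-tuple of children
    in (top-left, top-right, bottom-left, bottom-right) order."""
    if size is None:
        size = len(img)
    vals = {img[y + i][x + j] for i in range(size) for j in range(size)}
    if len(vals) == 1:
        return vals.pop()
    h = size // 2
    return (quadtree(img, x,     y,     h),
            quadtree(img, x + h, y,     h),
            quadtree(img, x,     y + h, h),
            quadtree(img, x + h, y + h, h))

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
tree = quadtree(img)   # (0, 1, 0, (0, 1, 0, 0))
```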

    Handwritten Digit Recognition and Classification Using Machine Learning

    In this paper, multiple learning techniques for optical character recognition (OCR)-based handwritten digit recognition are examined, and a new accuracy level for recognition of the MNIST dataset is reported. The proposed framework involves three primary parts: image pre-processing, feature extraction, and classification. This study strives to improve recognition accuracy to more than 99% in handwritten digit recognition. As will be seen, pre-processing and feature extraction play crucial roles in reaching the highest accuracy.
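    Pre-processing for digit images of this kind commonly includes binarization and centring the digit by its centre of mass. A minimal sketch under those assumptions (toy 5×5 grid, illustrative threshold; not necessarily the paper's exact steps):

```python
# Binarize a grayscale digit image, then shift it so its centre of mass
# sits at the image centre, as is commonly done for MNIST-style inputs.

def binarize(img, thresh=128):
    return [[1 if p >= thresh else 0 for p in row] for row in img]

def center_of_mass(img):
    total = sum(sum(row) for row in img)
    cy = sum(i * p for i, row in enumerate(img) for p in row) / total
    cx = sum(j * p for row in img for j, p in enumerate(row)) / total
    return cy, cx

def shift(img, dy, dx):
    """Translate foreground pixels, dropping anything shifted off-grid."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if img[i][j] and 0 <= i + dy < h and 0 <= j + dx < w:
                out[i + dy][j + dx] = 1
    return out

# A single bright pixel in the corner gets moved to the centre (2, 2).
raw = [[200 if (i, j) == (0, 0) else 0 for j in range(5)] for i in range(5)]
b = binarize(raw)
cy, cx = center_of_mass(b)
centered = shift(b, round(2 - cy), round(2 - cx))
```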

    Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts

    This dissertation presents a flexible and robust offline handwriting recognition system, which is tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging and yet-to-be-solved problems in machine learning. While a few popular scripts (like Latin) have received a lot of attention, many other widely used scripts (like Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably, but can also be used for almost any alphabetic writing system. The framework has been rigorously tested for Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how the framework can be transferred to other scripts. The base of this design is a character-spotting network which detects the locations of different script elements (such as characters and diacritics) in an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla and achieves a Character Recognition Accuracy (CRA) of 94.8%. It is also one of the most flexible architectures yet presented. Recognition of Korean was achieved with a 91.2% CRA. In addition, a powerful technique of autonomous tagging was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character-spotting method and autonomous tagging brings the entire offline recognition problem very close to a singular solution. Additionally, a database named the Boise State Bangla Handwriting Dataset was developed.
This is one of the richest offline datasets currently available for Bangla and this has been made publicly accessible to accelerate the research progress. Many other tools were developed and experiments were conducted to more rigorously validate this framework by evaluating the method against external datasets (CMATERdb 1.1.1, Indic Word Dataset and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology and the outcome of this research moves the field significantly ahead
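    The character-spotting design above forms a transcript from detected classes and their location information. A minimal sketch of that assembly step, assuming left-to-right reading order and using invented labels and bounding boxes:

```python
# Turn character-spotting detections into a transcript by ordering them
# along the writing direction (here: by the box's left edge).

def to_transcript(detections):
    """detections: list of (label, (x, y, w, h)); order by left edge."""
    return "".join(label for label, (x, y, w, h)
                   in sorted(detections, key=lambda d: d[1][0]))

dets = [("a", (40, 5, 18, 20)),
        ("k", (2, 4, 20, 22)),
        ("t", (21, 3, 17, 21))]
word = to_transcript(dets)   # "kta"
```

    Scripts with stacked or two-dimensional layouts (like Korean syllable blocks) need a richer ordering rule than a single x-sort, which is part of what makes them harder.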

    Development of Features for Recognition of Handwritten Odia Characters

    In this thesis, we propose four different schemes for recognition of handwritten atomic Odia characters, which include forty-seven alphabets and ten numerals. Odia is the mother tongue of the state of Odisha in the Republic of India. Optical character recognition (OCR) for many languages is quite mature, and industry-standard OCR systems are already available, but for the Odia language OCR is still a challenging task. Further, the features described for other languages cannot be directly utilized for Odia character recognition of either printed or handwritten text. Thus, the primary thrust has been to propose features and utilize a classifier to derive significant recognition accuracy. Due to the non-availability of a handwritten Odia database for validation of the proposed schemes, we have collected samples from individuals through a digital note maker to generate a database of large size. The database consists of a total of 17,100 samples (150 × 2 × 57) collected from 150 individuals at two different times for 57 characters. This database has been named the Odia handwritten character set version 1.0 (OHCS v1.0) and is made available at http://nitrkl.ac.in/Academic/Academic_Centers/Centre_For_Computer_Vision.aspx for the use of researchers. The first scheme divides the contour of each character into thirty segments. Taking the centroid of the character as the base point, three primary features (length, angle, and chord-to-arc ratio) are extracted from each segment. Thus, there are 30 feature values for each primary attribute and a total of 90 feature points. A back-propagation neural network has been employed for recognition, and performance comparisons are made with competing schemes. The second contribution falls in the line of feature reduction of the primary features derived in the earlier contribution.
A fuzzy inference system has been employed to generate an aggregated feature vector of size 30 from the 90 feature points, representing the most significant features for each character. For recognition, a six-state hidden Markov model (HMM) is employed for each character; as a consequence, we have fifty-seven ergodic HMMs with six states each. An accuracy of 84.5% has been achieved on our dataset. The third contribution involves the selection of evidence, namely the most informative local shape-contour features. A dedicated distance metric, far_count, is used in computing the information-gain values for possible segments of different lengths extracted from the whole shape contour of a character. The segment with the highest information-gain value is treated as the evidence and mapped to the corresponding class. An evidence dictionary is developed from the evidence of all character classes and is used for testing purposes. An overall testing accuracy rate of 88% is obtained. The final contribution deals with the development of a hybrid feature derived from the discrete wavelet transform (DWT) and discrete cosine transform (DCT). Experimentally, it has been observed that a 3-level DWT decomposition with 72 DCT coefficients from each high-frequency component as features gives a testing accuracy of 86% with a neural classifier. The suggested features are studied in isolation, and extensive simulations have been carried out along with other existing schemes using the same dataset. Further, to study the generalization behavior of the proposed schemes, they are applied to English and Bangla handwritten datasets. Performance parameters like recognition rate and misclassification rate are computed and compared. Further, as we progress from one contribution to the next, the proposed scheme is compared with the earlier proposed schemes.
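    The first scheme's three primary segment features (length, angle, and chord-to-arc ratio) can be sketched for a single contour segment. The points and centroid below are invented, and the segment is assumed to be an ordered list of (x, y) samples:

```python
# Compute arc length, angle of the segment midpoint from the character
# centroid, and the chord-to-arc ratio for one contour segment.

import math

def segment_features(points, centroid):
    """points: ordered (x, y) samples of one contour segment."""
    arc = sum(math.dist(points[i], points[i + 1])
              for i in range(len(points) - 1))
    chord = math.dist(points[0], points[-1])          # straight-line span
    mx, my = points[len(points) // 2]                 # segment midpoint
    angle = math.atan2(my - centroid[1], mx - centroid[0])
    return arc, angle, chord / arc

# An invented quarter-circle-like segment around a centroid at the origin.
seg = [(1.0, 0.0), (0.9, 0.4), (0.7, 0.7), (0.4, 0.9), (0.0, 1.0)]
length, angle, ratio = segment_features(seg, centroid=(0.0, 0.0))
```

    The ratio is 1.0 for a straight segment and falls as the segment curves, which is what makes it a useful shape cue alongside length and angle.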
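    A from-scratch sketch of the final contribution's hybrid DWT+DCT feature: one Haar decomposition level followed by an unnormalized DCT-II on a high-frequency sub-band, shown on a toy 4×4 block. The thesis uses 3 decomposition levels and 72 coefficients per high-frequency sub-band; this only shows the mechanics.

```python
# One level of a 2D Haar wavelet decomposition, then DCT-II coefficients
# of a high-frequency sub-band as a feature vector.

import math

def haar_rows(m):
    """One Haar step along rows: averages (low-pass) and differences (high-pass)."""
    lo = [[(r[2*j] + r[2*j+1]) / 2 for j in range(len(r) // 2)] for r in m]
    hi = [[(r[2*j] - r[2*j+1]) / 2 for j in range(len(r) // 2)] for r in m]
    return lo, hi

def transpose(m):
    return [list(c) for c in zip(*m)]

def haar2d(m):
    """Return LL, LH, HL, HH sub-bands (rows-then-columns order) of one level."""
    lo, hi = haar_rows(m)
    ll, lh = (transpose(b) for b in haar_rows(transpose(lo)))
    hl, hh = (transpose(b) for b in haar_rows(transpose(hi)))
    return ll, lh, hl, hh

def dct(xs):
    """1D DCT-II (unnormalized)."""
    n = len(xs)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(xs)) for k in range(n)]

img = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
ll, lh, hl, hh = haar2d(img)
features = [c for row in hh for c in dct(row)]   # coefficients from one band
```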

    Detection and recognition of textual information from drug box images using deep learning and computer vision

    The scope of this thesis work is to implement an OCR pipeline capable of detecting and recognizing text instances when an image is given as input. The pipeline is divided into two steps: a detector, whose scope is to detect the regions where text is present, and a recognizer, whose scope is to recognize and read the detected words and numbers. The work was initially developed during an internship at the start-up PatchAI, now an Alira Health company. The application of the algorithm in this context is the recognition of textual information on drug boxes. The idea is to deploy such a pipeline in an app, so that it can be used by patients, who can take a picture of the box and receive information about the medicine, in particular its posology. The use of a vocal assistant that reads the recognized text aloud is also explored, being an interesting application for elderly or visually impaired people.
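    The two-step structure can be sketched as follows. The connected-component labeller is only a stand-in for a trained text detector, and `recognize` is a stub, not the thesis's actual models:

```python
# Toy detector + recognizer pipeline: propose text regions as bounding
# boxes of 4-connected foreground blobs, then "recognize" each crop.

def detect_regions(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground blobs."""
    h, w = len(mask), len(mask[0])
    seen, boxes = set(), []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and (sx, sy) not in seen:
                stack, xs, ys = [(sx, sy)], [], []
                seen.add((sx, sy))
                while stack:                      # iterative flood fill
                    x, y = stack.pop()
                    xs.append(x)
                    ys.append(y)
                    for nx, ny in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            stack.append((nx, ny))
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return sorted(boxes)

def recognize(box):
    return "<word>"   # placeholder for a trained recognition model

mask = [[1, 1, 0, 0, 1],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 0]]
texts = [recognize(b) for b in detect_regions(mask)]
```

    Real pipelines use learned detectors (e.g. region-proposal or segmentation networks) and sequence recognizers, but the detect-then-recognize hand-off looks the same.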

    Offline signature verification using writer-dependent ensembles and static classifier selection with handcraft features

    Advisor: Eduardo Todt. Master's dissertation, Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defense: Curitiba, 17/02/2022. Includes references: p. 85-94. Area of concentration: Computer Science.
    Abstract: Signature recognition and identification in documents and manuscripts are challenging tasks that have been studied over time, especially the problem of discerning genuine signatures from forgeries. With the recent advancement of technology, especially in the field of computing, research in this area has become increasingly frequent, enabling new methods of signature analysis and increasing the accuracy of, and confidence in, verification. There is still much to explore in this area within computing. Signature verification generally consists of obtaining features of a signature and using them to distinguish it from others. Studies proposing different types of methods have been carried out in recent years in order to improve the results obtained by signature verification and identification systems. Different ways of extracting features have been explored, such as artificial neural networks aimed specifically at signature verification, like ResNet and SigNet, which represent the state of the art in this research area. Despite this, simpler feature extraction methods are still widely used, such as the Histogram of Oriented Gradients (HOG), Local Binary Patterns (LBP), and Local Phase Quantization (LPQ), which in many cases present results similar to the state of the art. Moreover, different ways of combining information from feature extractors and classifier outputs have been proposed, such as feature selectors, machine-committee methods, and feature-quality analysis algorithms. The work developed here explores different feature extraction methods combined in an ensemble, where each ensemble is built in a writer-dependent way and specifically adapted to recognize the best features for each author, learning which combinations of classifiers with a given group of features best recognize that author's signatures. The performance and functionality of the system were compared with the main works in the area developed in recent years, with tests carried out on the CEDAR, MCYT, and UTSig databases. Although it does not surpass the state of the art, the system performed well and is comparable with other important works in the area. In addition, the system demonstrated the efficiency of Support Vector Machine (SVM) classifiers and voters for meta-classification, as well as the potential of some feature extractors for signature verification, such as the Compound Local Binary Pattern (CLBP).
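    One of the handcrafted descriptors named above, Local Binary Patterns (LBP), can be sketched from scratch: each pixel is coded by thresholding its 8 neighbours against it, and the image is described by the histogram of those codes. The 4×4 patch is invented; real systems use larger images and often the "uniform pattern" variant.

```python
# Basic 8-neighbour LBP codes and their 256-bin histogram.

def lbp_code(img, y, x):
    """8-bit code for the pixel at (y, x), clockwise from the top-left."""
    c = img[y][x]
    nbrs = [img[y-1][x-1], img[y-1][x], img[y-1][x+1], img[y][x+1],
            img[y+1][x+1], img[y+1][x], img[y+1][x-1], img[y][x-1]]
    return sum((v >= c) << k for k, v in enumerate(nbrs))

def lbp_histogram(img):
    """Histogram of codes over interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

patch = [[10, 10, 10, 10],
         [10, 90, 20, 10],
         [10, 10, 10, 10],
         [10, 10, 10, 10]]
hist = lbp_histogram(patch)
```

    Such histograms, computed per cell and concatenated, are the kind of handcrafted feature vector fed to the writer-dependent classifier ensembles described above.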