    Arabic Text Recognition and Machine Translation

    [EN] Research on Arabic Handwritten Text Recognition (HTR) and Arabic-English Machine Translation (MT) has usually been approached as two independent areas of study. However, creating a single system that combines both areas, in order to generate English translations from images containing Arabic text, remains a very challenging task. This process can be interpreted as the translation of Arabic text images. In this thesis, we propose a system that recognizes Arabic handwritten text images and translates the recognized text into English. This system is built from the combination of an HTR system and an MT system. Regarding the HTR system, our work focuses on the use of Bernoulli Hidden Markov Models (BHMMs). BHMMs have proven to work very well with Latin script; empirical results based on them have been reported on well-known corpora such as IAM and RIMES. In this thesis, these results are extended to Arabic script, in particular to the well-known IfN/ENIT and NIST OpenHaRT databases for Arabic handwritten text. The need to transcribe Arabic text is not limited to handwritten text; it also covers printed text. Printed Arabic text can be regarded as a simplified form of handwritten text. Thus, for this kind of text, we also propose Bernoulli HMMs. In addition, we compare BHMMs with state-of-the-art technology based on neural networks. A key idea that has proven to be very effective in this application of Bernoulli HMMs is the use of a sliding window of adequate width for feature extraction. This idea has allowed us to obtain very competitive results in the recognition of both Arabic handwriting and printed text. Indeed, a system based on it ranked first at the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database. Moreover, this idea has been refined by using repositioning techniques for extracted windows, leading to further improvements in Arabic text recognition.
In the case of handwritten text, this refinement improved our system, which ranked first at the ICFHR 2010 Arabic handwriting recognition competition on IfN/ENIT. In the case of printed text, this refinement led to an improved system which ranked second at the ICDAR 2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text on APTI. Furthermore, this refinement was used with neural-network-based technology, which led to state-of-the-art results. For machine translation, the system was based on the combination of three state-of-the-art statistical models: the standard phrase-based models, the hierarchical phrase-based models, and the N-gram phrase-based models. This combination was done using the Recognizer Output Voting Error Reduction (ROVER) method. Finally, we propose three methods of combining HTR and MT to develop an Arabic image translation system. The system was evaluated on the NIST OpenHaRT database, where competitive results were obtained.
    Alkhoury, I. (2015). Arabic Text Recognition and Machine Translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/53029

    Window repositioning for Printed Arabic Recognition

    [EN] Bernoulli HMMs are conventional HMMs in which the emission probabilities are modeled with Bernoulli mixtures. They have recently been applied, with good results, in off-line text recognition in many languages, in particular, Arabic. A key idea that has proven to be very effective in this application of Bernoulli HMMs is the use of a sliding window of adequate width for feature extraction. This idea has allowed us to obtain very competitive results in the recognition of both Arabic handwriting and printed text. Indeed, a system based on it ranked first at the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database. More recently, this idea has been refined by using repositioning techniques for extracted windows, leading to further improvements in Arabic handwriting recognition. In the case of printed text, this refinement led to an improved system which ranked second at the ICDAR 2013 second competition on APTI, only at a marginal distance from the best system. In this work, we describe the development of this improved system. Following evaluation protocols similar to those of the competitions on APTI, exhaustive experiments are detailed from which state-of-the-art results are obtained.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/ICT-287755) under grant agreement no. 287755. The research is also supported by the Spanish Government (Plan E, iTrans2 TIN2009-14511 and AECID 2011/2012 grant).
    Alkhoury, I.; Giménez Pastor, A.; Juan, A.; Andrés Ferrer, J. (2015). Window repositioning for Printed Arabic Recognition. Pattern Recognition Letters. 51:86-93. https://doi.org/10.1016/j.patrec.2014.08.009
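
    The abstract does not spell out how the sliding window or the repositioning step is computed. The sketch below is one plausible reading, not the authors' implementation: each column of a binarized text image is replaced by the flattened window of `width` columns centred on it, and the optional repositioning shifts each window vertically so that its centre of mass sits on the middle row. The function names, the zero padding at the image borders, and the choice of vertical-only repositioning are assumptions made for illustration.

```python
import numpy as np


def recenter_vertically(win):
    """Shift a binary window up or down so its centre of mass lies on the middle row.

    Assumption: vertical-only repositioning; rows shifted out of the window are dropped.
    """
    rows = win.shape[0]
    mass = win.sum()
    if mass == 0:
        return win  # empty window: nothing to align
    row_idx = np.arange(rows)
    centre = (win.sum(axis=1) * row_idx).sum() / mass
    shift = int(round((rows - 1) / 2 - centre))
    out = np.zeros_like(win)
    if shift >= 0:
        out[shift:, :] = win[:rows - shift, :]   # move ink down
    else:
        out[:rows + shift, :] = win[-shift:, :]  # move ink up
    return out


def sliding_window_features(img, width=3, reposition=False):
    """Turn a binary image (rows x cols, 1 = ink) into one binary vector per column.

    Each vector is the flattened window of `width` columns centred on that column.
    Assumption: borders are padded with background (zeros).
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a centre column")
    half = width // 2
    padded = np.pad(img, ((0, 0), (half, half)), mode="constant")
    feats = []
    for c in range(img.shape[1]):
        win = padded[:, c:c + width]
        if reposition:
            win = recenter_vertically(win)
        feats.append(win.reshape(-1))
    return np.array(feats, dtype=np.uint8)
```

    With a 5-row image and `width=3`, every column yields a 15-dimensional binary vector; these vectors are what a Bernoulli HMM would then consume as its observation sequence.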

    Towards robust real-world historical handwriting recognition

    In this thesis, we build a bridge from the past to the future by using artificial-intelligence methods for text recognition in a historical Dutch collection of the Natuurkundige Commissie that explored Indonesia (1820-1850). In spite of the successes of systems like 'ChatGPT', reading historical handwriting is still quite challenging for AI. Whereas GPT-like methods work on digital texts, historical manuscripts are only available as extremely diverse collections of (pixel) images. Despite their strong results, current deep learning (DL) methods are very data-greedy and time-consuming, depend heavily on human experts from the humanities for labeling, and require machine-learning experts for designing the models. Ideally, the use of deep learning methods should require minimal human effort, have an algorithm observe the evolution of the training process, and avoid inefficient use of the already sparse amount of labeled data. We present several approaches towards dealing with these problems, aiming to improve the robustness of current methods and to improve the autonomy in training. We applied our novel word and line text recognition approaches on nine data sets differing in time period, language, and difficulty: three locally collected historical Latin-based data sets from Naturalis, Leiden; four public Latin-based benchmark data sets for comparability with other approaches; and two Arabic data sets. Using ensemble voting of just five neural networks, a level of accuracy was achieved which required hundreds of neural networks in earlier studies. Moreover, we increased the speed of evaluation of each training epoch without the need of labeled data
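
    The abstract mentions ensemble voting over five networks without describing the voting rule. As a minimal sketch, assuming the simplest scheme (plurality voting over the word hypotheses returned by each network, with ties resolved in favour of the earliest network in the list), the combination step could look like this. The function name and the tie-breaking rule are illustrative assumptions, not taken from the thesis.

```python
from collections import Counter


def plurality_vote(hypotheses):
    """Return the word predicted by the largest number of recognizers.

    `hypotheses` holds one predicted word per network. Assumption: ties are
    broken by input order, so the earliest network among the winners decides.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    counts = Counter(hypotheses)
    best = max(counts.values())
    for word in hypotheses:
        if counts[word] == best:
            return word
```

    For example, `plurality_vote(["kat", "cat", "cat", "cot", "cat"])` returns `"cat"`, since three of the five hypothetical networks agree on it.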

    Recognition of handwritten Arabic characters

    The subject of handwritten character recognition has been receiving considerable attention in recent years due to the increased dependence on computers. Several methods for recognizing Latin, Chinese as well as Kanji characters have been proposed. However, work on recognition of Arabic characters has been relatively sparse. Techniques developed for recognizing characters in other languages cannot be used for Arabic since the nature of Arabic characters is different. The shape of a character is a function of its location within a word, and each character can have two to four different forms. Most of the techniques proposed to date for recognizing Arabic characters have relied on structural and topographic approaches. This thesis introduces a decision-theoretic approach to solve the problem. The proposed method involves, as a first step, digitization of the segmented character. The secondary parts of the character (dots and zigzags) are then isolated and identified separately, thereby reducing the recognition issue to a problem of 20 classes or fewer for each of the character forms. The moments of the horizontal and vertical projections of the remaining primary characters are calculated and normalized with respect to the zero-order moment. Simple measures of shape are obtained from the normalized moments and incorporated into a feature vector. Classification is accomplished using quadratic discriminant functions. The approach was evaluated using isolated, handwritten characters from a database established for this purpose. The classification rates varied from 97.5% to 100% depending on the form of the characters. These results indicate that the technique offers significantly better classification rates in comparison with existing methods
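
    The moment computation described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's exact procedure: the horizontal projection is taken to be the ink count per row and the vertical projection the ink count per column, positions are indexed from 0, and the function name `projection_moments` and the parameter `max_order` are hypothetical. The shape measures derived from these moments and the quadratic discriminant classifier are not shown.

```python
import numpy as np


def projection_moments(char_img, max_order=3):
    """Raw moments of the two projection profiles, divided by the zero-order moment.

    char_img: 2-D binary array, 1 = ink. Returns a dict with the moments of
    order 1..max_order for each profile. Assumption: "horizontal" means the
    per-row ink count and "vertical" the per-column ink count.
    """
    img = np.asarray(char_img, dtype=float)
    feats = {}
    for name, axis in (("horizontal", 1), ("vertical", 0)):
        proj = img.sum(axis=axis)
        m0 = proj.sum()  # zero-order moment: total ink in the character
        if m0 == 0:
            raise ValueError("empty character image")
        pos = np.arange(proj.size)
        feats[name] = [float((pos ** k * proj).sum() / m0)
                       for k in range(1, max_order + 1)]
    return feats
```

    Dividing by the zero-order moment makes the values independent of how much ink the character contains, so the first normalized moment is simply the mean ink position along that axis.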

    Bernoulli HMMs for Handwritten Text Recognition

    In recent years, Hidden Markov Models (HMMs) have received significant attention in the task of off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR), HMMs are used to model the probability of an observation sequence, given its corresponding text transcription. However, in contrast to what happens in ASR, in HTR there is no standard set of local features being used by most of the proposed systems. In this thesis we propose the use of raw binary pixels as features, in conjunction with models that deal more directly with the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli (mixture) probability functions. The objective is twofold: on the one hand, this allows us to better model the binary nature of text images (foreground/background) using BHMMs. On the other hand, this guarantees that no discriminative information is filtered out during feature extraction (most available HTR datasets can be easily binarized without a relevant loss of information). In this thesis, all the HMM theory required to develop an HMM-based HTR toolkit is reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple classifier based on BHMMs with Bernoulli probability functions at the states, and we end with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the binary features, we propose a simple binary feature extraction process without significant loss of information. All input images are scaled and binarized, in order to easily reinterpret them as sequences of binary feature vectors. Two extensions are proposed to this basic feature extraction method: the use of a sliding window in order to better capture the context, and a repositioning method in order to better deal with vertical distortions.
Competitive results were obtained when BHMMs and the proposed methods were applied to well-known HTR databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition organized during the 12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text organized during the 11th International Conference on Document Analysis and Recognition (ICDAR 2011). In the last part of this thesis we propose a method for training BHMM classifiers using discriminative training criteria, instead of the conventional Maximum Likelihood Estimation (MLE). Specifically, we propose a log-linear classifier for binary data based on the BHMM classifier. Parameter estimation of this model can be carried out using discriminative training criteria for log-linear models. In particular, we show the formulae for several MMI-based criteria. Finally, we prove the equivalence between both classifiers; hence, discriminative training of a BHMM classifier can be carried out by obtaining its equivalent log-linear classifier. Reported results show that discriminative BHMMs clearly outperform conventional generative BHMMs.
    Giménez Pastor, A. (2014). Bernoulli HMMs for Handwritten Text Recognition [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37978
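
    To make the idea of a Bernoulli mixture at an HMM state concrete, the sketch below evaluates the log emission probability of one binary feature vector under the standard Bernoulli mixture density, b(o) = sum_k priors[k] * prod_d protos[k,d]^o[d] * (1 - protos[k,d])^(1 - o[d]). It is a minimal illustration, not code from the thesis: the function and parameter names are hypothetical, the mixture priors are assumed to be strictly positive, and the clipping constant `eps` is an assumed smoothing choice to avoid log(0).

```python
import numpy as np


def bernoulli_mixture_log_prob(o, priors, protos, eps=1e-6):
    """Log emission probability of a binary vector under a Bernoulli mixture.

    o:      binary vector of length D
    priors: K mixture weights (assumed positive, summing to 1)
    protos: K x D matrix; protos[k, d] is the probability that pixel d is ink
            under component k
    eps:    assumed clipping constant that keeps the logs finite
    """
    o = np.asarray(o, dtype=float)
    p = np.clip(np.asarray(protos, dtype=float), eps, 1.0 - eps)
    # Per-component log-likelihood: sum_d o_d log p_kd + (1 - o_d) log(1 - p_kd)
    log_comp = o @ np.log(p).T + (1.0 - o) @ np.log1p(-p).T
    log_terms = np.log(np.asarray(priors, dtype=float)) + log_comp
    # Log-sum-exp over components for numerical stability
    m = log_terms.max()
    return float(m + np.log(np.exp(log_terms - m).sum()))
```

    Working in the log domain matters here because a window of a few hundred pixels produces products of probabilities far too small to represent directly.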

    Character Recognition

    Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field

    Template Based Recognition of On-Line Handwriting

    Software for recognition of handwriting has been available for several decades now, and research on the subject has produced several different strategies for achieving competitive recognition accuracies, especially in the case of isolated single characters. The problem of recognizing samples of handwriting with arbitrary connections between constituent characters (unconstrained handwriting) adds considerable complexity in the form of the segmentation problem. In other words, a recognition system that is not constrained to the isolated single character case needs to be able to recognize where in the sample one letter ends and another begins. In the research community, and probably also in commercial systems, the most common technique for recognizing unconstrained handwriting combines neural networks for partial character matching with hidden Markov modeling for combining partial results into string hypotheses. Neural networks are often favored by the research community since the recognition functions are more or less automatically inferred from a training set of handwritten samples. From a commercial perspective, a downside to this property is the lack of control, since there is no explicit information on the types of samples that can be correctly recognized by the system. In a template based system, each style of writing a particular character is explicitly modeled, which provides some intuition regarding the types of errors (confusions) that the system is prone to make. Most template based recognition methods today only work for the isolated single character recognition problem, and extensions to unconstrained recognition are usually not straightforward. This thesis presents a step-by-step recipe for producing a template based recognition system which extends naturally to unconstrained handwriting recognition through simple graph techniques.
A system based on this construction has been implemented and tested for the difficult case of unconstrained online Arabic handwriting recognition with good results

    Recognition of Cursive Arabic Handwritten Text using Embedded Training based on HMMs

    In this paper we present a system for offline recognition of cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical, without explicit segmentation, and uses embedded training to train and enhance the character models. The extracted features, preceded by baseline estimation, are statistical and geometric, so as to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition