7 research outputs found

    Consistent relaxation matching for handwritten Chinese character recognition

    Because of the structural complexity of Handwritten Chinese Characters (HCCs) and the distortions (translation, rotation, shifting, and deformation) introduced by different writing styles, a structural matching algorithm is better suited to computer recognition of HCCs. Relaxation matching is a powerful technique that can tolerate considerable distortion. However, most relaxation techniques developed so far for Handwritten Chinese Character Recognition (HCCR) are based on a probabilistic relaxation scheme. In this paper, based on the local constraints of relaxation labelling and on optimization theory, we apply a new relaxation matching technique to handwritten character recognition. From the properties of the compatibility constraints, several rules are devised to guide the design of the compatibility function, which plays an important role in the relaxation process. By using, in parallel, local contextual information about the geometric relationships among the strokes of two characters, the ambiguity between them can be relaxed iteratively to achieve an optimal consistent matching.
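
    The abstract does not give the paper's update rule, so the following is only a minimal sketch of a generic relaxation labelling iteration (the classic nonlinear update in the style of Rosenfeld, Hummel and Zucker), not the optimization-based technique the paper proposes. The function name `relax_step`, the two-stroke toy problem, and the compatibility values are all assumptions made for illustration.

```python
def relax_step(p, r):
    """One synchronous relaxation labelling update (generic textbook form).

    p[i][a]       : current confidence that input stroke i matches model stroke a
    r[i][j][a][b] : compatibility in [-1, 1] of (i -> a) with (j -> b)
    """
    n = len(p)
    new_p = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        raw = []
        for a in range(len(p[i])):
            # Average support that the neighbours' current labellings give to (i -> a).
            q = sum(r[i][j][a][b] * p[j][b]
                    for j in others
                    for b in range(len(p[j]))) / len(others)
            raw.append(p[i][a] * (1.0 + q))
        total = sum(raw)
        new_p.append([x / total for x in raw])
    return new_p


# Toy problem (hypothetical values): two input strokes, two model strokes.
# Assumed compatibility: two different input strokes should not map to the
# same model stroke (-1), and mapping to different model strokes is fine (+1).
r = [[[[-1.0 if a == b else 1.0 for b in range(2)]
       for a in range(2)]
      for j in range(2)]
     for i in range(2)]

p = [[0.6, 0.4],   # stroke 0 slightly prefers model stroke 0
     [0.5, 0.5]]   # stroke 1 is initially ambiguous

for _ in range(10):
    p = relax_step(p, r)
```

    In this toy setting the slight initial preference of stroke 0 is propagated through the compatibility values, so the ambiguity of stroke 1 is resolved over a few iterations. A real system would derive the compatibilities from geometric relationships among strokes, as the abstract describes.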

    Hand Printed Character Recognition Using Neural Networks

    This paper attempts to recognize hand-printed characters using features extracted with the proposed sector approach. In this approach, the normalized and thinned character image is divided into sectors, each covering a fixed angle. The 32 features include vector distances, angles, occupancy, and end-points. For recognition, both neural network and fuzzy logic techniques are adopted. The proposed approach is implemented and tested on a database of isolated hand-printed characters consisting of English characters, digits, and some of the special characters found on a keyboard.
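
    The abstract does not define the 32 features precisely, so the sketch below only illustrates the general idea of sector-based features: foreground pixels of a thinned character are binned into fixed-angle sectors around the centroid, and per-sector occupancy and mean distance are computed. The function name `sector_features`, the choice of the centroid as origin, and the normalization are assumptions, not the paper's definitions.

```python
import math


def sector_features(pixels, n_sectors=8):
    """Per-sector occupancy and mean normalized distance from the centroid.

    pixels    : list of (x, y) foreground coordinates of a thinned character
    n_sectors : number of equal-angle sectors around the centroid (assumed)
    """
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    width = 2.0 * math.pi / n_sectors

    counts = [0] * n_sectors
    dist_sums = [0.0] * n_sectors
    max_dist = 0.0
    for x, y in pixels:
        angle = math.atan2(y - cy, x - cx) % (2.0 * math.pi)
        # Guard against rounding that could push the index to n_sectors.
        k = min(int(angle / width), n_sectors - 1)
        d = math.hypot(x - cx, y - cy)
        counts[k] += 1
        dist_sums[k] += d
        max_dist = max(max_dist, d)

    occupancy = [c / len(pixels) for c in counts]
    mean_dist = [
        (s / c) / max_dist if c and max_dist > 0 else 0.0
        for s, c in zip(dist_sums, counts)
    ]
    return occupancy, mean_dist


# Three pixels on a diagonal, centroid at the origin, four 90-degree sectors.
occupancy, mean_dist = sector_features([(1, 1), (2, 2), (-3, -3)], n_sectors=4)
```

    The two returned lists could be concatenated into a feature vector for a classifier. Angle and end-point features, which the abstract also mentions, are left out of this sketch.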

    Improved clustering approach for junction detection of multiple edges with modified freeman chain code

    An image processing framework for two-dimensional line drawings involves three phases: detecting the junctions and corners that exist in the drawing, representing the lines, and extracting features used to recognize the line drawing based on the chosen representation scheme. As an alternative to existing frameworks, this thesis proposes a framework that consists of an improved clustering approach for junction detection of multiple edges, a modified Freeman chain code scheme, new features and their extraction, and a recognition algorithm. In the first phase, the thesis addresses the problem of clustering a line drawing for junction detection of multiple edges. The major problems in cluster analysis that must be addressed when performing junction detection are the time taken and, in particular, the number of accurate clusters found in the line drawing. Two clustering approaches are compared with the proposed algorithm: the self-organising map (SOM) and affinity propagation (AP). These approaches are chosen because, like the proposed algorithm, they are unsupervised learning methods and do not require an initial cluster count. In the second phase, a new chain code scheme is proposed for representing the direction of lines; it consists of a series of directional codes and the corner labels found in the drawing. In the third phase, feature extraction, three features are proposed: the length of lines, the angle of corners, and the number of branches at each corner. These features are then used in the proposed recognition algorithm to match the line drawing, which involves only the mean and variance in its calculation. Compared with the SOM and AP clustering approaches, the proposed algorithm reduces the cluster count by up to 31% and is 57 times faster. The results for the corner detection algorithm show that it can detect the junctions and corners of a given thinned binary image by producing a new thinned binary image containing markers at their locations.
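
    The modified chain code scheme is not specified in the abstract, so the sketch below shows only the standard 8-direction Freeman chain code that it builds on. It assumes that y increases upward and that code 0 points east with codes increasing counter-clockwise. The helper `direction_changes` is a hypothetical, much simpler stand-in for the thesis's corner labelling.

```python
# Standard 8-direction Freeman codes: 0 = east, increasing counter-clockwise.
# Assumes y grows upward; flip the sign of dy for image (row-down) coordinates.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def freeman_chain_code(path):
    """Encode a path of 8-connected pixels as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"pixels {(x0, y0)} and {(x1, y1)} are not 8-connected")
        codes.append(DIRECTIONS[step])
    return codes


def direction_changes(codes):
    """Indices where the direction code differs from the previous one.

    A naive corner cue for illustration only, not the thesis's method.
    """
    return [i for i in range(1, len(codes)) if codes[i] != codes[i - 1]]


path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 3)]
codes = freeman_chain_code(path)      # [0, 0, 2, 2, 3]
corners = direction_changes(codes)    # [2, 4]
```

    Run lengths of identical codes give a rough line length, and the difference between consecutive codes gives a turning angle in 45-degree units, which relates loosely to the length and angle features the abstract lists.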

    Adaptive Algorithms for Automated Processing of Document Images

    Large-scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation, and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles, or letters, and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content, and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement, and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance-transform-based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach lies in how it determines the best approximation to the clutter-content boundary in the presence of text-like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts, which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that combines components' separation features with Docstrum-based [O'Gorman1993] angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and recognize characters for any complex syllabic or non-syllabic script, using font models. This concept is based on the fact that font files contain all the information necessary to render text and thus provide a model for how to decompose it. Instead of script-specific routines, this work is a step towards a generic character recognition scheme for both Latin and non-Latin scripts.

    Four cornered code based Chinese character recognition system.

    by Tham Yiu-Man.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 1993.
    Includes bibliographical references.

    Abstract
    Acknowledgements
    Table of Contents
    Chapter I --- Introduction
        1.1 --- Introduction
        1.2 --- Survey on Chinese Character Recognition
        1.3 --- Methodology Adopts in Our System
        1.4 --- Contributions and Organization of the Thesis
    Chapter II --- Pre-processing and Stroke Extraction
        2.1 --- Introduction
        2.2 --- Thinning
            2.2.1 --- Introduction to Thinning
            2.2.2 --- Proposed Thinning Algorithm Cater for Stroke Extraction
            2.2.3 --- Thinning Results
        2.3 --- Stroke Extraction
            2.3.1 --- Introduction to Stroke Extraction
            2.3.2 --- Proposed Stroke Extraction Method
                2.3.2.1 --- Fork point detection
                2.3.2.2 --- 8-connected fork point merging
                2.3.2.3 --- Sub-stroke extraction
                2.3.2.4 --- Fork point merging
                2.3.2.5 --- Sub-stroke connection
            2.3.3 --- Stroke Extraction Accuracy
            2.3.4 --- Corner Detection
                2.3.4.1 --- Introduction to Corner Detection
                2.3.4.2 --- Proposed Corner Detection Formulation
        2.4 --- Concluding Remarks
    Chapter III --- Four Corner Code
        3.1 --- Introduction
        3.2 --- Deletion of Hook Strokes
        3.3 --- Stroke Types Selection
        3.4 --- Probability Formulations of Stroke Types
            3.4.1 --- Simple Strokes
            3.4.2 --- Square
            3.4.3 --- Cross
            3.4.4 --- Upper Right Corner
            3.4.5 --- Lower Left Corner
        3.5 --- Corner Segments Extraction Procedure
            3.5.1 --- Corner Segment Probability
            3.5.2 --- Corner Segment Extraction
        3.6 --- 4C Codes Generation
        3.7 --- Parameters Determination
        3.8 --- Sensitivity Test
        3.9 --- Classification Rate
        3.10 --- Feedback by Corner Segments
        3.11 --- Classification Rate with Feedback by Corner Segment
        3.12 --- Reasons for Mis-classification
        3.13 --- Suggested Solution to the Mis-interpretation of Stroke Type
        3.14 --- Reduce Size of Candidate Set by No. of Input Segments
        3.15 --- Extension to Higher Order Code
        3.16 --- Concluding Remarks
    Chapter IV --- Relaxation
        4.1 --- Introduction
            4.1.1 --- Introduction to Relaxation
            4.1.2 --- Formulation of Relaxation
            4.1.3 --- Survey on Chinese Character Recognition by using Relaxation
        4.2 --- Relaxation Formulations
            4.2.1 --- Definition of Neighbour Segments
            4.2.2 --- Formulation of Initial Probability Assignment
            4.2.3 --- Formulation of Compatibility Function
            4.2.4 --- Formulation of Support from Neighbours
            4.2.5 --- Stopping Criteria
            4.2.6 --- Distance Measures
            4.2.7 --- Parameters Determination
        4.3 --- Recognition Rate
        4.4 --- Reasons for Mis-recognition in Relaxation
        4.5 --- Introduction of No-label Class
            4.5.1 --- No-label Initial Probability
            4.5.2 --- No-label Compatibility Function
            4.5.3 --- Improvement by No-label Class
        4.6 --- Rate of Convergence
            4.6.1 --- Updating Formulae in Exponential Form
        4.7 --- Comparison with Yamamoto et al's Relaxation Method
            4.7.1 --- Formulations in Yamamoto et al's Relaxation Method
            4.7.2 --- Modifications in [YAMAM82]
            4.7.3 --- Performance Comparison with [YAMAM82]
        4.8 --- System Overall Recognition Rate
        4.9 --- Concluding Remarks
    Chapter V --- Concluding Remarks
        5.1 --- Recapitulation and Conclusions
        5.2 --- Limitations in the System
        5.3 --- Suggestions for Further Developments
    References
    Appendix --- User's Guide
        A.1 --- System Functions
        A.2 --- Platform and Compiler
        A.3 --- File List
        A.4 --- Directory
        A.5 --- Description of Sub-routines
        A.6 --- Data Structures and Header Files
        A.7 --- Character File charfile Structure
        A.8 --- Suggested Program to Implement the System

    Reconnaissance de l’écriture manuscrite avec des réseaux récurrents

    Mass digitization of paper documents requires highly efficient optical character recognition systems. Digital versions of paper documents enable the use of search engines through keyword detection or the extraction of high-level information (e.g. titles, authors, dates, addresses). Unfortunately, writing recognition systems, and handwriting recognition systems in particular, are still far from matching human performance on the most difficult documents, which restricts or hampers some applications. This industrial PhD (CIFRE) between Airbus DS and the LITIS, carried out within the time frame of the MAURDOR project, aims to identify and improve the state-of-the-art systems for handwriting recognition. We compare different systems for handwriting recognition. Our comparisons include various feature sets as well as various dynamic classifiers: i) Hidden Markov Models (HMM), ii) a hybrid neural network/HMM, iii) a hybrid recurrent network Bidirectional Long Short Term Memory - Connectionist Temporal Classification (BLSTM-CTC)/HMM, and iv) a hybrid Conditional Random Fields (CRF)/HMM. We compare these systems within the framework of the WR2 task of the ICDAR 2009 competition, namely an isolated-word recognition task using a 1600-word lexicon. Our results rank the BLSTM-CTC/HMM system as the best performer and clearly show that BLSTM-CTCs trained on different features are complementary. Our second contribution aims to exploit this complementarity. We explore various combination strategies that operate at different levels of the BLSTM-CTC architecture: low level (early fusion, at the input), mid level (within the network), and high level (late integration, at the output). Here again we measure performance on the WR2 task of the ICDAR 2009 competition. Overall, our combination strategies improve on the single-feature systems, and our best combination results are close to those of the state-of-the-art system on the same task. We have also observed that some of our combinations are better suited to systems that use a lexicon to correct mistakes, while others are better suited to systems with no lexicon. Our third contribution focuses on two tasks related to handwriting recognition. We present two systems, one designed for language recognition and the other for keyword detection, from either a text query or an image query. For these two tasks our systems stand out from the literature because they use a handwriting recognition step. Indeed, most systems in the literature focus on extracting image features for classification or comparison, which is not necessarily the most appropriate approach for these tasks. Our systems use a handwriting recognition step followed by either a language detection step or a word detection step, depending on the application.