
    Convertisseur d'équations LATEX2Ink

    In this paper we present a tool that generates online handwritten mathematical formulas from a LATEX string. The generator makes it easy to build, from a reference corpus of mathematical expressions, a database that is automatically annotated at the symbol level. Starting from a database of isolated symbols, we can thus produce any mathematical formula pseudo-synthetically through stochastic 2D placement and sizing of these elements. We show the value of this tool within a project aimed at designing a method suited to the recognition and interpretation of online mathematical expressions.

    Document image segmentation using a multiresolution approach. Accurate text line extraction

    This paper describes an overall scheme and related algorithms for accurately extracting text lines from a document image. The documents concerned are very complex, with totally unconstrained content; postal objects, especially the so-called flat objects (large envelopes, magazines, etc.), belong to this class. The method is organized in three main phases. First, areas of interest are located directly on the gray-level image using a multiresolution approach that is robust to the large variability of text features: size, layout, presentation. Next, a binarization stage, which takes advantage of the localization results and is performed separately for each area of interest, extracts the text lines. Finally, a post-segmentation step that combines the initial localization with structural features extracted from the lines handles severe cases such as overlapping lines in handwritten text. Examples of text line extraction from the address block of postal objects (large letters, magazines) illustrate the paper.

    Offline Cursive Handwriting Recognition System based on Hybrid Markov Model and Neural Networks

    This paper describes an offline cursive handwriting recognition system based on a hybrid of Neural Networks (NN) and Hidden Markov Models (HMM). Applying the SegRec principle, the recognizer does not make a hard decision during character segmentation. Instead, it delays character segmentation to the recognition stage by generating a segmentation graph that describes all possible ways to segment a word into letters. To recognize a word, the NN computes the observation probabilities for each segmentation candidate (SC) in the segmentation graph. Then, using concatenated letter HMMs, a likelihood is computed for each word in the lexicon by multiplying the probabilities over the best paths through the graph. We present in detail two approaches to training the word recognizer: 1) character-level training and 2) word-level training. The recognition performances of the two systems are discussed.
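    The best-path scoring over a segmentation graph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each letter is explained by exactly one segmentation candidate (a one-state letter model), and the names `word_log_likelihood`, `recognize`, and the toy `edges` table of NN output probabilities are hypothetical.

```python
import math

NEG_INF = float("-inf")

def word_log_likelihood(word, n_nodes, edges):
    """Best-path log score of `word` through a segmentation graph.

    edges maps (start_node, end_node) -> {letter: probability}; each edge is
    one segmentation candidate with assumed NN output probabilities.
    """
    # best[k][v]: best log score for the first k letters, ending at node v
    best = [[NEG_INF] * n_nodes for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, letter in enumerate(word):
        for (u, v), probs in edges.items():
            p = probs.get(letter, 0.0)
            if best[k][u] == NEG_INF or p <= 0.0:
                continue
            # Multiplying probabilities becomes adding logs
            best[k + 1][v] = max(best[k + 1][v], best[k][u] + math.log(p))
    return best[len(word)][n_nodes - 1]

def recognize(lexicon, n_nodes, edges):
    """Return the lexicon word with the highest best-path score."""
    return max(lexicon, key=lambda w: word_log_likelihood(w, n_nodes, edges))

# Toy graph with three cut points (nodes 0, 1, 2); values are made up.
edges = {
    (0, 1): {"a": 0.9, "o": 0.1},
    (1, 2): {"b": 0.8, "d": 0.2},
    (0, 2): {"d": 0.5, "a": 0.1},
}
lexicon = ["ab", "d", "ad"]
best_word = recognize(lexicon, 3, edges)
```

    Working in log space avoids numerical underflow when many probabilities are multiplied; a multi-state letter HMM would add transition terms to the same recursion.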

    Apprentissages discriminants en reconnaissance de mots cursifs en-ligne

    This paper describes several training schemes for hybrid systems built on a neuro-Markovian scheme (multi-state TDNN-HMM), applied to the recognition of cursive words captured online. We considered several cost functions, including both maximum mutual information (MMI) criteria with discriminative training and maximum likelihood estimation, to train the system globally at the word level. We analyzed the impact of the Markov modeling by varying the number of states of a letter hidden Markov model (letter HMM) from one to three. Several experiments on these criteria and models were carried out on the IRONOFF database in an unconstrained, writer-independent word recognition setting, and are reported in this paper.
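    For reference, the two families of criteria are commonly written as follows. These are the textbook forms, not necessarily the exact cost functions of the paper; the symbols are assumptions: $O_n$ is the observation sequence of training word $n$, $W_n$ its true label, $\theta$ the model parameters, and the sum over $W$ runs over the lexicon.

```latex
% Maximum likelihood: raise the score of the correct word only
\mathcal{L}_{\mathrm{ML}}(\theta) = \sum_{n} \log p_\theta(O_n \mid W_n)

% Maximum mutual information: raise the correct word relative to all competitors
\mathcal{L}_{\mathrm{MMI}}(\theta) = \sum_{n} \log
  \frac{p_\theta(O_n \mid W_n)\, P(W_n)}{\sum_{W} p_\theta(O_n \mid W)\, P(W)}
```

    The denominator is what makes the MMI criterion discriminative: increasing the likelihood of competing words lowers the objective.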

    From Off-line to On-line Handwriting Recognition

    On-line handwriting includes more information on time order of the writing signal and on the dynamics of the writing process than off-line handwriting. Therefore, on-line recognition systems achieve higher recognition rates. This can be concluded from results reported in the literature, and has been demonstrated empirically as part of this work. We propose a new approach for recovering the time order of the off-line handwriting signal. Starting from an over-segmentation of the off-line handwriting into regular and singular parts, the time ordering of these parts and recognition of the word are performed simultaneously. This approach, termed "OrdRec", is based on a graph description of the handwriting signal and a recognition process using Hidden Markov Models (HMM). A complete omni-scriptor isolated word recognition system has been developed. Using a dynamic lexicon and models for upper and lower case characters, our system can process binary and gray value word images of any writing style (script, cursive or mixed). Using a dual handwriting database which features both the on-line and the off-line signal for each of the 30000 words written by about 700 scriptors, we have shown experimentally that such an off-line recognition system, using the recovered time order information, can achieve recognition performances close to those of an on-line recognition system.

    Lexicon-based word recognition using support vector machine and hidden markov model

    Hybrids of Neural Networks (NN) and Hidden Markov Models (HMM) have been popular in word recognition, taking advantage of the NN's discriminative property and the HMM's representational capability. However, an NN does not guarantee good generalization because of the Empirical Risk Minimization (ERM) principle that it uses. In our work, we focus on online word recognition using the support vector machine (SVM) for character recognition. The SVM's use of the structural risk minimization (SRM) principle allows simultaneous optimization of the representational and discriminative capabilities of the character recognizer. We evaluated the SVM in an isolated character recognition setting using the IRONOFF and UNIPEN character databases. We then demonstrate the practical issues in using an SVM within a hybrid setting with an HMM for word recognition by testing the hybrid system on the IRONOFF word database, obtaining commendable results.