
    Bayesian Action–Perception Computational Model: Interaction of Production and Recognition of Cursive Letters

    In this paper, we study the collaboration of perception and action representations involved in cursive letter recognition and production. We propose a mathematical formulation for the whole perception–action loop, based on probabilistic modeling and Bayesian inference, which we call the Bayesian Action–Perception (BAP) model. Because it models both perception and action processes, it can be used to study how these processes interact. More precisely, the model includes a feedback loop from motor production, which implements an internal simulation of movement. Motor knowledge can therefore be involved during perception tasks. In this paper, we formally define the BAP model and show how it solves the following six varied cognitive tasks using Bayesian inference: i) letter recognition (purely sensory), ii) writer recognition, iii) letter production (with different effectors), iv) copying of trajectories, v) copying of letters, and vi) letter recognition (with internal simulation of movements). We present computer simulations of each of these cognitive tasks, and discuss experimental predictions and theoretical developments.

    Self-supervised adaptation for on-line script text recognition

    We have recently developed in our lab a text recognizer for on-line texts written on a touch terminal. In this paper, we present several strategies to adapt this recognizer to a given writer in a self-supervised way and compare them to the supervised adaptation scheme. The baseline system is based on the activation-verification cognitive model. We designed this recognizer to be writer-independent, but it may be adapted to be writer-dependent in order to increase the recognition speed and rate. The classification expert can be iteratively modified in order to learn the particularities of a writer. The best self-supervised adaptation strategy, called prototype dynamic management, gets good results, close to those of the supervised methods. Combining supervised and self-supervised strategies increases accuracy further. Results, presented on a large database of 90 texts (5,400 words) written by 38 different writers, are very encouraging, with an error rate lower than 10%.

    A study on idiosyncratic handwriting with impact on writer identification

    © 2018 IEEE. In this paper, we study handwriting idiosyncrasy in terms of its structural eccentricity. Our approach is to find idiosyncratic handwritten text components and to model the idiosyncrasy analysis task as a machine learning problem supervised by human cognition. We employ the Inception network for this purpose. The experiments are performed on two publicly available databases and an in-house database of Bengali offline handwritten samples. On these samples, subjective opinion scores of handwriting idiosyncrasy are collected from handwriting experts. We analyze handwriting idiosyncrasy on this corpus, which includes the perceptive ground-truth opinions. We also investigate the effect of idiosyncratic text on writer identification using SqueezeNet. The performance of our system is promising.

    Human Reading Based Strategies for off-line Arabic Word Recognition

    This paper summarizes some techniques proposed for off-line Arabic word recognition. The point of view developed here concerns human reading, which favors an interactive mechanism between global memorization and local checking and thereby eases the recognition of complex scripts such as Arabic. From this perspective, some specific papers are analyzed and their strategies commented.

    Arabic natural language processing: handwriting recognition

    The automatic recognition of Arabic writing is a very young research discipline with very challenging and significant problems. Indeed, in the era of the Internet and multimedia, Arabic recognition can contribute, like the closely related disciplines of Latin writing recognition, speech recognition, and vision processing, to current applications around digital libraries, document security, and numerical data processing in general. Arabic is a Semitic language spoken and understood in various forms by millions of people throughout the Middle East and in Africa, and it is used by 234 million people worldwide. Furthermore, Arabic gave rise to several other alphabets, such as Farsi and Urdu, which greatly increases interest in this script. Farsi is the main language used in Iran and Afghanistan and is spoken by more than 110 million people, including some people in Tajikistan and Pakistan. Urdu is an Indo-Aryan language with about 104 million speakers. It is the national language of Pakistan and is closely related to Hindi, though much Urdu vocabulary comes from Persian and Arabic, which is not the case for Hindi. Urdu has been written with a version of the Perso-Arabic script since the 12th century and is normally written in the Nastaliq style.

    Structure Extraction in Printed Documents Using Neural Approaches

    This paper addresses the problem of layout and logical structure extraction from document images. Two classes of approaches are first studied and discussed in general terms: data-driven and model-driven. In the latter, specific approaches such as rule-based methods or formal grammars are usually studied on very stereotyped documents with reasonable results, while in the former, artificial neural networks are often considered for small patterns with good results. Our understanding of these techniques leads us to believe that a hybrid model is a more appropriate solution for structure extraction. From this standpoint, we propose a Perceptive Neural Network based approach using a static topology that possesses the characteristics of a dynamic neural network. Thanks to its transparency, it allows a better representation of the model elements and of the relationships between the logical and the physical components. Furthermore, it possesses perceptive cycles that provide some capacity for data refinement and correction. Tested on several kinds of documents, it gives better results than a static Multilayer Perceptron.

    Information fusion and adaptation for on-line text recognition

    In this paper, we present a new writer-independent system dedicated to the automatic recognition of on-line hand-printed texts. This system uses a very large French lexicon (200,000 words), which covers numerous fields of application. The recognition process is based on the activation-verification model proposed in perceptive psychology. A set of experts encodes the input signal and extracts probabilistic information at several levels of abstraction (geometrical and morphological). A neural expert generates a tree of segmentation hypotheses. It is explored by a probabilistic fusion expert that uses all the available information (geometrical, morphological and lexical) in order to provide the best transcription of the input signal. We experiment with several strategies of self-supervised writer adaptation on this system. The best one, called “dynamic self-supervised adaptation”, modifies the recognizer parameters continuously. It gets recognition results close to those of supervised methods. These results are evaluated on a database of 90 texts (5,400 words) written by 38 different writers and are very encouraging, as they reach a recognition rate of 90%.