1,662 research outputs found

    Field typing for improved recognition on heterogeneous handwritten forms

    Full text link
    Offline handwriting recognition has undergone continuous progress over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting styles, and homogeneous content. In this paper, we show that state-of-the-art algorithms, employing long short-term memory (LSTM) layers, do not readily generalize to real-world structured documents, such as forms, due to their highly heterogeneous and out-of-vocabulary content, and to the inherent ambiguities of this content. To address this, we propose to leverage the content type within an LSTM-based architecture. Furthermore, we introduce a procedure to generate synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements

    Multimodal Interactive Transcription of Handwritten Text Images

    Full text link
    En esta tesis se presenta un nuevo marco interactivo y multimodal para la transcripción de Documentos manuscritos. Esta aproximación, lejos de proporcionar la transcripción completa pretende asistir al experto en la dura tarea de transcribir. Hasta la fecha, los sistemas de reconocimiento de texto manuscrito disponibles no proporcionan transcripciones aceptables por los usuarios y, generalmente, se requiere la intervención del humano para corregir las transcripciones obtenidas. Estos sistemas han demostrado ser realmente útiles en aplicaciones restringidas y con vocabularios limitados (como es el caso del reconocimiento de direcciones postales o de cantidades numéricas en cheques bancarios), consiguiendo en este tipo de tareas resultados aceptables. Sin embargo, cuando se trabaja con documentos manuscritos sin ningún tipo de restricción (como documentos manuscritos antiguos o texto espontáneo), la tecnología actual solo consigue resultados inaceptables. El escenario interactivo estudiado en esta tesis permite una solución más efectiva. En este escenario, el sistema de reconocimiento y el usuario cooperan para generar la transcripción final de la imagen de texto. El sistema utiliza la imagen de texto y una parte de la transcripción previamente validada (prefijo) para proponer una posible continuación. Despues, el usuario encuentra y corrige el siguente error producido por el sistema, generando así un nuevo prefijo mas largo. Este nuevo prefijo, es utilizado por el sistema para sugerir una nueva hipótesis. La tecnología utilizada se basa en modelos ocultos de Markov y n-gramas. Estos modelos son utilizados aquí de la misma manera que en el reconocimiento automático del habla. Algunas modificaciones en la definición convencional de los n-gramas han sido necesarias para tener en cuenta la retroalimentación del usuario en este sistema.Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541Palanci

    Integrating passive ubiquitous surfaces into human-computer interaction

    Get PDF
    Mobile technologies enable people to interact with computers ubiquitously. This dissertation investigates how ordinary, ubiquitous surfaces can be integrated into human-computer interaction to extend the interaction space beyond the edge of the display. It turns out that acoustic and tactile features generated during an interaction can be combined to identify input events, the user, and the surface. In addition, it is shown that a heterogeneous distribution of different surfaces is particularly suitable for realizing versatile interaction modalities. However, privacy concerns must be considered when selecting sensors, and context can be crucial in determining whether and what interaction to perform.Mobile Technologien ermöglichen den Menschen eine allgegenwärtige Interaktion mit Computern. Diese Dissertation untersucht, wie gewöhnliche, allgegenwärtige Oberflächen in die Mensch-Computer-Interaktion integriert werden können, um den Interaktionsraum über den Rand des Displays hinaus zu erweitern. Es stellt sich heraus, dass akustische und taktile Merkmale, die während einer Interaktion erzeugt werden, kombiniert werden können, um Eingabeereignisse, den Benutzer und die Oberfläche zu identifizieren. Darüber hinaus wird gezeigt, dass eine heterogene Verteilung verschiedener Oberflächen besonders geeignet ist, um vielfältige Interaktionsmodalitäten zu realisieren. Bei der Auswahl der Sensoren müssen jedoch Datenschutzaspekte berücksichtigt werden, und der Kontext kann entscheidend dafür sein, ob und welche Interaktion durchgeführt werden soll

    The power of writing hands : logical memory performance after handwriting and typing tasks with Wechsler Memory Scale revised edition

    Get PDF
    Information and communications technologies have generated a multilevel metamorphose not only of the educational field, but also of the usage of hands. The shift from handwriting to typing is bringing about a change in the ways people learn to recognize and recollect letters and words, read and write. This study investigates how different writing methods affect memory retrieval. The aim is to understand how the memory performances compare after handwriting and typing tasks, and how the factor of time or age affects recollection. The Wechsler Memory Scale Revised Edition (WMS-R) was used with experimental within-subjects research design to measure memory functions of 31 University of Lapland students in 2016. Participants wrote down a dictated story with a pencil, computer keyboard, and a touch screen keyboard. Consequently, the degree of recollection of each writing task was measured and analysed with repeated measures analysis of variance. Additionally, this thesis deliberates the embodied cognition theory, as learning and memorizing are not simply information processing in nothingness. Experiences, actions and senses all play part in learning, as well as in writing process with the harmonious co-operation of brain, mind and body. The results of this study indicate that writing modalities have statistically significant effect on recollection, handwriting receiving the highest scores. These results are of interest due to the constant increase of digitalization of learning environments. Moreover, these results can be reflected upon when evaluating the impending changes in the Finnish curriculum, from which cursive handwriting is removed in autumn 2016

    NEW APPROACH FOR ONLINE ARABIC MANUSCRIPT RECOGNITION BY DEEP BELIEF NETWORK

    Get PDF
    In this paper, we present a neural approach for an unconstrained Arabic manuscript recognition using the online writing signal rather than images. First, we build the database which contains 2800 characters and 4800 words collected from 20 different handwritings. Thereafter, we will perform the pretreatment, feature extraction and classification phases, respectively. The use of a classical neural network methods has been beneficial for the character recognition, but revealed some limitations for the recognition rate of Arabic words. To remedy this, we used a deep learning through the Deep Belief Network (DBN) that resulted in a 97.08% success rate of recognition for Arabic words

    Using Technology Enabled Qualitative Research to Develop Products for the Social Good, An Overview

    Get PDF
    This paper discusses the potential benefits of the convergence of three recent trends for the design of socially beneficial products and services: the increasing application of qualitative research techniques in a wide range of disciplines, the rapid mainstreaming of social media and mobile technologies, and the emergence of software as a service. Presented is a scenario facilitating the complex data collection, analysis, storage, and reporting required for the qualitative research recommended for the task of designing relevant solutions to address needs of the underserved. A pilot study is used as a basis for describing the infrastructure and services required to realize this scenario. Implications for innovation of enhanced forms of qualitative research are presented

    Biometric Systems

    Get PDF
    Biometric authentication has been widely used for access control and security systems over the past few years. The purpose of this book is to provide the readers with life cycle of different biometric authentication systems from their design and development to qualification and final application. The major systems discussed in this book include fingerprint identification, face recognition, iris segmentation and classification, signature verification and other miscellaneous systems which describe management policies of biometrics, reliability measures, pressure based typing and signature verification, bio-chemical systems and behavioral characteristics. In summary, this book provides the students and the researchers with different approaches to develop biometric authentication systems and at the same time includes state-of-the-art approaches in their design and development. The approaches have been thoroughly tested on standard databases and in real world applications

    The state of MIIND

    Get PDF
    MIIND (Multiple Interacting Instantiations of Neural Dynamics) is a highly modular multi-level C++ framework, that aims to shorten the development time for models in Cognitive Neuroscience (CNS). It offers reusable code modules (libraries of classes and functions) aimed at solving problems that occur repeatedly in modelling, but tries not to impose a specific modelling philosophy or methodology. At the lowest level, it offers support for the implementation of sparse networks. For example, the library SparseImplementationLib supports sparse random networks and the library LayerMappingLib can be used for sparse regular networks of filter-like operators. The library DynamicLib, which builds on top of the library SparseImplementationLib, offers a generic framework for simulating network processes. Presently, several specific network process implementations are provided in MIIND: the Wilson–Cowan and Ornstein–Uhlenbeck type, and population density techniques for leaky-integrate-and-fire neurons driven by Poisson input. A design principle of MIIND is to support detailing: the refinement of an originally simple model into a form where more biological detail is included. Another design principle is extensibility: the reuse of an existing model in a larger, more extended one. One of the main uses of MIIND so far has been the instantiation of neural models of visual attention. Recently, we have added a library for implementing biologically-inspired models of artificial vision, such as HMAX and recent successors. In the long run we hope to be able to apply suitably adapted neuronal mechanisms of attention to these artificial models
    corecore