Field typing for improved recognition on heterogeneous handwritten forms
Offline handwriting recognition has undergone continuous progress over the
past decades. However, existing methods are typically benchmarked on free-form
text datasets that are biased towards good-quality images and handwriting
styles, and homogeneous content. In this paper, we show that state-of-the-art
algorithms, employing long short-term memory (LSTM) layers, do not readily
generalize to real-world structured documents, such as forms, due to their
highly heterogeneous and out-of-vocabulary content, and to the inherent
ambiguities of this content. To address this, we propose to leverage the
content type within an LSTM-based architecture. Furthermore, we introduce a
procedure to generate synthetic data to train this architecture without
requiring expensive manual annotations. We demonstrate the effectiveness of our
approach at transcribing text on a challenging, real-world dataset of European
Accident Statements.
Multimodal Interactive Transcription of Handwritten Text Images
This thesis presents a new interactive, multimodal framework for transcribing
handwritten documents. Rather than delivering a complete transcription, this approach
aims to assist the expert in the demanding task of transcribing.
To date, the available handwritten text recognition systems do not produce
transcriptions that users find acceptable, and human intervention is generally needed
to correct the output. These systems have proven genuinely useful in
restricted applications with limited vocabularies (such as recognizing
postal addresses or numerical amounts on bank cheques),
where they achieve acceptable results. However, for
unconstrained handwritten documents (such as old manuscripts
or spontaneous text), current technology yields only unacceptable results.
The interactive scenario studied in this thesis allows a more effective solution. In this
scenario, the recognition system and the user cooperate to generate the final transcription
of the text image. The system uses the text image and a previously validated part of the
transcription (the prefix) to propose a possible continuation. The user then
finds and corrects the next error made by the system, thereby producing a new, longer
prefix. The system uses this new prefix to suggest a new hypothesis. The
underlying technology is based on hidden Markov models and n-grams. These models are
used here in the same way as in automatic speech recognition. Some
modifications to the conventional definition of n-grams were needed to take
the user's feedback into account in this system.
Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541
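The prefix-based interaction described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function `interactive_transcribe`, the `propose` callback, and the toy recognizer are hypothetical names introduced here, the user is simulated by comparing against a known reference, and the hidden Markov and n-gram models are replaced by a fixed noisy output.

```python
from typing import Callable, List, Tuple


def interactive_transcribe(
    reference: List[str],
    propose: Callable[[List[str]], List[str]],
) -> Tuple[List[str], int]:
    """Simulate the prefix-based loop between recognizer and user.

    reference: the correct transcription (stands in for the human user).
    propose:   hypothetical recognizer; given the validated prefix, it
               returns a proposed continuation (list of words).
    Returns the final transcription and the number of user corrections.
    """
    prefix: List[str] = []
    corrections = 0
    while True:
        # The system extends the validated prefix with its continuation.
        hypothesis = prefix + propose(prefix)
        if hypothesis == reference:
            return hypothesis, corrections
        # The user looks for the first error after the validated prefix.
        i = len(prefix)
        while i < min(len(reference), len(hypothesis)) and hypothesis[i] == reference[i]:
            i += 1
        corrections += 1
        if i >= len(reference):
            # Only surplus words remain; the user deletes them and stops.
            return list(reference), corrections
        # Correcting word i yields a new, longer validated prefix.
        prefix = reference[: i + 1]


# Toy recognizer (an assumption): always outputs the same noisy word sequence.
reference = "the quick brown fox jumps".split()
noisy = "the quack brown fax jumps".split()
final, n_corrections = interactive_transcribe(reference, lambda p: noisy[len(p):])
print(final, n_corrections)
```

In this toy setting each correction fixes exactly one word; a real system would re-decode the remaining image conditioned on the prefix, so one correction could also repair later words.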
Integrating passive ubiquitous surfaces into human-computer interaction
Mobile technologies enable people to interact with computers ubiquitously. This dissertation investigates how ordinary, ubiquitous surfaces can be integrated into human-computer interaction to extend the interaction space beyond the edge of the display. It turns out that acoustic and tactile features generated during an interaction can be combined to identify input events, the user, and the surface. In addition, it is shown that a heterogeneous distribution of different surfaces is particularly suitable for realizing versatile interaction modalities. However, privacy concerns must be considered when selecting sensors, and context can be crucial in determining whether and what interaction to perform.
The power of writing hands: logical memory performance after handwriting and typing tasks with Wechsler Memory Scale revised edition
Information and communications technologies have generated a multilevel metamorphosis not only of the educational field, but also of the usage of hands. The shift from handwriting to typing is changing the ways people learn to recognize and recollect letters and words, and to read and write.
This study investigates how different writing methods affect memory retrieval. The aim is to understand how memory performance compares after handwriting and typing tasks, and how time or age affects recollection. The Wechsler Memory Scale Revised Edition (WMS-R) was used in an experimental within-subjects design to measure the memory functions of 31 University of Lapland students in 2016. Participants wrote down a dictated story with a pencil, a computer keyboard, and a touchscreen keyboard. Subsequently, the degree of recollection after each writing task was measured and analysed with repeated measures analysis of variance.
Additionally, this thesis discusses embodied cognition theory, as learning and memorizing are not simply information processing in a vacuum. Experiences, actions and senses all play a part in learning, as well as in the writing process, through the harmonious co-operation of brain, mind and body.
The results of this study indicate that writing modality has a statistically significant effect on recollection, with handwriting receiving the highest scores. These results are of interest due to the constant increase in the digitalization of learning environments. Moreover, these results can be reflected upon when evaluating the impending changes in the Finnish curriculum, from which cursive handwriting is removed in autumn 2016.
NEW APPROACH FOR ONLINE ARABIC MANUSCRIPT RECOGNITION BY DEEP BELIEF NETWORK
In this paper, we present a neural approach to unconstrained Arabic manuscript recognition using the online writing signal rather than images. First, we build a database containing 2800 characters and 4800 words collected from 20 different writers. Thereafter, we perform the preprocessing, feature extraction and classification phases, respectively. Classical neural network methods were beneficial for character recognition, but showed limitations in the recognition rate for Arabic words. To remedy this, we used deep learning in the form of a Deep Belief Network (DBN), which resulted in a 97.08% recognition rate for Arabic words.
Using Technology Enabled Qualitative Research to Develop Products for the Social Good, An Overview
This paper discusses the potential benefits of the convergence of three recent trends for the design of socially beneficial products and services: the increasing application of qualitative research techniques in a wide range of disciplines, the rapid mainstreaming of social media and mobile technologies, and the emergence of software as a service. It presents a scenario that facilitates the complex data collection, analysis, storage, and reporting required for the qualitative research recommended for designing relevant solutions that address the needs of the underserved. A pilot study is used as a basis for describing the infrastructure and services required to realize this scenario. Implications for the innovation of enhanced forms of qualitative research are presented.
Biometric Systems
Biometric authentication has been widely used for access control and security systems over the past few years. The purpose of this book is to provide readers with the life cycle of different biometric authentication systems, from their design and development to qualification and final application. The major systems discussed in this book include fingerprint identification, face recognition, iris segmentation and classification, signature verification, and other miscellaneous systems, which cover management policies of biometrics, reliability measures, pressure-based typing and signature verification, bio-chemical systems and behavioral characteristics. In summary, this book provides students and researchers with different approaches to developing biometric authentication systems and at the same time includes state-of-the-art approaches to their design and development. The approaches have been thoroughly tested on standard databases and in real-world applications.
The state of MIIND
MIIND (Multiple Interacting Instantiations of Neural Dynamics) is a highly modular multi-level C++ framework that aims to shorten the development time for models in Cognitive Neuroscience (CNS). It offers reusable code modules (libraries of classes and functions) aimed at solving problems that occur repeatedly in modelling, but tries not to impose a specific modelling philosophy or methodology. At the lowest level, it offers support for the implementation of sparse networks. For example, the library SparseImplementationLib supports sparse random networks and the library LayerMappingLib can be used for sparse regular networks of filter-like operators. The library DynamicLib, which builds on top of SparseImplementationLib, offers a generic framework for simulating network processes. Presently, several specific network process implementations are provided in MIIND: the Wilson–Cowan and Ornstein–Uhlenbeck types, and population density techniques for leaky-integrate-and-fire neurons driven by Poisson input. One design principle of MIIND is to support detailing: the refinement of an originally simple model into a form that includes more biological detail. Another design principle is extensibility: the reuse of an existing model in a larger, more extended one. One of the main uses of MIIND so far has been the instantiation of neural models of visual attention. Recently, we have added a library for implementing biologically inspired models of artificial vision, such as HMAX and recent successors. In the long run we hope to be able to apply suitably adapted neuronal mechanisms of attention to these artificial models.
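To make the Wilson–Cowan process type mentioned above concrete, here is a minimal sketch of a two-population (excitatory/inhibitory) rate model integrated with the forward Euler method. It is written in plain Python and does not use or mirror the MIIND C++ API; the function name `wilson_cowan`, the coupling weights, and the external inputs `p` and `q` are illustrative assumptions, not values from the abstract.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Logistic response function mapping net input to a rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def wilson_cowan(
    steps: int,
    dt: float = 0.1,
    tau: float = 1.0,
    e0: float = 0.1,
    i0: float = 0.1,
    w_ee: float = 10.0,  # excitatory -> excitatory weight (assumed value)
    w_ei: float = 8.0,   # inhibitory -> excitatory weight (assumed value)
    w_ie: float = 10.0,  # excitatory -> inhibitory weight (assumed value)
    w_ii: float = 2.0,   # inhibitory -> inhibitory weight (assumed value)
    p: float = 1.0,      # external input to the excitatory population
    q: float = 0.0,      # external input to the inhibitory population
) -> List[Tuple[float, float]]:
    """Integrate tau * dE/dt = -E + f(w_ee*E - w_ei*I + p) and the
    analogous equation for I with forward Euler; return the (E, I) trace."""
    e, i = e0, i0
    trace = [(e, i)]
    a = dt / tau
    for _ in range(steps):
        de = -e + sigmoid(w_ee * e - w_ei * i + p)
        di = -i + sigmoid(w_ie * e - w_ii * i + q)
        e, i = e + a * de, i + a * di
        trace.append((e, i))
    return trace


trace = wilson_cowan(steps=200)
print(trace[-1])
```

With `dt / tau` below 1, each update is a weighted average of the current rate and a sigmoid output, so both rates stay within [0, 1] when started there.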