    Recognition of pen-based music notation with finite-state machines

    This work presents a statistical model for recognizing pen-based music compositions using stroke recognition algorithms and finite-state machines. The series of strokes received as input is mapped onto a stochastic representation, which is combined with a formal language that describes musical symbols in terms of stroke primitives. This yields a Probabilistic Finite-State Automaton that defines probabilities over the set of musical sequences. The model is then intersected with a semantic language to rule out sequences that do not make musical sense. Finally, a decoding strategy is applied to output a hypothesis about the musical sequence actually written. Several decoding algorithms, stroke similarity measures and probability density estimators are evaluated under different metrics of interest. The results support the proposed model, which achieves competitive performance in all metrics and scenarios considered. This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012–0939) and the Spanish Ministerio de Economía y Competitividad through the TIMuL Project (No. TIN2013-48152-C2-1-R, supported by UE FEDER funds).

    A Knowledge based segmentation algorithm for enhanced recognition of handwritten courtesy amounts

    "March 1994." Includes bibliographical references (p. [23]-[24]). Supported by the Productivity From Information Technology (PROFIT) Research Initiative at MIT. Karim Hussein ... [et al.]

    CHARACTER-LEVEL INTERACTIONS IN MULTIMODAL COMPUTER-ASSISTED TRANSCRIPTION OF TEXT IMAGES

    HTR systems do not achieve acceptable results in unconstrained applications. It is therefore useful to have a system that lets the user cooperate with it in the most comfortable way to produce a correct transcription. This paper studies multimodal interaction at the character level. Martín-Albo Simón, D. (2011). CHARACTER-LEVEL INTERACTIONS IN MULTIMODAL COMPUTER-ASSISTED TRANSCRIPTION OF TEXT IMAGES. http://hdl.handle.net/10251/11313

    Selective Evolutionary Generation Systems: Theory and Applications.

    This dissertation is devoted to the problem of behavior design, which is a generalization of the standard global optimization problem: instead of generating the optimizer, the generalization produces, on the space of candidate optimizers, a probability density function referred to as the behavior. The generalization depends on a parameter, the level of selectivity, such that as this parameter tends to infinity, the behavior becomes a delta function at the location of the global optimizer. The motivation for this generalization is that traditional off-line global optimization is non-resilient and non-opportunistic. That is, traditional global optimization is unresponsive to perturbations of the objective function. On-line optimization methods that are more resilient and opportunistic than their off-line counterparts typically consist of the computationally expensive sequential repetition of off-line techniques. A novel approach to inexpensive resilience and opportunism is to utilize the theory of Selective Evolutionary Generation Systems (SEGS), which sequentially and probabilistically selects a candidate optimizer based on the ratio of the fitness values of two candidates and the level of selectivity. Using time-homogeneous, irreducible, ergodic Markov chains to model a sequence of local, and hence inexpensive, dynamic transitions, this dissertation proves that such transitions result in behavior that is called rational; such behavior is desirable because it can lead to both efficient search for an optimizer as well as resilient and opportunistic behavior. The dissertation also identifies system-theoretic properties of the proposed scheme, including equilibria, their stability and their optimality. Moreover, this dissertation demonstrates that the canonical genetic algorithm with fitness proportional selection and the (1+1) evolutionary strategy are particular cases of the scheme. 
Applications in three areas illustrate the versatility of the SEGS theory: flight mechanics, control of dynamic systems, and artificial intelligence. The dissertation results touch upon several open problems in the fields of artificial life, complex systems, artificial intelligence, and robotics. Ph.D., Aerospace Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/77855/1/amenezes_1.pd
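
The selection step described in this abstract, choosing between two candidates from the ratio of their fitness values and the level of selectivity, might be sketched as follows. This is only an illustration under stated assumptions: the acceptance probability r^N / (1 + r^N), the function names, and the neighbor-proposal scheme are hypothetical and are not taken from the dissertation.

```python
import math
import random


def select_probability(f_current, f_candidate, selectivity):
    """Probability of moving to the candidate.

    Assumed form (not from the source): r**N / (1 + r**N), where
    r = f_candidate / f_current and N is the level of selectivity.
    Both fitness values are assumed to be strictly positive.
    """
    # Work with z = N * log(r) so that large N cannot overflow.
    z = selectivity * math.log(f_candidate / f_current)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def segs_step(current, fitness, neighbors, selectivity, rng):
    """One local transition: propose a random neighbor, then select."""
    candidate = rng.choice(neighbors(current))
    p = select_probability(fitness(current), fitness(candidate), selectivity)
    return candidate if rng.random() < p else current


if __name__ == "__main__":
    # Toy problem (hypothetical): maximize a positive fitness on a ring of 10 cells.
    def fitness(x):
        return 1.0 + (x if x <= 5 else 10 - x)

    def neighbors(x):
        return [(x - 1) % 10, (x + 1) % 10]

    rng = random.Random(0)
    state = 0
    for _ in range(2000):
        state = segs_step(state, fitness, neighbors, selectivity=8, rng=rng)
    print("final state:", state)
```

With equal fitness values the assumed rule moves with probability 1/2, and as the selectivity grows it approaches a deterministic preference for the fitter candidate, which matches the limiting behavior the abstract describes.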

    Multimodal Interactive Transcription of Handwritten Text Images

    This thesis presents a new interactive, multimodal framework for transcribing handwritten documents. Rather than delivering a complete transcription, the approach aims to assist the expert in the demanding task of transcribing. To date, the available handwritten text recognition systems do not produce transcriptions that users find acceptable, and human intervention is generally required to correct the output. These systems have proven genuinely useful in restricted applications with limited vocabularies (such as recognizing postal addresses or numerical amounts on bank checks), where they achieve acceptable results. However, on unconstrained handwritten documents (such as old manuscripts or spontaneous text), current technology yields unacceptable results. The interactive scenario studied in this thesis allows a more effective solution. In this scenario, the recognition system and the user cooperate to generate the final transcription of the text image. The system uses the text image and a previously validated part of the transcription (the prefix) to propose a possible continuation. The user then finds and corrects the next error made by the system, producing a new, longer prefix. The system uses this new prefix to suggest a new hypothesis. The technology is based on hidden Markov models and n-grams, which are used here in the same way as in automatic speech recognition. Some modifications to the conventional definition of n-grams were needed to take the user's feedback into account. Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Unpublished doctoral thesis].
Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541
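
The prefix-based interaction loop described in this abstract (the system proposes a continuation, the user corrects the next error, the validated prefix grows) can be simulated roughly as below. The recognizer here is a hypothetical stand-in that simply replays a fixed first-pass output; the thesis uses hidden Markov models and n-grams, which this sketch does not attempt to reproduce.

```python
def interactive_transcribe(reference, propose):
    """Simulate prefix-based interactive transcription.

    reference: list of words the simulated user wants to obtain.
    propose:   callable(prefix) -> list of words continuing the prefix
               (hypothetical stand-in for the recognition system).
    Returns (final_transcription, number_of_user_corrections).
    """
    prefix = []
    corrections = 0
    while True:
        hypothesis = prefix + list(propose(prefix))
        if hypothesis == reference:
            return hypothesis, corrections
        # The user scans past the validated prefix to the next error.
        i = len(prefix)
        limit = min(len(hypothesis), len(reference))
        while i < limit and hypothesis[i] == reference[i]:
            i += 1
        # The user fixes that position, which yields a longer validated prefix.
        corrections += 1
        prefix = reference[:i + 1]
        if prefix == reference:
            return prefix, corrections


def suffix_recognizer(first_pass):
    """Toy recognizer: continue the prefix with the rest of a fixed output."""
    def propose(prefix):
        return first_pass[len(prefix):]
    return propose


if __name__ == "__main__":
    reference = "a b c d".split()
    propose = suffix_recognizer("a x c y".split())
    print(interactive_transcribe(reference, propose))
```

A real system would recompute the continuation conditioned on the corrected prefix, so a single correction could also repair later words; the toy recognizer ignores that effect.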

    Intuitive Human-Robot Cooperation

    This dissertation addresses human-robot cooperation. A haptic interface was designed that gives the user an additional non-verbal interaction modality by means of a tactile language. In addition, a method for proactively planning and executing robot actions based on the estimated human intention was investigated. Finally, a suitable robot architecture was designed and implemented.

    State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions

    Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, they are often treated as black-box models and as such it is difficult to understand what exactly they learn as well as how they arrive at a particular prediction. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) non-regular languages such as balanced parentheses and palindromes where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition and text categorisation. We show that state-regularization (a) simplifies the extraction of finite state automata that display an RNN's state transition dynamic; (b) forces RNNs to operate more like automata with external memory and less like finite state machines, which potentially leads to a more structural memory; (c) leads to better interpretability and explainability of RNNs by leveraging the probabilistic finite state transition mechanism over time steps. Comment: To appear at IEEE Transactions on Pattern Analysis and Machine Intelligence. The extended version of State-Regularized Recurrent Neural Networks [arXiv:1901.08817]
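
One way to picture a transition "between a finite set of learnable states" is a soft assignment of the cell output to a small set of state vectors. The sketch below assumes dot-product scores, a softmax with a temperature, and a probability-weighted mixture as the next hidden state; these choices and all names are illustrative assumptions, not details given in the abstract.

```python
import math


def state_regularize(hidden, centroids, temperature=1.0):
    """Map a hidden vector onto a finite set of state vectors (centroids).

    Returns (probs, new_hidden): a distribution over the states and the
    probability-weighted mixture of the centroids. The scoring function
    and the mixture are assumptions made for this sketch.
    """
    scores = [
        sum(h * c for h, c in zip(hidden, centroid)) / temperature
        for centroid in centroids
    ]
    # Numerically stable softmax over the state scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Expected state under the transition distribution.
    new_hidden = [
        sum(p * centroid[j] for p, centroid in zip(probs, centroids))
        for j in range(len(hidden))
    ]
    return probs, new_hidden


if __name__ == "__main__":
    centroids = [[1.0, 0.0], [0.0, 1.0]]  # two hypothetical states
    probs, new_hidden = state_regularize([5.0, 0.0], centroids, temperature=0.1)
    print(probs, new_hidden)
```

A low temperature pushes the distribution toward a single state, so the network behaves more like an automaton with discrete states; a sampled (rather than averaged) state would be the stochastic variant of the same idea.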