11 research outputs found

    Exploiting the Two-Dimensional Nature of Agnostic Music Notation for Neural Optical Music Recognition

    Get PDF
    State-of-the-art Optical Music Recognition (OMR) techniques follow an end-to-end or holistic approach, i.e., a sole stage for completely processing a single-staff section image and for retrieving the symbols that appear therein. Such recognition systems are characterized by not requiring an exact alignment between each staff and their corresponding labels, hence facilitating the creation and retrieval of labeled corpora. Most commonly, these approaches consider an agnostic music representation, which characterizes music symbols by their shape and height (vertical position in the staff). However, this double nature is ignored since, in the learning process, these two features are treated as a single symbol. This work aims to exploit this trademark that differentiates music notation from other similar domains, such as text, by introducing a novel end-to-end approach to solve the OMR task at a staff-line level. We consider two Convolutional Recurrent Neural Network (CRNN) schemes trained to simultaneously extract the shape and height information and to propose different policies for eventually merging them at the actual neural level. The results obtained for two corpora of monophonic early music manuscripts prove that our proposal significantly decreases the recognition error in figures ranging between 14.4% and 25.6% in the best-case scenarios when compared to the baseline considered.This research work was partially funded by the University of Alicante through project GRE19-04, by the “Programa I+D+i de la Generalitat Valenciana” through grant APOSTD/2020/256, and by the Spanish Ministerio de Universidades through grant FPU19/04957

    Transcription of Spanish Historical Handwritten Documents with Deep Neural Networks

    Full text link
    [EN] The digitization of historical handwritten document images is important for the preservation of cultural heritage. Moreover, the transcription of text images obtained from digitization is necessary to provide efficient information access to the content of these documents. Handwritten Text Recognition (HTR) has become an important research topic in the areas of image and computational language processing that allows us to obtain transcriptions from text images. State-of-the-art HTR systems are, however, far from perfect. One difficulty is that they have to cope with image noise and handwriting variability. Another difficulty is the presence of a large amount of Out-Of-Vocabulary (OOV) words in ancient historical texts. A solution to this problem is to use external lexical resources, but such resources might be scarce or unavailable given the nature and the age of such documents. This work proposes a solution to avoid this limitation. It consists of associating a powerful optical recognition system that will cope with image noise and variability, with a language model based on sub-lexical units that will model OOV words. Such a language modeling approach reduces the size of the lexicon while increasing the lexicon coverage. Experiments are first conducted on the publicly available Rodrigo dataset, which contains the digitization of an ancient Spanish manuscript, with a recognizer based on Hidden Markov Models (HMMs). They show that sub-lexical units outperform word units in terms of Word Error Rate (WER), Character Error Rate (CER) and OOV word accuracy rate. This approach is then applied to deep net classifiers, namely Bi-directional Long-Short Term Memory (BLSTMs) and Convolutional Recurrent Neural Nets (CRNNs). Results show that CRNNs outperform HMMs and BLSTMs, reaching the lowest WER and CER for this image dataset and significantly improving OOV recognition.Work partially supported by projects READ: Recognition and Enrichment of Archival Documents - 674943 (European Union's H2020) and CoMUN-HaT: Context, Multimodality and User Collaboration in Handwritten Text Processing - TIN2015-70924-C2-1-R (MINECO/FEDER), and a DGA-MRIS (Direction Generale de l'Armement - Mission pour la Recherche et l'Innovation Scientifique) scholarship.Granell, E.; Chammas, E.; Likforman-Sulem, L.; MartĂ­nez-Hinarejos, C.; Mokbel, C.; Cirstea, B. (2018). Transcription of Spanish Historical Handwritten Documents with Deep Neural Networks. Journal of imaging. 4(1). https://doi.org/10.3390/jimaging4010015S154

    Hybrid hidden Markov models and artificial neural networks for handwritten music recognition in mensural notation

    Get PDF
    In this paper, we present a hybrid approach using hidden Markov models (HMM) and artificial neural networks to deal with the task of handwritten Music Recognition in mensural notation. Previous works have shown that the task can be addressed with Gaussian density HMMs that can be trained and used in an end-to-end manner, that is, without prior segmentation of the symbols. However, the results achieved using that approach are not sufficiently accurate to be useful in practice. In this work, we hybridize HMMs with deep multilayer perceptrons (MLPs), which lead to remarkable improvements in optical symbol modeling. Moreover, this hybrid architecture maintains important advantages of HMMs such as the ability to properly model variable-length symbol sequences through segmentation-free training, and the simplicity and robustness of combining optical models with N-gram language models, which provide statistical a priori information about regularities in musical symbol concatenation observed in the training data. The results obtained with the proposed hybrid MLP-HMM approach outperform previous works by a wide margin, achieving symbol-level error rates around 26%, as compared with about 40% reported in previous works

    A sequential handwriting recognition model based on a dynamically configurable CRNN

    Get PDF
    Handwriting recognition refers to recognizing a handwritten input that includes character(s) or digit(s) based on an image. Because most applications of handwriting recognition in real life contain sequential text in various languages, there is a need to develop a dynamic handwriting recognition system. Inspired by the neuroevolutionary technique, this paper proposes a Dynamically Configurable Convolutional Recurrent Neural Network (DC-CRNN) for the handwriting recognition sequence modeling task. The proposed DC-CRNN is based on the Salp Swarm Optimization Algorithm (SSA), which generates the optimal structure and hyperparameters for Convolutional Recurrent Neural Networks (CRNNs). In addition, we investigate two types of encoding techniques used to translate the output of optimization to a CRNN recognizer. Finally, we proposed a novel hybridized SSA with Late Acceptance Hill-Climbing (LAHC) to improve the exploitation process. We conducted our experiments on two well-known datasets, IAM and IFN/ENIT, which include both the Arabic and English languages. The experimental results have shown that LAHC significantly improves the SSA search process. Therefore, the proposed DC-CRNN outperforms the handcrafted CRNN methods

    Graphonomics and your Brain on Art, Creativity and Innovation : Proceedings of the 19th International Graphonomics Conference (IGS 2019 – Your Brain on Art)

    Get PDF
    [Italiano]: “Grafonomia e cervello su arte, creatività e innovazione”. Un forum internazionale per discutere sui recenti progressi nell'interazione tra arti creative, neuroscienze, ingegneria, comunicazione, tecnologia, industria, istruzione, design, applicazioni forensi e mediche. I contributi hanno esaminato lo stato dell'arte, identificando sfide e opportunità, e hanno delineato le possibili linee di sviluppo di questo settore di ricerca. I temi affrontati includono: strategie integrate per la comprensione dei sistemi neurali, affettivi e cognitivi in ambienti realistici e complessi; individualità e differenziazione dal punto di vista neurale e comportamentale; neuroaesthetics (uso delle neuroscienze per spiegare e comprendere le esperienze estetiche a livello neurologico); creatività e innovazione; neuro-ingegneria e arte ispirata dal cervello, creatività e uso di dispositivi di mobile brain-body imaging (MoBI) indossabili; terapia basata su arte creativa; apprendimento informale; formazione; applicazioni forensi. / [English]: “Graphonomics and your brain on art, creativity and innovation”. A single track, international forum for discussion on recent advances at the intersection of the creative arts, neuroscience, engineering, media, technology, industry, education, design, forensics, and medicine. The contributions reviewed the state of the art, identified challenges and opportunities and created a roadmap for the field of graphonomics and your brain on art. The topics addressed include: integrative strategies for understanding neural, affective and cognitive systems in realistic, complex environments; neural and behavioral individuality and variation; neuroaesthetics (the use of neuroscience to explain and understand the aesthetic experiences at the neurological level); creativity and innovation; neuroengineering and brain-inspired art, creative concepts and wearable mobile brain-body imaging (MoBI) designs; creative art therapy; informal learning; education; forensics
    corecore