Handwritten Music Recognition for Mensural notation with convolutional recurrent neural networks

Author: Jorge Calvo-Zaragoza
Alejandro H. Toselli
Enrique Vidal
Publication venue: 'Elsevier BV'
Publication date: 01/12/2019

Optical Music Recognition is the technology that allows computers to read music notation; it is also referred to as Handwritten Music Recognition when applied to handwritten notation. This technology aims at efficiently transcribing written music into a representation that can be further processed by a computer. It is of special interest for transcribing the large amount of music written in early notations, such as Mensural notation, since these sources represent a largely unexplored heritage for the musicological community. Traditional approaches to this problem are based on complex strategies with many explicit rules that only work for one particular type of manuscript. Machine learning approaches offer the promise of generalizable solutions, based on learning from labelled examples alone. However, previous research has not achieved acceptable results for handwritten Mensural notation. In this work we propose the use of deep neural networks, namely convolutional recurrent neural networks, which have proved effective in similar domains such as handwritten text recognition. Our experimental results achieve, for the first time, recognition results that can be considered effective for transcribing handwritten Mensural notation, decreasing the symbol-level error rate of previous approaches from 25.7% to 7.0%. © 2019 Elsevier B.V. All rights reserved.

The first author acknowledges support from the Spanish Ministry "HISPAMUS" project (TIN2017-86576-R), partially funded by the EU. The other authors were supported by the European Union's H2020 grant "Recognition and Enrichment of Archival Documents" (Ref. 674943), by the BBVA Foundation through the 2017-2018 and 2018-2019 Digital Humanities research grants "Carabela" and "HistWeather - Dos Siglos de Datos Climáticos", and by the EU JPICH project "HOME - History Of Medieval Europe" (Spanish PEICTI Ref. PCI2018-093122).

Calvo-Zaragoza, J.; Toselli, AH.; Vidal, E. (2019). Handwritten Music Recognition for Mensural notation with convolutional recurrent neural networks. Pattern Recognition Letters. 128:115-121. https://doi.org/10.1016/j.patrec.2019.08.021
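The abstract above reports its result as a symbol-level error rate. The following is a minimal sketch of how such a rate is commonly computed, assuming it is the edit distance between recognized and reference symbol sequences divided by the number of reference symbols; the abstract does not spell out the exact definition, and the function names and symbol labels below are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of symbols."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference symbol
                            curr[j - 1] + 1,      # insert a hypothesis symbol
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def symbol_error_rate(references, hypotheses):
    """Total edits over total reference symbols, as a fraction."""
    total_symbols = sum(len(r) for r in references)
    if total_symbols == 0:
        raise ValueError("references contain no symbols")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_symbols


# Hypothetical symbol labels, for illustration only.
reference = ["clef.C", "note.semibreve", "note.minim", "rest.minim"]
hypothesis = ["clef.C", "note.minim", "note.minim"]
print(symbol_error_rate([reference], [hypothesis]))  # one substitution + one deletion
```

Under this definition, a drop from 25.7% to 7.0% means far fewer symbols per staff need manual correction after automatic transcription.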

Repositorio Institucional de la Universidad de Alicante

Crossref

RiuNet

Optical Music Recognition: State of the Art and Major Challenges

Author: Fazekas G
Shatri E
Publication venue: International Conference on Technologies for Music Notation and Representation
Publication date: 12/05/2020

Optical Music Recognition (OMR) is concerned with transcribing sheet music into a machine-readable format. The transcribed copy should allow musicians to compose, play and edit music by taking a picture of a music sheet. Complete transcription of sheet music would also enable more efficient archival. OMR facilitates examining sheet music statistically or searching for patterns of notation, thus helping use cases in digital musicology too. Recently, there has been a shift in OMR from conventional computer vision techniques towards deep learning approaches. In this paper, we review relevant works in OMR, including fundamental methods and significant outcomes, and highlight different stages of the OMR pipeline. These stages often lack standard input and output representations and standardised evaluation. Therefore, comparing different approaches and evaluating the impact of different processing methods can become rather complex. This paper provides recommendations for future work, addressing some of the highlighted issues, and represents a position in furthering this important field of research.

Queen Mary Research Online

Glyph and Position Classification of Music Symbols in Early Manuscripts

Author: Núñez Alcover Alicia
Publication date: 09/09/2019

In this research, we study how to classify handwritten music symbols in early music manuscripts written in white Mensural notation, a common notation system used from the fourteenth century until the Renaissance. The field of Optical Music Recognition researches how to automate the reading of musical scores in order to transcribe their content into a structured digital format such as MIDI. When dealing with music manuscripts, the traditional workflow establishes two separate stages: detection and classification of musical symbols. In the classification stage, most research focuses on recognizing musical symbols without taking into account that a musical note is defined by two components: its glyph and its position with respect to the staff. Our purpose is to design and implement Deep Learning architectures based on Convolutional Neural Networks (CNNs), and to evaluate and compare them in order to determine which model provides the best performance, in terms of efficiency and precision, for use in an interactive scenario.

Repositorio Institucional de la Universidad de Alicante

Proceedings of the 4th International Workshop on Reading Music Systems

Author: Calvo-Zaragoza Jorge
Pacha Alexander
Shatri Elona
Publication date: 23/11/2022

The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners who could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 4th International Workshop on Reading Music Systems, held online on Nov. 18th 2022.

Comment: Proceedings edited by Jorge Calvo-Zaragoza, Alexander Pacha and Elona Shatri

arXiv.org e-Print Archive

Region-based layout analysis of music score images

Author: Calvo-Zaragoza Jorge
Castellanos Francisco J.
Garrido Muñoz Carlos
Ríos-Vila Antonio
Publication venue: 'Elsevier BV'
Publication date: 11/01/2022

The Layout Analysis (LA) stage is of vital importance to the correct performance of an Optical Music Recognition (OMR) system. It identifies the regions of interest, such as staves or lyrics, which must then be processed in order to transcribe their content. Despite the existence of modern approaches based on deep learning, an exhaustive study of LA in OMR has not yet been carried out with regard to the performance of different models, their generalization to different domains or, more importantly, their impact on subsequent stages of the pipeline. This work focuses on filling this gap in the literature by means of an experimental study of different neural architectures, music document types, and evaluation scenarios. The need for training data has also led to a proposal for a new semi-synthetic data-generation technique that enables the efficient applicability of LA approaches in real scenarios. Our results show that: (i) the choice of the model and its performance are crucial for the entire transcription process; (ii) the metrics commonly used to evaluate the LA stage do not always correlate with the final performance of the OMR system; and (iii) the proposed data-generation technique enables state-of-the-art results to be achieved with a limited set of labeled data.

This paper is part of the I+D+i PID2020-118447RA-I00 (MultiScore) project funded by MCIN/AEI/10.13039/501100011033, Spain and the GV/2020/030, Spain project funded by the Generalitat Valenciana, Spain. The first and third authors acknowledge support from the “Programa I+D+i de la Generalitat Valenciana, Spain” through grants ACIF/2019/042 and ACIF/2021/356, respectively.

Repositorio Institucional de la Universidad de Alicante

arXiv.org e-Print Archive

Understanding Optical Music Recognition

Author: Calvo-Zaragoza Jorge
Hajič Jr. Jan
Pacha Alexander
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/07/2020

For over 50 years, researchers have been trying to teach computers to read music notation, referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available and, furthermore, the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, most notably including a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.

Repositorio Institucional de la Universidad de Alicante

arXiv.org e-Print Archive

Domain adaptation for staff-region retrieval of music score images

Author: Calvo-Zaragoza Jorge
Castellanos Francisco J.
Fujinaga Ichiro
Gallego Antonio-Javier
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/09/2022

Optical music recognition (OMR) is the field that studies how to automatically read music notation from score images. One of the relevant steps within the OMR workflow is staff-region retrieval. This process is a key step because any undetected staff will not be processed by the subsequent steps. This task has previously been addressed as a supervised learning problem in the literature; however, ground-truth data are not always available, so each new manuscript requires a preliminary manual annotation. This situation is one of the main bottlenecks in OMR, because of the countless number of existing manuscripts and the associated manual labeling cost. With the aim of mitigating this issue, we propose the application of a domain adaptation technique, the so-called Domain-Adversarial Neural Network (DANN), based on a combination of a gradient reversal layer and a domain classifier in the inference neural architecture. The results from our experiments support the benefits of our proposed solution, obtaining improvements of approximately 29% in the F-score.

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This paper is part of the I+D+i PID2020-118447RA-I00 (MultiScore) project funded by MCIN/AEI/10.13039/501100011033. The first author acknowledges support from the “Programa I+D+i de la Generalitat Valenciana” through grants ACIF/2019/042 and CIBEFP/2021/72. This work also draws on research supported by the Social Sciences and Humanities Research Council (895-2013-1012) and the Fonds de recherche du Québec-Société et Culture (2022-SE3-303927).
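The gradient reversal layer mentioned in the abstract above can be summarized with the formulation usually given in the general DANN literature. This is a sketch of that standard formulation, not of this paper's specific architecture; the symbols ($\theta_f$ for the feature extractor, $\theta_y$ for the task predictor, $\theta_d$ for the domain classifier, $\lambda$ for the trade-off weight) are assumed notation.

```latex
% Gradient reversal layer: identity in the forward pass,
% sign-flipped and scaled gradient in the backward pass.
R_\lambda(x) = x, \qquad \frac{\partial R_\lambda}{\partial x} = -\lambda I

% Adversarial objective: task loss L_y on labeled source data,
% domain loss L_d on source and target data.
E(\theta_f, \theta_y, \theta_d) = L_y(\theta_f, \theta_y) - \lambda \, L_d(\theta_f, \theta_d)

% Saddle point sought during training.
(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
```

Placing $R_\lambda$ between the feature extractor and the domain classifier lets ordinary backpropagation train the domain classifier to separate domains while pushing the features towards being domain-invariant, which is what allows staff regions to be retrieved in a new manuscript without annotating it first.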

Repositorio Institucional de la Universidad de Alicante

Applying Automatic Translation for Optical Music Recognition’s Encoding Step

Author: Esplà-Gomis Miquel
Iñesta José M.
Ponce de León Amador Pedro José
Rizo David
Ríos-Vila Antonio
Publication venue: 'MDPI AG'
Publication date: 25/04/2021

Optical music recognition is a research field whose efforts have been mainly focused, due to the difficulties involved in its processes, on document and image recognition. However, there is a final step after the recognition phase that has not been properly addressed or discussed, and which is relevant to obtaining a standard digital score from the recognition process: the step of encoding data into a standard file format. In this paper, we address this task by proposing and evaluating the feasibility of using machine translation techniques, using statistical approaches and neural systems, to automatically convert the results of graphical encoding recognition into a standard semantic format, which can be exported as a digital score. We also discuss the implications, challenges and details to be taken into account when applying machine translation techniques to music languages, which are very different from natural human languages. This needs to be addressed prior to performing experiments and has not been reported in previous works. We also describe and detail experimental results, and conclude that applying machine translation techniques is a suitable solution for this task, as they have proven to obtain robust results.

This work was supported by the Spanish Ministry HISPAMUS project TIN2017-86576-R, partially funded by the EU, and by the Generalitat Valenciana through project GV/2020/030.

Repositorio Institucional de la Universidad de Alicante

Staff-line detection and removal using a convolutional neural network

Author: Calvo-Zaragoza Jorge
Oncina Jose
Pertusa Antonio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Staff-line removal is an important preprocessing stage for most optical music recognition systems. Common procedures to solve this task involve image processing techniques. In contrast to these traditional methods based on hand-engineered transformations, the problem can also be approached as a classification task in which each pixel is labeled as either staff or symbol, so that only those that belong to symbols are kept in the image. In order to perform this classification, we propose the use of convolutional neural networks, which have demonstrated an outstanding performance in image retrieval tasks. The initial features of each pixel consist of a square patch from the input image centered at that pixel. The proposed network is trained by using a dataset which contains pairs of scores with and without the staff lines. Our results in both binary and grayscale images show that the proposed technique is very accurate, outperforming both other classifiers and the state-of-the-art strategies considered. In addition, several advantages of the presented methodology with respect to traditional procedures proposed so far are discussed.

This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through a FPU Fellowship (Ref. AP2012–0939), the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R supported by EU FEDER funds) and the Instituto Universitario de Investigación Informática (IUII) from the University of Alicante.
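The per-pixel formulation described in the abstract above can be illustrated with a small sketch: each pixel is represented by a square patch centered on it, a classifier labels the patch as staff or symbol, and only symbol pixels are kept. The paper uses a convolutional neural network as the classifier; the hand-written rule below is only a hypothetical stand-in so that the example runs, and all function names, the patch size and the zero padding are assumptions.

```python
def extract_patch(image, row, col, half_size, pad_value=0):
    """Square patch of side 2*half_size+1 centered at (row, col).

    Positions outside the image are filled with pad_value.
    """
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half_size, row + half_size + 1):
        patch_row = []
        for c in range(col - half_size, col + half_size + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else pad_value)
        patch.append(patch_row)
    return patch


def remove_staff(image, classify_patch, half_size=1, background=0):
    """Keep only the pixels whose patch is classified as 'symbol'."""
    result = []
    for r in range(len(image)):
        out_row = []
        for c in range(len(image[0])):
            label = classify_patch(extract_patch(image, r, c, half_size))
            out_row.append(image[r][c] if label == "symbol" else background)
        result.append(out_row)
    return result


def toy_classifier(patch):
    """Hypothetical stand-in for the trained CNN (3x3 patches only).

    Calls a foreground pixel 'symbol' if it has ink directly above or below.
    """
    center, above, below = patch[1][1], patch[0][1], patch[2][1]
    return "symbol" if center and (above or below) else "staff"


# Binary toy image: a horizontal staff line crossed by a vertical stem.
score = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
cleaned = remove_staff(score, toy_classifier)
for row in cleaned:
    print(row)
```

In a trained system, `toy_classifier` would be replaced by the network's prediction for each patch, and the patch size would be chosen large enough to capture the local context that distinguishes staff lines from symbols.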

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas