Optical Music Recognition with Convolutional Sequence-to-Sequence Models
Optical Music Recognition (OMR) is an important technology within Music
Information Retrieval. Deep learning models show promising results on OMR
tasks, but symbol-level annotated data sets of sufficient size to train such
models are not available and are difficult to develop. We present a deep learning
architecture called a Convolutional Sequence-to-Sequence model to both move
towards an end-to-end trainable OMR pipeline, and apply a learning process that
trains on full sentences of sheet music instead of individually labeled
symbols. The model is trained and evaluated on a human generated data set, with
various image augmentations based on real-world scenarios. This data set is the
first publicly available set in OMR research with sufficient size to train and
evaluate deep learning models. With the introduced augmentations, a pitch
recognition accuracy of 81% and a duration accuracy of 94% are achieved,
resulting in a note-level accuracy of 80%. Finally, the model is compared to
commercially available methods, showing large improvements over these
applications.
Comment: ISMIR 201
Optical Music Recognition: State of the Art and Major Challenges
Optical Music Recognition (OMR) is concerned with transcribing sheet music into a machine-readable format. The transcribed copy should allow musicians to compose, play and edit music by taking a picture of a music sheet. Complete transcription of sheet music would also enable more efficient archival. OMR facilitates examining sheet music statistically or searching for patterns of notations, thus helping use cases in digital musicology too. Recently, there has been a shift in OMR from using conventional computer vision techniques towards a deep learning approach. In this paper, we review relevant works in OMR, including fundamental methods and significant outcomes, and highlight different stages of the OMR pipeline. These stages often lack standard input and output representation and standardised evaluation. Therefore, comparing different approaches and evaluating the impact of different processing methods can become rather complex. This paper provides recommendations for future work, addressing some of the highlighted issues, and represents a position on furthering this important field of research.
Deep watershed detector for music object recognition
Optical Music Recognition (OMR) is an important and challenging area within music information retrieval. The accurate detection of music symbols in digital images is a core functionality of any OMR pipeline. In this paper, we introduce a novel object detection method, based on synthetic energy maps and the watershed transform, called Deep Watershed Detector (DWD). Our method is specifically tailored to deal with high resolution images that contain a large number of very small objects and is therefore able to process full pages of written music. We present state-of-the-art detection results of common music symbols and show DWD’s ability to work equally well on synthetic scores and on handwritten music.
Handwritten Music Recognition for Mensural notation with convolutional recurrent neural networks
Optical Music Recognition is the technology that allows computers to read music notation, which is also referred to as Handwritten Music Recognition when it is applied over handwritten notation. This technology aims at efficiently transcribing written music into a representation that can be further processed by a computer. This is of special interest to transcribe the large amount of music written in early notations, such as the Mensural notation, since they represent largely unexplored heritage for the musicological community. Traditional approaches to this problem are based on complex strategies with many explicit rules that only work for one particular type of manuscript. Machine learning approaches offer the promise of generalizable solutions, based on learning from just labelled examples. However, previous research has not achieved sufficiently acceptable results for handwritten Mensural notation. In this work we propose the use of deep neural networks, namely convolutional recurrent neural networks, which have proved effective in other similar domains such as handwritten text recognition. Our experimental results achieve, for the first time, recognition results that can be considered effective for transcribing handwritten Mensural notation, decreasing the symbol-level error rate of previous approaches from 25.7% to 7.0%. (C) 2019 Elsevier B.V. All rights reserved.
First author thanks the support from the Spanish Ministry "HISPAMUS" project (TIN2017-86576-R), partially funded by the EU. The other authors were supported by the European Union's H2020 grant "Recognition and Enrichment of Archival Documents" (Ref. 674943), by the BBVA Foundacion through the 2017-2018 and 2018-2019 Digital Humanities research grants "Carabela" and "HistWeather - Dos Siglos de Datos Climaticos", and by EU JPICH project "HOME - History Of Medieval Europe" (Spanish PEICTI Ref. PCI2018-093122).
Calvo-Zaragoza, J.; Toselli, AH.; Vidal, E. (2019). Handwritten Music Recognition for Mensural notation with convolutional recurrent neural networks. Pattern Recognition Letters. 128:115-121. https://doi.org/10.1016/j.patrec.2019.08.021
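The symbol-level error rate quoted in the abstract above is commonly computed as an edit distance between predicted and reference symbol sequences, normalised by the total reference length. The following is a minimal sketch under that assumption; it is not taken from the paper, and the function names `edit_distance` and `symbol_error_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference symbols to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def symbol_error_rate(references, hypotheses):
    """Total edit operations divided by total reference symbols (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_symbols = sum(len(r) for r in references)
    return total_edits / total_symbols
```

Under this definition, a rate of 7.0% would mean roughly seven insertions, deletions or substitutions per hundred reference symbols.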
End-to-End Neural Optical Music Recognition of Monophonic Scores
Optical Music Recognition is a field of research that investigates how to computationally decode music notation from images. Despite the efforts made so far, there are hardly any complete solutions to the problem. In this work, we study the use of neural networks that work in an end-to-end manner. This is achieved by using a neural model that combines the capabilities of convolutional neural networks, which work on the input image, and recurrent neural networks, which deal with the sequential nature of the problem. Thanks to the use of the so-called Connectionist Temporal Classification loss function, these models can be directly trained from input images accompanied by their corresponding transcripts into music symbol sequences. We also present the Printed Images of Music Staves (PrIMuS) dataset, containing more than 80,000 monodic single-staff real scores in common western notation, that is used to train and evaluate the neural approach. In our experiments, it is demonstrated that this formulation can be carried out successfully. Additionally, we study several considerations about the codification of the output musical sequences, the convergence and scalability of the neural models, as well as the ability of this approach to locate symbols in the input score.
This work was supported by the Social Sciences and Humanities Research Council of Canada, and the Spanish Ministerio de Economia y Competitividad through Project HISPAMUS Ref. No. TIN2017-86576-R (supported by UE FEDER funds).
Calvo-Zaragoza, J.; Rizo, D. (2018). End-to-End Neural Optical Music Recognition of Monophonic Scores. Applied Sciences. 8(4). https://doi.org/10.3390/app8040606
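Models trained with a Connectionist Temporal Classification loss emit one score vector per image frame, including a special "blank" class. A common way to read a symbol sequence from such output is greedy decoding: take the best class per frame, collapse consecutive repeats, then drop blanks. The sketch below illustrates only that generic decoding step. It is not the authors' implementation, and the function name `ctc_greedy_decode` and the choice of index 0 for the blank class are assumptions.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Greedy CTC decoding over per-frame class scores.

    frame_scores: list of per-frame score lists, one score per class.
    blank: index of the CTC blank class (assumed to be 0 here).
    """
    # Pick the highest-scoring class index in each frame.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    decoded = []
    previous = None
    for k in best:
        # Keep a class only when it differs from the previous frame
        # (collapse repeats) and is not the blank class.
        if k != previous and k != blank:
            decoded.append(k)
        previous = k
    return decoded
```

A blank between two identical classes is what lets the output contain the same symbol twice in a row, for example two consecutive notes of equal pitch and duration.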
Proceedings of the 4th International Workshop on Reading Music Systems
The International Workshop on Reading Music Systems (WoRMS) is a workshop
that tries to connect researchers who develop systems for reading music, such
as in the field of Optical Music Recognition, with other researchers and
practitioners that could benefit from such systems, like librarians or
musicologists.
The relevant topics of interest for the workshop include, but are not limited
to: Music reading systems; Optical music recognition; Datasets and performance
evaluation; Image processing on music scores; Writer identification; Authoring,
editing, storing and presentation systems for music scores; Multi-modal
systems; Novel input-methods for music to produce written music; Web-based
Music Information Retrieval services; Applications and projects; Use-cases
related to written music.
These are the proceedings of the 4th International Workshop on Reading Music
Systems, held online on Nov. 18th 2022.
Comment: Proceedings edited by Jorge Calvo-Zaragoza, Alexander Pacha and Elona
Shatr
Understanding Optical Music Recognition
For over 50 years, researchers have been trying to teach computers to read music notation, referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available, and, furthermore, the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, with most notably a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.