5 research outputs found

    Combining Image Processing Techniques, OCR, and OMR for the Digitization of Musical Books

    Digitizing historical music books can be challenging since staves are usually mixed with typewritten text that explains some of their characteristics. In this work, we propose a new methodology to undertake such a digitization task. After scanning the pages of the book, the different blocks of text and staves can be detected and organized into music pieces using image processing techniques. Then, OCR and OMR methods can be applied to text and stave blocks, respectively, and the information conveniently stored using the MusicXML format. In addition, we explain how this methodology was successfully applied in the digitization of a book entitled The Music in the Santo Domingos Cathedral. In particular, we provide a new annotated database of musical symbols from the staves included in this book. This database was used to develop two new OMR deep learning models for the detection and classification of music scores. The detection model obtained an F1-score of 90% on symbol detection, and the classification model a note pitch accuracy of 98.4%. The method allows us to conduct text searches, obtain clean PDF files of music pieces, or reproduce the sound represented by the pieces. The database, models, and code of this project are available at https://github.com/joheras/MusicaCatedralStoDomingoIE
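
The routing step described in the abstract (text blocks go to OCR, stave blocks go to OMR, and results are grouped per music piece) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `Block`, `Piece`, and `digitize` are hypothetical names, the `ocr` and `omr` arguments stand in for whatever recognizers are actually used, and block detection and MusicXML export are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Block:
    """Hypothetical page region found by the image processing step."""
    kind: str     # "text" or "stave"
    content: str  # stand-in for the cropped image of the region


@dataclass
class Piece:
    """Hypothetical container for the recognized parts of one music piece."""
    texts: List[str] = field(default_factory=list)
    scores: List[str] = field(default_factory=list)


def digitize(
    blocks: Iterable[Block],
    ocr: Callable[[str], str],
    omr: Callable[[str], str],
) -> Piece:
    """Send each block to the matching recognizer and collect the results."""
    piece = Piece()
    for block in blocks:
        if block.kind == "text":
            # Typewritten text is handled by the OCR callable.
            piece.texts.append(ocr(block.content))
        elif block.kind == "stave":
            # Staves are handled by the OMR callable.
            piece.scores.append(omr(block.content))
        else:
            raise ValueError(f"unknown block kind: {block.kind!r}")
    return piece
```

Passing the recognizers in as callables keeps the sketch independent of any particular OCR or OMR library; in practice they would wrap the models published in the project repository or any other engine.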