3 research outputs found
Towards Reliable Real-time Opera Tracking: Combining Alignment with Audio Event Detectors to Increase Robustness
Recent advances in real-time music score following have made it possible for
machines to automatically track highly complex polyphonic music, including full
orchestra performances. In this paper, we attempt to take this to an even
higher level, namely, live tracking of full operas. We first apply a
state-of-the-art audio alignment method based on online Dynamic Time-Warping
(OLTW) to full-length recordings of a Mozart opera and, analyzing the tracker's
most severe errors, identify three common sources of problems specific to the
opera scenario. To address these, we propose a combination of a DTW-based music
tracker with specialized audio event detectors (for applause, silence/noise,
and speech) that condition the DTW algorithm in a top-down fashion, and show,
step by step, how these detectors add robustness to the score follower.
However, there remain a number of open problems which we identify as targets
for ongoing and future research.
Comment: 7 pages, 4 figures. In Proceedings of the 17th Sound and Music Computing Conference (SMC 2020), Torino, Italy.
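The idea of conditioning a DTW-based tracker with audio event detectors can be sketched in a few lines. This is a deliberately simplified, illustrative toy, not the paper's implementation: real OLTW maintains an incrementally grown cost matrix with adaptive windowing, and the detectors here are reduced to a per-frame "is this music?" flag. All function names and the distance measure are assumptions.

```python
# Toy sketch of online score following with top-down gating by an
# audio event detector (applause/silence/speech). Illustrative only;
# not the paper's OLTW implementation.

def dist(a, b):
    """Per-frame distance between feature vectors (simplified stand-in)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def oltw_step(score_feats, pos, perf_frame, window=3):
    """Advance the estimated score position for one incoming performance
    frame: pick the cheapest score frame in a small forward window."""
    lo, hi = pos, min(pos + window, len(score_feats) - 1)
    costs = [(dist(score_feats[j], perf_frame), j) for j in range(lo, hi + 1)]
    return min(costs)[1]

def track(score_feats, perf_frames, is_music):
    """Follow the score online; when the event detector flags a frame as
    non-music, hold the current position instead of warping into it."""
    pos, path = 0, []
    for frame, musical in zip(perf_frames, is_music):
        if musical:                 # detector conditions the tracker
            pos = oltw_step(score_feats, pos, frame)
        path.append(pos)            # hold position during applause etc.
    return path
```

The gating captures the paper's key point: without it, loud non-musical events (applause, spoken dialogue) would pull the alignment path away from the true score position; with it, the tracker simply pauses and resumes when music returns.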
MIDI-Sheet Music Alignment Using Bootleg Score Synthesis
MIDI-sheet music alignment is the task of finding correspondences between a
MIDI representation of a piece and its corresponding sheet music images. Rather
than using optical music recognition to bridge the gap between sheet music and
MIDI, we explore an alternative approach: projecting the MIDI data into pixel
space and performing alignment in the image domain. Our method converts the
MIDI data into a crude representation of the score that only contains
rectangular floating notehead blobs, a process we call bootleg score synthesis.
Furthermore, we project sheet music images into the same bootleg space by
applying a deep watershed notehead detector and filling in the bounding boxes
around each detected notehead. Finally, we align the bootleg representations
using a simple variant of dynamic time warping. On a dataset of 68 real scanned
piano scores from IMSLP and corresponding MIDI performances, our method
achieves a 97.3% accuracy at an error tolerance of one second, outperforming
several baseline systems that employ optical music recognition.
Comment: 8 pages, 6 figures, 1 table. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 201
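Once both MIDI and sheet music are projected into the binary bootleg space, the alignment itself is ordinary dynamic time warping over columns of notehead pixels. The sketch below uses classic DTW with a pixel-mismatch cost; the paper uses its own variant, so treat the cost function and all names as illustrative assumptions.

```python
# Hedged sketch: aligning two binary "bootleg" column sequences with
# plain dynamic time warping. Cost = count of disagreeing pixels;
# the paper's actual DTW variant and cost differ.

def mismatch(col_a, col_b):
    """Number of disagreeing pixels between two 0/1 bootleg columns."""
    return sum(x != y for x, y in zip(col_a, col_b))

def dtw(a, b, cost):
    """Classic DTW over sequences a, b; returns (total cost, path)."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = cost(a[i - 1], b[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack the cheapest predecessor at each step.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    return D[n][m], path[::-1]
```

Because both inputs are crude rectangular blobs rather than full engravings, the alignment only needs the cost to be low where notehead positions agree; fine engraving details never enter the comparison.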
Markov-switching State Space Models for Uncovering Musical Interpretation
For concertgoers, musical interpretation is the most important factor in
determining whether or not we enjoy a classical performance. Every performance
includes mistakes---intonation issues, a lost note, an unpleasant sound---but
these are all easily forgotten (or unnoticed) when a performer engages her
audience, imbuing a piece with novel emotional content beyond the vague
instructions inscribed on the printed page. While music teachers use imagery or
heuristic guidelines to motivate interpretive decisions, combining these vague
instructions to create a convincing performance remains the domain of the
performer, subject to the whims of the moment, technical fluency, and taste. In
this research, we use data from the CHARM Mazurka Project---forty-six
professional recordings of Chopin's Mazurka Op. 63 No. 3 by consummate
artists---with the goal of elucidating musically interpretable performance
decisions. Using information on the inter-onset intervals of the note attacks
in the recordings, we apply functional data analysis techniques enriched with
prior information gained from music theory to discover relevant features and
perform hierarchical clustering. The resulting clusters suggest methods for
informing music instruction, discovering listening preferences, and analyzing
performances.
Comment: 33 pages, 21 figures.
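The pipeline the abstract describes, from note-onset timings to clusters of performances, can be sketched very compactly. This is a simplified stand-in, not the paper's method: the features here are raw inter-onset intervals rather than functional-data representations informed by music theory, the distance is a plain mean absolute difference, and the clustering is greedy single linkage. All names are hypothetical.

```python
# Illustrative sketch: tempo features from note-onset times, then
# single-linkage agglomerative clustering of performances. Simplified
# stand-in for the paper's functional-data / Markov-switching approach.

def tempo_curve(onsets):
    """Inter-onset intervals: successive differences of onset times."""
    return [b - a for a, b in zip(onsets, onsets[1:])]

def dist(u, v):
    """Mean absolute difference between two equal-length tempo curves."""
    return sum(abs(x - y) for x, y in zip(u, v)) / len(u)

def cluster(curves, n_clusters):
    """Greedy single-linkage agglomeration down to n_clusters groups."""
    groups = [[i] for i in range(len(curves))]
    while len(groups) > n_clusters:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = min(dist(curves[i], curves[j])
                        for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] += groups.pop(b)   # merge the closest pair of groups
    return groups
```

Performances whose timing deviations rise and fall together end up in the same group, which is the kind of structure the paper then interprets musically (e.g. shared rubato strategies across recordings).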