Scattering Transform for Playing Technique Recognition

Abstract

Playing techniques are expressive elements in music performances that carry important information about music expressivity and interpretation. When displaying playing techniques in the time–frequency domain, we observe that each has a distinctive spectro-temporal pattern. Based on the regularity of these patterns, we group commonly used playing techniques into two families: pitch modulation-based techniques (PMTs) and pitch evolution-based techniques (PETs). The former are periodic modulations that elaborate on stable pitches, including vibrato, tremolo, trill, and flutter-tongue, while the latter contain monotonic pitch changes, such as acciaccatura, portamento, and glissando. In this thesis, we present a general framework based on the scattering transform for playing technique recognition. We propose two variants of the scattering transform: the adaptive scattering and the direction-invariant joint scattering. The former provides highly compact representations that are invariant to pitch transpositions for representing PMTs. The latter captures the spectro-temporal patterns exhibited by PETs. Using the proposed scattering representations as input, our recognition system achieves state-of-the-art results. We provide a formal interpretation of the role of each scattering component, confirmed by explanatory visualisations. Whereas previously published datasets for playing technique analysis focused primarily on techniques recorded in isolation, we publicly release a new dataset to evaluate the proposed framework. The dataset, named CBFdataset, is the first dataset on the Chinese bamboo flute (CBF), containing full-length CBF performances and expert annotations of playing techniques. To provide evidence of the generalisability of the proposed framework, we test it on three additional datasets with a variety of playing techniques.
Finally, to explore the applicability of the proposed scattering representations to general audio classification problems, we introduce two additional applications: one applies the adaptive scattering to identify performers in polyphonic orchestral music, and the other uses the joint scattering to detect and classify chick calls.
