Playing techniques are expressive elements of music performance that
carry important information about musical expressivity and interpretation.
When displaying playing techniques in the time–frequency domain, we
observe that each has a distinctive spectro-temporal pattern. Based on
the regularity of these patterns, we group commonly used playing techniques
into two families: pitch modulation-based techniques (PMTs) and pitch
evolution-based techniques (PETs). The former are periodic modulations
that elaborate on stable pitches, including vibrato, tremolo, trill, and
flutter-tongue, whereas the latter involve monotonic pitch changes, such
as acciaccatura, portamento, and glissando.
In this thesis, we present a general framework based on the scattering
transform for playing technique recognition. We propose two
variants of the scattering transform: the adaptive scattering and the
direction-invariant joint scattering. The former provides highly compact
representations of PMTs that are invariant to pitch transpositions.
The latter captures the spectro-temporal patterns exhibited
by PETs. Using the proposed scattering representations as input, our
recognition system achieves state-of-the-art results. We provide a formal
interpretation of the role of each scattering component, supported by
explanatory visualisations.
Whereas previously published datasets for playing technique analysis
focused primarily on techniques recorded in isolation, we publicly release
a new dataset to evaluate the proposed framework. The dataset, named
CBFdataset, is the first dataset on the Chinese bamboo flute (CBF),
containing full-length CBF performances and expert annotations of
playing techniques. To provide evidence of the generalisability of the
proposed framework, we test it on three additional datasets covering a
variety of playing techniques. Finally, to explore the applicability of
the proposed scattering representations to general audio classification
problems, we introduce two additional applications: one applies the
adaptive scattering to identify performers in polyphonic orchestral
music, and the other uses the joint scattering to detect and classify
chick calls.