PhD Thesis.
Audio effects modeling is the process of emulating an audio effect unit and seeks
to recreate the sound, behaviour and main perceptual features of an analog reference
device. Audio effect units are analog or digital signal processing systems
that transform certain characteristics of the sound source. These transformations
can be linear or nonlinear, time-invariant or time-varying, and may have short-term
or long-term memory. The most typical audio effect transformations are based on
dynamics, such as compression; tone, such as distortion; frequency, such as
equalization; and time, such as artificial reverberation or modulation-based audio
effects.
The digital simulation of these audio processors is normally done by designing
mathematical models of these systems. This is often difficult because such models
must accurately capture all components within the effect unit, which usually contains
mechanical elements together with nonlinear and time-varying analog electronics.
Most existing methods for audio effects modeling are either simplified or optimized
to a very specific circuit or type of audio effect and cannot be efficiently
translated to other types of audio effects.
This thesis aims to explore deep learning architectures for music signal processing
in the context of audio effects modeling. We investigate deep neural networks
as black-box modeling strategies to solve this task, i.e. by using only input-output
measurements. We propose different DSP-informed deep learning models to emulate
each type of audio effect transformation.
Through objective perceptually based metrics and subjective listening tests, we
explore the performance of these models when modeling various analog audio effects.
We also analyze how the models accomplish the given tasks and what they
actually learn. We show virtual analog models of nonlinear effects, such as
a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter;
and electromechanical nonlinear time-varying effects, such as a Leslie speaker
cabinet and plate and spring reverberators.
We report that the proposed deep learning architectures improve on the
state of the art in black-box modeling of audio effects, and we outline
directions for future work.