Temporal overdrive recurrent neural network
In this work we present a novel recurrent neural network architecture designed to model systems whose dynamics are characterized by multiple characteristic timescales. The proposed network is composed of several recurrent groups of neurons that are trained to adapt separately to each timescale, in order to improve the system identification process. We test our framework on time series prediction tasks and show promising, preliminary results achieved on synthetic data. To evaluate the capabilities of our network, we compare its performance with several state-of-the-art recurrent architectures.
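A minimal numerical sketch of the multi-timescale idea described above. This is illustrative only, not the authors' architecture: the leaky-integrator form, the group sizes, and all parameters are assumptions made for the example.

```python
import numpy as np

# Two recurrent groups of leaky-integrator units, each biased toward a
# different timescale via its leak coefficient, concatenated for a shared
# linear readout. (Hypothetical sketch; not the paper's exact model.)
rng = np.random.default_rng(0)

def init_group(n_in, n_hid, leak):
    return {
        "W_in": rng.normal(0, 0.3, (n_hid, n_in)),
        "W_rec": rng.normal(0, 0.3, (n_hid, n_hid)),
        "leak": leak,            # large leak -> fast dynamics, small -> slow
        "h": np.zeros(n_hid),
    }

def step(group, x):
    pre = group["W_in"] @ x + group["W_rec"] @ group["h"]
    group["h"] = (1 - group["leak"]) * group["h"] + group["leak"] * np.tanh(pre)
    return group["h"]

fast = init_group(1, 8, leak=0.9)    # tracks fast variations
slow = init_group(1, 8, leak=0.05)   # integrates slow trends
W_out = rng.normal(0, 0.1, (1, 16))

def forward(u):
    """Run the two-timescale network over a scalar time series u."""
    ys = []
    for x in u:
        h = np.concatenate([step(fast, np.array([x])),
                            step(slow, np.array([x]))])
        ys.append(float(W_out @ h))
    return np.array(ys)

y = forward(np.sin(np.linspace(0, 10, 200)))
```

In a trained version, the readout (and optionally the recurrent weights) would be fit so that each group specializes on one timescale of the target system.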
Blind extraction of guitar effects through blind system inversion and neural guitar effect modeling
Audio effects are a ubiquitous tool in music production due to the interesting ways in which they can shape the sound of music. Guitar effects, the subset of audio effects focusing on guitar signals, are commonly used in popular music to shape the guitar sound to fit specific genres or to create more variety within musical compositions. Automatic extraction of guitar effects and their parameter settings, with the aim of copying a target guitar sound, has been investigated previously: artificial neural networks first determine the effect class of a reference signal and subsequently its parameter settings. These approaches require a corresponding guitar effect implementation to be available, and for very close sound matching, additional research on effect implementations is generally necessary. In this work, we present a different approach that circumvents these issues. We propose blind extraction of guitar effects through a combination of blind system inversion and neural guitar effect modeling. In this way, an immediately usable, blind copy of the target guitar effect is obtained. The proposed method is tested with phaser, softclipping and slapback delay effects. Listening tests with eight subjects indicate excellent quality of the blind copies, i.e., little to no difference from the reference guitar effect.
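To make the system-inversion idea concrete for one of the tested effects: for a memoryless softclipper the inverse is simply the inverse of the static nonlinearity. Note this toy assumes the nonlinearity is known, whereas the paper's method is blind (the effect is unknown); the `tanh` drive value is an arbitrary assumption for the example.

```python
import numpy as np

# Toy (non-blind) illustration of system inversion for a softclipping effect.
def softclip(x, drive=3.0):
    return np.tanh(drive * x)

def invert_softclip(y, drive=3.0, eps=1e-6):
    # arctanh is only defined on (-1, 1); clip to avoid infinities at the edges
    y = np.clip(y, -1 + eps, 1 - eps)
    return np.arctanh(y) / drive

x = 0.3 * np.sin(np.linspace(0, 2 * np.pi, 1000))  # "dry" guitar-like signal
wet = softclip(x)                                   # effected (wet) signal
dry_est = invert_softclip(wet)                      # recovered dry signal
err = np.max(np.abs(dry_est - x))                   # recovery error
```

Blind inversion must estimate such an inverse without access to `softclip` itself, which is where the neural modeling stage comes in.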
Deep Learning for Black-Box Modeling of Audio Effects
Virtual analog modeling of audio effects consists of emulating the sound of a reference audio processor. This digital simulation is normally done by designing mathematical models of these systems, which is often difficult because it seeks to accurately model all components within the effect unit, usually including various nonlinearities and time-varying components. Most existing methods for audio effects modeling are either simplified or optimized for a very specific circuit or type of audio effect and cannot be efficiently translated to other types of audio effects. Recently, deep neural networks have been explored as black-box modeling strategies to solve this task, i.e., by using only input-output measurements. We analyse different state-of-the-art deep learning models based on convolutional and recurrent neural networks and feedforward WaveNet architectures, and we also introduce a new model based on a combination of the aforementioned models. Through objective perceptual-based metrics and subjective listening tests we explore the performance of these models when modeling various analog audio effects. We show virtual analog models of nonlinear effects, such as a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter; and nonlinear time-varying effects, such as the rotating horn and rotating woofer of a Leslie speaker cabinet.
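The black-box premise, fitting a model from input-output measurements alone with no circuit knowledge, can be sketched with something far simpler than a deep network. Here a memoryless tube-style nonlinearity (assumed to be `tanh`-shaped for the example) is fit by least squares on polynomial features; the deep models in the abstract replace this with learned, memory-capable architectures.

```python
import numpy as np

# Minimal black-box sketch: given only input/output recordings of an unknown
# static nonlinearity, fit a memoryless model from the measurements.
rng = np.random.default_rng(1)

x = rng.uniform(-1, 1, 5000)   # measured input signal
y = np.tanh(2.5 * x)           # measured output of the "device" under test

# Polynomial feature matrix: odd powers only, since the target is odd-symmetric
powers = [1, 3, 5, 7]
Phi = np.stack([x ** p for p in powers], axis=1)
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)

def model(u):
    """Fitted black-box emulation of the measured device."""
    return sum(c * u ** p for c, p in zip(coef, powers))

test_in = np.linspace(-0.9, 0.9, 100)
rmse = np.sqrt(np.mean((model(test_in) - np.tanh(2.5 * test_in)) ** 2))
```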
Deep Learning for Audio Effects Modeling
PhD Thesis. Audio effects modeling is the process of emulating an audio effect unit and seeks to recreate the sound, behaviour and main perceptual features of an analog reference device. Audio effect units are analog or digital signal processing systems that transform certain characteristics of the sound source. These transformations can be linear or nonlinear, time-invariant or time-varying, and with short-term or long-term memory. The most typical audio effect transformations are based on dynamics, such as compression; tone, such as distortion; frequency, such as equalization; and time, such as artificial reverberation or modulation-based audio effects.

The digital simulation of these audio processors is normally done by designing mathematical models of these systems. This is often difficult because it seeks to accurately model all components within the effect unit, which usually contains mechanical elements together with nonlinear and time-varying analog electronics. Most existing methods for audio effects modeling are either simplified or optimized for a very specific circuit or type of audio effect and cannot be efficiently translated to other types of audio effects.

This thesis explores deep learning architectures for music signal processing in the context of audio effects modeling. We investigate deep neural networks as black-box modeling strategies, i.e. using only input-output measurements, and propose different DSP-informed deep learning models to emulate each type of audio effect transformation.

Through objective perceptual-based metrics and subjective listening tests we explore the performance of these models when modeling various analog audio effects, and we analyze how the given tasks are accomplished and what the models are actually learning. We show virtual analog models of nonlinear effects, such as a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter; and electromechanical nonlinear time-varying effects, such as a Leslie speaker cabinet and plate and spring reverberators.

We report that the proposed deep learning architectures represent an improvement over the state of the art in black-box modeling of audio effects, and directions for future work are given.
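The effect families named in the abstract (dynamics, tone, frequency, time) can each be caricatured in a line or two of DSP. These are deliberately crude digital sketches for orientation only; real units involve memory, analog nonlinearities and electromechanical components.

```python
import numpy as np

# One-liner sketches of some effect families; purely illustrative.
fs = 8000
t = np.arange(fs) / fs
x = 0.8 * np.sin(2 * np.pi * 220 * t)               # dry test tone

dist = np.tanh(4 * x)                               # tone: static distortion
trem = x * (0.5 + 0.5 * np.sin(2 * np.pi * 5 * t))  # time-varying: 5 Hz tremolo
d = fs // 10                                        # 100 ms delay line
delay = np.zeros_like(x)
delay[d:] = 0.5 * x[:-d]
echo = x + delay                                    # time: slapback-style echo
```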
A general-purpose deep learning approach to model time-varying audio effects
Audio processors whose parameters are modified periodically over time are often referred to as time-varying or modulation-based audio effects. Most existing methods for modeling these types of effect units are optimized to a very specific circuit and cannot be efficiently generalized to other time-varying effects. Based on convolutional and recurrent neural networks, we propose a deep learning architecture for generic black-box modeling of audio processors with long-term memory. We explore the capabilities of deep neural networks to learn such long temporal dependencies and show the network modeling various linear and nonlinear, time-varying and time-invariant audio effects. In order to measure the performance of the model, we propose an objective metric based on the psychoacoustics of modulation frequency perception. We also analyze what the model is actually learning and how the given task is accomplished.
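A simplified stand-in for a modulation-frequency measurement of the kind the abstract alludes to. The paper's metric is psychoacoustically grounded; this sketch merely peak-picks the envelope spectrum of a tremolo-processed tone, and all rates and lengths here are assumptions for the example.

```python
import numpy as np

# Estimate the LFO rate of a time-varying (tremolo) effect from its output.
fs = 4000
t = np.arange(2 * fs) / fs                       # 2 s of signal
carrier = np.sin(2 * np.pi * 200 * t)
mod_rate = 6.0                                   # tremolo LFO rate in Hz
wet = carrier * (0.5 + 0.5 * np.sin(2 * np.pi * mod_rate * t))

env = np.abs(wet)                                # crude envelope (rectification)
env = env - env.mean()                           # remove DC before the FFT
spec = np.abs(np.fft.rfft(env))
freqs = np.fft.rfftfreq(len(env), 1 / fs)
mask = (freqs > 1) & (freqs < 20)                # search a plausible LFO range
est_rate = freqs[mask][np.argmax(spec[mask])]    # recovered modulation rate
```

Comparing such modulation measurements between the target device and the model output gives an objective handle on how well the time-varying behaviour is reproduced.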
Audio signal modelling using neural networks
Neural networks based on the WaveNet architecture and networks using recurrent layers are nowadays used both for human speech synthesis and for "black-box" modeling of acoustic signal processing systems (modulation effects, nonlinear distortion units, etc.). The task of the student is to summarize the existing knowledge on the use of neural networks for modeling acoustic signals. The student will then implement one of the neural network models in the Python programming language and use it to train and subsequently simulate an arbitrary effect or acoustic signal processing system. Within the semester project, the theoretical part of the thesis is to be written, an audio database for training the neural network is to be created, and one of the network structures for audio signal modeling is to be implemented.

In recent years, neural networks have been used more and more extensively across many fields of science. This academic work provides a brief introduction to neural network terminology and common practice and elaborates on several types of neural networks, with the main focus on DeepMind's WaveNet. Furthermore, it describes and compares the results of experimental implementations of WaveNet and other types of neural networks on audio signal "black-box" modeling tasks.
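The core building block of WaveNet-style models is the dilated causal convolution, which can be sketched compactly. This is an untrained toy (random weights, arbitrary kernel size and dilations), not a faithful WaveNet; a real model would also include gated activations, residual connections and training on input/output recordings of the target effect.

```python
import numpy as np

# Stack of dilated causal convolutions, the core of WaveNet-style models.
rng = np.random.default_rng(0)

def dilated_causal_conv(x, w, dilation):
    """y[n] = sum_i w[i] * x[n - i*dilation], with zero padding on the left."""
    k = len(w)
    pad = np.concatenate([np.zeros((k - 1) * dilation), x])
    return sum(w[i] * pad[(k - 1 - i) * dilation:(k - 1 - i) * dilation + len(x)]
               for i in range(k))

# Kernel size 3 per layer; dilations 1,2,4,8 give a receptive field of
# 1 + 2*(1+2+4+8) = 31 past samples.
layers = [(rng.normal(0, 0.5, 3), d) for d in (1, 2, 4, 8)]

def wavenet_stack(x):
    h = x
    for w, d in layers:
        h = np.tanh(dilated_causal_conv(h, w, d))
    return h

x = rng.normal(size=512)
y = wavenet_stack(x)
```

Causality matters here: each output sample depends only on current and past input samples, which is what lets such models run as real-time audio processors.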
- …