Differentiable Artificial Reverberation
Artificial reverberation (AR) models play a central role in various audio
applications. Therefore, estimating the AR model parameters (ARPs) of a target
reverberation is a crucial task. Although a few recent deep-learning-based
approaches have shown promising performance, their non-end-to-end training
scheme prevents them from fully exploiting the potential of deep neural
networks. This motivates the introduction of differentiable artificial
reverberation (DAR) models, which allow loss gradients to be back-propagated end-to-end.
However, implementing the AR models with their difference equations "as is" in
a deep-learning framework severely bottlenecks training speed when
executed on a parallel processor such as a GPU, due to their infinite impulse
response (IIR) components. We tackle this problem by replacing the IIR filters
with finite impulse response (FIR) approximations obtained with the frequency-sampling
method (FSM). Using the FSM, we implement three DAR models -- differentiable
Filtered Velvet Noise (FVN), Advanced Filtered Velvet Noise (AFVN), and
Feedback Delay Network (FDN). For each AR model, we train its ARP estimation
networks for the analysis-synthesis (RIR-to-ARP) and blind estimation
(reverberant-speech-to-ARP) tasks in an end-to-end manner with its DAR model
counterpart. Experimental results show that the proposed method achieves
consistent performance improvements over the non-end-to-end approaches in both
objective metrics and subjective listening tests.
Comment: Manuscript submitted to TASL
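The FIR-for-IIR substitution at the heart of the method can be sketched in a few lines of NumPy. This is a minimal illustration of the frequency-sampling idea, not the paper's implementation; the function name `fsm_fir` and the one-pole example are ours. Note that sampling the transfer function at `n_taps` points time-aliases the true impulse response, so `n_taps` must be long enough for the IIR tail to have decayed.

```python
import numpy as np

def fsm_fir(b, a, n_taps):
    """Frequency-sampling method: evaluate H(z) = B(z)/A(z) at n_taps
    uniformly spaced points on the unit circle, then inverse-DFT to get
    an n_taps-tap FIR approximation of the IIR filter."""
    H = np.fft.fft(b, n_taps) / np.fft.fft(a, n_taps)
    return np.fft.ifft(H).real

# One-pole IIR lowpass y[n] = x[n] + 0.5*y[n-1]; its true impulse
# response is 0.5**k, which has decayed well within 64 samples.
h = fsm_fir([1.0], [1.0, -0.5], 64)
```

Because the FIR taps are produced by FFT operations, the whole substitution stays differentiable and GPU-friendly, which is what unblocks end-to-end training.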
Frequency domain variant of Velvet noise and its application to acoustic measurements
We propose a new family of test signals for acoustic measurements such as
impulse response, nonlinearity, and the effects of background noise. The
proposed family addresses difficulties in the existing families, namely the Swept-Sine
(SS) and pseudo-random noise such as the maximum length sequence (MLS). The
proposed family uses the frequency domain variant of the Velvet noise (FVN) as
its building block. An FVN is an impulse response of an all-pass filter and
yields the unit impulse when convolved with the time-reversed version of
itself. In this respect, FVN is a member of the time-stretched pulse (TSP) in
the broadest sense. The high degree of freedom in designing an FVN opens a vast
range of applications in acoustic measurement. We introduce the following
applications and their specific procedures, among other possibilities.
a) Spectrum shaping adaptive to background noise. b) Simultaneous
measurement of impulse responses of multiple acoustic paths. c) Simultaneous
measurement of linear and nonlinear components of an acoustic path. d)
An automatic procedure for time-axis alignment of the source and the receiver when
they use independent clocks in acoustic impulse response measurement. We
implemented a reference measurement tool equipped with all these procedures.
The MATLAB source code and related materials are open-sourced and placed in a
GitHub repository.
Comment: 10 pages, 14 figures, APSIPA ASC 2019. arXiv admin note: text overlap
with arXiv:1806.0681
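The defining all-pass property stated above, that the signal convolved with its own time reversal yields the unit impulse, can be checked numerically. The sketch below builds a generic random-phase all-pass sequence rather than an actual velvet-noise construction, so it illustrates only the stated property, not the FVN design itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
# A unit-magnitude, random-phase spectrum with Hermitian symmetry gives
# a real all-pass impulse response (a stand-in for an FVN here).
phase = rng.uniform(-np.pi, np.pi, n // 2 - 1)
H = np.ones(n, dtype=complex)
H[1:n // 2] = np.exp(1j * phase)
H[n // 2 + 1:] = np.conj(H[1:n // 2][::-1])
h = np.fft.ifft(H).real

# Time reversal modulo n, then circular convolution via the DFT:
# |H| = 1 everywhere, so the product collapses to the unit impulse.
rev = np.roll(h[::-1], 1)
out = np.fft.ifft(np.fft.fft(h) * np.fft.fft(rev)).real
```

The same identity is what lets the measurement tool recover an impulse response by convolving the recorded response with the time-reversed test signal.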
Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials
We introduce a general framework for measuring acoustic properties such as the
linear time-invariant (LTI) response, signal-dependent time-invariant (SDTI)
component, and random and time-varying (RTV) component simultaneously using
structured periodic test signals. The framework also enables music pieces and
other sound materials to serve as test signals by "safeguarding" them with slight
deterministic "noise." Measurements using the swept-sine, MLS (Maximum Length
Sequence), and their variants are special cases of the proposed framework. We
implemented interactive and real-time measuring tools based on this framework
and made them open-source. Furthermore, we applied this framework to assess
pitch extractors objectively.
Comment: 8 pages, 17 figures, accepted for APSIPA ASC 202
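The core idea, separating the time-invariant response from the random component by repeating a periodic test signal, can be sketched as follows. The toy system (a short FIR filter plus additive noise) and all variable names are our own illustration, not the paper's framework: averaging the response periods estimates the time-invariant part, and the across-period variance estimates the random component's power.

```python
import numpy as np

rng = np.random.default_rng(1)
period, reps = 256, 32
x = rng.standard_normal(period)             # one period of the test signal
x_rep = np.tile(x, reps)

# Toy system: a short FIR filter (the LTI part) plus additive noise
# (standing in for the random/time-varying part).
h_true = np.array([1.0, 0.5, 0.25])
y = np.convolve(x_rep, h_true)[:period * reps]
y += 0.05 * rng.standard_normal(y.size)

periods = y.reshape(reps, period)
y_lti = periods[1:].mean(axis=0)            # averaging removes the noise
rtv_power = periods[1:].var(axis=0).mean()  # leftover = random power

# In steady state each period equals the circular convolution of x and h:
ref = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_true, period)).real
```

The first period is skipped because it still contains the filter's transient; every later period carries the identical deterministic response, which is what makes the average/variance split work.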
A pilot study on discriminative power of features of superficial venous pattern in the hand
The goal of the project is to develop an automatic way to identify and represent the superficial vasculature of the back of the hand and to investigate its discriminative power as a biometric feature.
A prototype of a system that extracts the superficial venous pattern from infrared images of the back of the hand is described. Enhancement algorithms are used to compensate for the lack of contrast in the infrared images. To trace the veins, a vessel-tracking technique is applied, producing binary masks of the superficial venous tree. Subsequently, a method to estimate the calibre and length of the blood vessels and the locations and angles of vessel junctions is presented. The discriminative power of these features is studied, both independently and jointly, by considering two feature vectors.
Pattern matching of two vasculature maps is performed to investigate the uniqueness of the vessel network.
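Once a binary mask and a centreline are available, the mean-calibre estimate reduces to simple pixel counting. The toy example below uses a hand-drawn straight vessel with a known centreline; the actual pipeline derives both from image enhancement and vessel tracking, which we do not reproduce here.

```python
import numpy as np

# Synthetic binary mask: a straight "vessel" 3 px wide and 40 px long.
mask = np.zeros((9, 40), dtype=bool)
mask[3:6, :] = True

# With the centreline (skeleton) known, mean calibre follows from
# area / centreline length, the same ratio used on tracked vessels.
skeleton = np.zeros_like(mask)
skeleton[4, :] = True               # centreline of the toy vessel
length = skeleton.sum()             # pixels along the centreline
mean_calibre = mask.sum() / length  # area / length, in pixels
```

Junction locations and angles would be read off the skeleton in the same spirit, by locating pixels with more than two skeleton neighbours.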
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010
Music Production Behaviour Modelling
The new millennium has seen an explosion of computational approaches to the study of music production, due in part to the decreasing cost of computation and the proliferation of digital music production techniques. The rise of digital recording equipment, MIDI, digital audio workstations (DAWs), and software plugins for audio effects led to the digital capture of various processes in music production. This discretization of traditionally analogue methods allowed for the development of intelligent music production, which uses machine learning to numerically characterize and automate portions of the music production process. One algorithm from the field, referred to as "reverse engineering a multitrack mix", can recover the audio effects processing used to transform a multitrack recording into a mixdown in the absence of information about how the mixdown was achieved. This thesis improves on this method of reverse engineering a mix by leveraging recent advancements in machine learning for audio. Using the differentiable digital signal processing paradigm, greybox modules for gain, panning, equalisation, artificial reverberation, memoryless waveshaping distortion, and dynamic range compression are presented. These modules are then connected in a mixing chain and are optimized to learn the effects used in a given mixdown. Both objective and perceptual metrics are presented to measure the performance of these various modules in isolation and within a full mixing chain. Ultimately, a fully differentiable mixing chain is presented that outperforms previously proposed methods to reverse engineer a mix. Directions for future work are proposed to improve the characterization of multitrack mixing behaviours.
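The reverse-engineering idea can be conveyed with the simplest differentiable module, a single gain, optimized to match a given mixdown. This is a sketch of the paradigm only: the gradient is written out by hand, and the thesis optimizes full chains of gain, panning, EQ, reverberation, distortion, and compression modules with automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)   # one "multitrack" stem
mix = 0.7 * x                   # mixdown made with an unknown gain of 0.7

# Differentiable gain module y = g*x with loss L = mean((g*x - mix)**2).
# Its gradient is dL/dg = 2*mean((g*x - mix)*x); descend on it.
g, lr = 1.0, 0.1
for _ in range(200):
    grad = 2.0 * np.mean((g * x - mix) * x)
    g -= lr * grad
```

Because every module in the chain exposes such gradients, the whole mixdown-matching objective can be optimized end-to-end rather than module by module.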
Deep Learning for Audio Effects Modeling
PhD Thesis.
Audio effects modeling is the process of emulating an audio effect unit, seeking
to recreate the sound, behaviour, and main perceptual features of an analog reference
device. Audio effect units are analog or digital signal processing systems
that transform certain characteristics of the sound source. These transformations
can be linear or nonlinear, time-invariant or time-varying, and with short-term or
long-term memory. The most typical audio effect transformations are based on dynamics,
such as compression; tone, such as distortion; frequency, such as equalization;
and time, such as artificial reverberation or modulation-based audio effects.
The digital simulation of these audio processors is normally done by designing
mathematical models of these systems. This is often difficult because it seeks to
accurately model all components within the effect unit, which usually contains
mechanical elements together with nonlinear and time-varying analog electronics.
Most existing methods for audio effects modeling are either simplified or optimized
to a very specific circuit or type of audio effect and cannot be efficiently
translated to other types of audio effects.
This thesis aims to explore deep learning architectures for music signal processing
in the context of audio effects modeling. We investigate deep neural networks
as black-box modeling strategies to solve this task, i.e. by using only input-output
measurements. We propose different DSP-informed deep learning models to emulate
each type of audio effect transformation.
Through objective perceptual-based metrics and subjective listening tests, we
explore the performance of these models when modeling various analog audio effects.
Also, we analyze how the given tasks are accomplished and what the models
are actually learning. We show virtual analog models of nonlinear effects, such as
a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter;
and electromechanical nonlinear time-varying effects, such as a Leslie speaker
cabinet and plate and spring reverberators.
We report that the proposed deep learning architectures improve on the
state of the art in black-box modeling of audio effects, and directions for
future work are given.
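A minimal black-box experiment in the same spirit, using only input-output measurements: fit a memoryless odd-polynomial waveshaper to samples of an "unknown" nonlinearity. The thesis uses DSP-informed deep networks; the least-squares polynomial below is a deliberately simple stand-in to show that no circuit knowledge is needed, only measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 2000)  # measured inputs to the "unknown" unit
y = np.tanh(2.0 * x)              # measured outputs (a tanh-like clipper)

# Black-box model: odd polynomial waveshaper c1*x + c3*x**3 + c5*x**5,
# fitted by least squares from the input-output pairs alone.
A = np.stack([x, x ** 3, x ** 5], axis=1)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
rms_err = np.sqrt(np.mean((A @ coef - y) ** 2))
```

A memoryless fit like this fails for effects with memory (limiters, Leslie cabinets, reverberators), which is exactly why the thesis turns to recurrent and convolutional architectures for those cases.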
Proceedings of the Scientific-Practical Conference "Research and Development - 2016"
talent management; sensor arrays; automatic speech recognition; dry separation technology; oil production; oil waste; laser technology