Acoustic event detection for multiple overlapping similar sources
Many current paradigms for acoustic event detection (AED) are not adapted to
the organic variability of natural sounds, and/or they assume a limit on the
number of simultaneous sources: often only one source, or one source of each
type, may be active. These aspects are highly undesirable for applications such
as bird population monitoring. We introduce a simple method modelling the
onsets, durations and offsets of acoustic events to avoid intrinsic limits on
polyphony or on inter-event temporal patterns. We evaluate the method in a case
study with over 3000 zebra finch calls. In comparison against an HMM-based
method, we find it more accurate at recovering acoustic events and more robust
for estimating calling rates.
Comment: Accepted for WASPAA 201
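The abstract's key point, that events are modelled by their onsets, durations and offsets with no cap on polyphony, can be illustrated with a small sketch. The event tuples, function names and the toy data below are hypothetical, not the paper's actual representation:

```python
import numpy as np

# Hypothetical event list: (onset_s, duration_s, label). Illustrative only.
events = [(0.10, 0.25, "call"), (0.20, 0.30, "call"), (1.50, 0.20, "call")]

def calling_rate(events, recording_len_s):
    """Events per second over a recording of the given length."""
    return len(events) / recording_len_s

def max_polyphony(events):
    """Maximum number of simultaneously active events.

    Nothing here imposes an intrinsic limit on how many events may
    overlap: we simply sweep over onset (+1) and offset (-1) points.
    """
    points = []
    for onset, dur, _ in events:
        points.append((onset, +1))          # event starts
        points.append((onset + dur, -1))    # event ends
    points.sort()
    active = peak = 0
    for _, delta in points:
        active += delta
        peak = max(peak, active)
    return peak
```

With the toy data above, the first two calls overlap, so the sweep reports a polyphony of two; any number of overlapping sources would be handled the same way.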
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
In this paper, we propose a novel four-stage data augmentation approach to
ResNet-Conformer based acoustic modeling for sound event localization and
detection (SELD). First, we explore two spatial augmentation techniques, namely
audio channel swapping (ACS) and multi-channel simulation (MCS), to deal with
data sparsity in SELD. ACS and MCS focus on augmenting the limited training
data by expanding direction of arrival (DOA) representations such that the
acoustic models trained with the augmented data are robust to localization
variations of acoustic sources. Next, time-domain mixing (TDM) and
time-frequency masking (TFM) are also investigated to deal with overlapping
sound events and data diversity. Finally, ACS, MCS, TDM and TFM are combined in
a step-by-step manner to form an effective four-stage data augmentation scheme.
Tested on the Detection and Classification of Acoustic Scenes and Events
(DCASE) 2020 data sets, our proposed augmentation approach greatly improves the
system performance, ranking our submitted system in the first place in the SELD
task of DCASE 2020 Challenge. Furthermore, we employ a ResNet-Conformer
architecture to model both global and local context dependencies of an audio
sequence to yield further gains over those architectures used in the DCASE 2020
SELD evaluations.
Comment: 12 pages, 8 figures
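Two of the augmentations named above are simple to sketch in isolation. The snippet below is a minimal illustration of channel swapping (ACS-style) and time-domain mixing (TDM-style) on toy multichannel arrays; the function names, shapes and random data are assumptions for demonstration, not the paper's pipeline, and a real ACS step must also transform the DOA labels to match the new channel order:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 4-channel clips, shape (channels, samples). Illustrative only.
clip_a = rng.standard_normal((4, 16000))
clip_b = rng.standard_normal((4, 16000))

def channel_swap(x, perm):
    """ACS-style augmentation: permute the audio channels.

    In SELD training the DOA targets must be remapped consistently
    with `perm`, which is omitted here for brevity.
    """
    return x[list(perm), :]

def time_domain_mix(x, y, alpha=0.5):
    """TDM-style augmentation: mix two clips to simulate overlapping
    sound events. The union of both clips' event labels would be kept."""
    return alpha * x + (1.0 - alpha) * y
```

Mixing produces training examples with overlapping events, while channel swapping multiplies the effective spatial coverage of a limited dataset.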
Augmentation Methods on Monophonic Audio for Instrument Classification in Polyphonic Music
Instrument classification is one of the fields in Music Information Retrieval
(MIR) that has attracted a lot of research interest. However, the majority of
that research deals with monophonic music, while efforts on polyphonic material
mainly focus on predominant instrument recognition. In this paper, we propose
an approach for instrument classification in polyphonic music using purely
monophonic data, which involves performing data augmentation by mixing different
audio segments. A variety of data augmentation techniques focusing on different
sonic aspects, such as overlaying audio segments of the same genre, as well as
pitch and tempo-based synchronization, are explored. We utilize Convolutional
Neural Networks for the classification task, comparing shallow to deep network
architectures. We further investigate the usage of a combination of the above
classifiers, each trained on a single augmented dataset. An ensemble of
VGG-like classifiers, trained on non-augmented, pitch-synchronized,
tempo-synchronized and genre-similar excerpts, respectively, yields the best
results, achieving slightly above 80% in terms of label ranking average
precision (LRAP) on the IRMAS test set.
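The core augmentation idea, overlaying monophonic excerpts to create pseudo-polyphonic training examples, can be sketched as follows. The function, gain parameter and labels below are hypothetical and stand in for whichever mixing, pitch- or tempo-synchronization step is applied:

```python
import numpy as np

def mix_segments(seg_a, label_a, seg_b, label_b, gain_b=0.8):
    """Overlay two monophonic excerpts into one pseudo-polyphonic
    example; the training target becomes the union of both labels."""
    n = min(len(seg_a), len(seg_b))          # trim to the shorter segment
    mix = seg_a[:n] + gain_b * seg_b[:n]
    peak = np.max(np.abs(mix))
    if peak > 1.0:                           # normalize to avoid clipping
        mix = mix / peak
    return mix, sorted({label_a, label_b})
```

The same pattern extends to the variants described above: restricting the pair to the same genre, pitch-shifting one excerpt before mixing, or time-stretching to align tempi.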