Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning

Author: Komatsu Tatsuya
Kondo Reishi
Narisetty Chaitanya
Publication date: 04/04/2019

This paper proposes an effective way to model sound event spectra that have a hidden data-size imbalance, for improved Acoustic Event Detection (AED). The proposed method models each event as an aggregated representation of a few latent factors, whereas conventional approaches try to find acoustic elements directly from the event spectra. All latent factors across all events are assigned comparable importance and complexity to overcome the hidden imbalance of data sizes in event spectra. To extract the latent factors in each event, the proposed method employs clustering, applies non-negative matrix factorization (NMF) to each latent factor, and learns its acoustic elements as a sub-dictionary. Separate sub-dictionary learning effectively models the acoustic elements with limited data sizes and avoids over-fitting due to hidden imbalances in training data. On the polyphonic sound event detection task of the DCASE 2013 challenge, an AED system based on the proposed modelling achieves a detection F-measure of 46.5%, a significant improvement of more than 19% over existing state-of-the-art methods.
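The cluster-then-factorize idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm or NMF cost is used, so k-means with farthest-point initialisation and Euclidean multiplicative-update NMF are assumptions here, and all function names (`kmeans_labels`, `nmf`, `learn_sub_dictionaries`) are hypothetical.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans_labels(frames, k, n_iter=20):
    """Assumed clustering step: k-means with farthest-point initialisation."""
    centers = [list(frames[0])]
    while len(centers) < k:
        # Next centre: the frame farthest from all centres chosen so far.
        far = max(frames, key=lambda f: min(sq_dist(f, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(frames)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(f, centers[c])) for f in frames]
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def nmf(V, rank, n_iter=100, seed=0, eps=1e-12):
    """Euclidean NMF, V (n_freq x n_frames) ~ B H, via multiplicative updates."""
    rng = random.Random(seed)
    B = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(V))]
    H = [[rng.random() + 0.1 for _ in range(len(V[0]))] for _ in range(rank)]
    for _ in range(n_iter):
        Bt = transpose(B)
        num, den = matmul(Bt, V), matmul(matmul(Bt, B), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(B, matmul(H, Ht))
        B = [[b * n / (d + eps) for b, n, d in zip(rb, rn, rd)]
             for rb, rn, rd in zip(B, num, den)]
    return B, H


def learn_sub_dictionaries(frames, n_clusters, atoms_per_cluster):
    """Cluster the frames of one event, then learn one sub-dictionary per cluster.

    Every cluster gets the same number of atoms regardless of how many
    frames it holds, so small clusters are not crowded out by large ones.
    Returns (atoms, labels); each atom is a non-negative spectrum.
    """
    labels = kmeans_labels(frames, n_clusters)
    atoms = []
    for c in range(n_clusters):
        members = [f for f, lab in zip(frames, labels) if lab == c]
        if not members:
            continue
        V = transpose(members)  # n_freq x n_member_frames
        B, _ = nmf(V, atoms_per_cluster)
        atoms.extend(transpose(B))
    return atoms, labels


# Toy event with a hidden imbalance: 20 low-band frames, 3 high-band frames.
frames = [[1.0 + 0.01 * i, 0.5, 0.0, 0.0] for i in range(20)]
frames += [[0.0, 0.0, 0.8, 1.0 + 0.01 * i] for i in range(3)]
atoms, labels = learn_sub_dictionaries(frames, n_clusters=2, atoms_per_cluster=1)
print(len(atoms), "atoms learned")
```

In this toy example the three high-band frames receive their own atom even though they are outnumbered almost seven to one, which is the kind of balancing the abstract describes.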

arXiv.org e-Print Archive

Crossref

Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience.

Author: Bahle Andrew H
Denisenko Natalia I
Fee Michale S
Goldman Mark S
Gu Shijie
Mackevicius Emily L
Williams Alex H
Publication venue: eScholarship, University of California
Publication date: 01/02/2019

Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox, called seqNMF, with new methods for extracting informative, non-redundant sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral datasets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.

DSpace@MIT

eScholarship - University of California

Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

Author: Heittola Toni
Huttunen Heikki
Parascandolo Giambattista
Virtanen Tuomas
Çakır Emre
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/02/2017

Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNNs) are able to extract higher-level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer-term temporal context in audio signals. CNNs and RNNs as classifiers have recently shown improved performance over established methods in various sound recognition tasks. We combine these two approaches in a Convolutional Recurrent Neural Network (CRNN) and apply it to a polyphonic sound event detection task. We compare the performance of the proposed CRNN method with CNN, RNN, and other established methods, and observe a considerable improvement for four different datasets consisting of everyday sound events.

Comment: Accepted for IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Sound Scene and Event Analysis.

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

Polyphonic Sound Event Detection by using Capsule Neural Networks

Author: Gabrielli Leonardo
Principi Emanuele
Squartini Stefano
Vesperini Fabio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/01/2019

Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal, such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has recently been introduced in the image processing field with the intent to overcome some of the known limitations of CNNs, specifically their limited robustness to affine transformations (i.e., perspective, size, orientation) and their difficulty in detecting overlapped images. This motivated the authors to employ CapsNets for the polyphonic SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties for each individual sound event. Capsule units are connected through so-called "dynamic routing", which encourages learning part-whole relationships and improves the detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing that the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results with respect to state-of-the-art algorithms.

arXiv.org e-Print Archive

IRIS Università Politecnica delle Marche

Adaptive Noise Reduction for Sound Event Detection Using Subband-Weighted NMF

Author: Emmanouil Benetos
Févotte
Qing Zhou
Virtanen
Zuren Feng
Publication venue: 'MDPI AG'
Publication date: 12/07/2019

Sound event detection in real-world environments suffers from the interference of non-stationary and time-varying noise. This paper presents an adaptive noise reduction method for sound event detection based on non-negative matrix factorization (NMF). First, a noise dictionary is learned from the noisy input signal using robust NMF, which supports adaptation to noise variations. The estimated noise dictionary is combined with a pre-trained event dictionary in a supervised source separation framework. Second, to improve the separation quality, we extend the basic NMF model to a weighted form, with the aim of varying the relative importance of the different components when separating a target sound event from noise. With properly designed weights, the separation process is forced to rely more on the dominant event components, while the noise is greatly suppressed. The proposed method is evaluated on a dataset from the rare sound event detection task of the DCASE 2017 challenge, and achieves results comparable to the top-ranking system based on convolutional recurrent neural networks (CRNNs). The proposed weighted NMF method shows excellent noise reduction ability and improves the F-score by 5% compared to the unweighted approach.
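As a rough illustration of what extending NMF to a weighted form can look like, the sketch below minimises a weighted Euclidean cost, sum over (f, t) of W[f][t] * (V[f][t] - (BH)[f][t])^2, with multiplicative updates. The abstract does not give the paper's cost function or weight design, so the Euclidean cost, the one-weight-per-frequency-row scheme in `subband_weights`, and all function names are assumptions made for this example only.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def hadamard(A, B):
    """Element-wise product of two equally sized matrices."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def subband_weights(row_weights, n_frames):
    """Assumed weighting scheme: one weight per frequency row, shared by all frames."""
    return [[w] * n_frames for w in row_weights]


def weighted_cost(V, B, H, W):
    R = matmul(B, H)
    return sum(w * (v - r) ** 2
               for rv, rr, rw in zip(V, R, W)
               for v, r, w in zip(rv, rr, rw))


def weighted_nmf(V, W, rank, n_iter=200, seed=0, eps=1e-12):
    """Weighted Euclidean NMF via multiplicative updates.

    H <- H * (B^T (W*V)) / (B^T (W*(BH)))
    B <- B * ((W*V) H^T) / ((W*(BH)) H^T)
    where * inside the parentheses is the element-wise product.
    """
    rng = random.Random(seed)
    B = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(V))]
    H = [[rng.random() + 0.1 for _ in range(len(V[0]))] for _ in range(rank)]
    WV = hadamard(W, V)
    for _ in range(n_iter):
        Bt = transpose(B)
        num = matmul(Bt, WV)
        den = matmul(Bt, hadamard(W, matmul(B, H)))
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        Ht = transpose(H)
        num = matmul(WV, Ht)
        den = matmul(hadamard(W, matmul(B, H)), Ht)
        B = [[b * n / (d + eps) for b, n, d in zip(rb, rn, rd)]
             for rb, rn, rd in zip(B, num, den)]
    return B, H


# Toy spectrogram (4 frequency rows x 4 frames); the two upper rows are
# down-weighted, as if they were dominated by noise.
V = [[1.0, 0.9, 0.0, 0.1],
     [0.8, 1.0, 0.1, 0.0],
     [0.0, 0.1, 1.0, 0.9],
     [0.1, 0.0, 0.7, 1.0]]
W = subband_weights([1.0, 1.0, 0.2, 0.2], n_frames=4)
B, H = weighted_nmf(V, W, rank=2)
print("weighted reconstruction cost:", weighted_cost(V, B, H, W))
```

Setting every weight to 1 recovers ordinary Euclidean NMF, so the same routine can serve as the unweighted baseline in a comparison.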

City Research Online

Crossref

Queen Mary Research Online