Hearing the shape of a room
PMCID: PMC3725052. The final published version of this article can be found here: www.pnas.org/cgi/doi/10.1073/pnas.130993211
A database and challenge for acoustic scene classification and event detection
Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders
Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used for training. In this work, we introduce a novel multi-channel, multi-resolution convolutional auto-encoder neural network that works on raw time-domain signals to determine appropriate multi-resolution features for separating the singing voice from stereo music. Our experimental results show that the proposed method can achieve multi-channel audio source separation without the need for hand-crafted features or any pre- or post-processing.
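A minimal PyTorch sketch of the idea described in this abstract: an auto-encoder whose encoder applies several 1-D convolutions with different kernel sizes (resolutions) directly to raw stereo audio. All channel counts, kernel lengths, and the single-layer decoder are illustrative assumptions, not the authors' actual configuration.

import torch
import torch.nn as nn

class MultiResConvAE(nn.Module):
    def __init__(self, channels=2, feats_per_branch=32, kernel_sizes=(16, 64, 256)):
        super().__init__()
        # One encoder branch per resolution: small kernels capture local detail,
        # large kernels capture longer-range temporal structure.
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, feats_per_branch, k, stride=1, padding=k // 2)
            for k in kernel_sizes
        ])
        total = feats_per_branch * len(kernel_sizes)
        # Decoder maps the concatenated multi-resolution features back to raw audio.
        self.decoder = nn.Conv1d(total, channels, kernel_size=1)

    def forward(self, mixture):                 # mixture: (batch, 2, time)
        feats = [torch.relu(b(mixture)) for b in self.branches]
        # Trim to a common length in case padding leaves branches off by a sample.
        t = min(f.shape[-1] for f in feats)
        feats = torch.cat([f[..., :t] for f in feats], dim=1)
        return self.decoder(feats)              # estimate of the target source (e.g. vocals)

# Example: estimate the target source from one second of 44.1 kHz stereo audio.
estimate = MultiResConvAE()(torch.randn(1, 2, 44100))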
Automatic Environmental Sound Recognition: Performance versus Computational Cost
In the context of the Internet of Things (IoT), sound sensing applications are required to run on embedded platforms where notions of product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this article seeks to determine which AESR algorithm can make the most of a limited amount of computing power by comparing sound classification performance as a function of computational cost. Results suggest that Deep Neural Networks yield the best ratio of sound classification accuracy to computational cost across a range of computational budgets, while Gaussian Mixture Models offer reasonable accuracy at a consistently small cost, and Support Vector Machines stand between the two in terms of the compromise between accuracy and computational cost.
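A back-of-the-envelope sketch of the "accuracy per unit of compute" comparison described above: count the multiply-accumulate operations (MACs) needed to score one feature frame with a diagonal-covariance GMM and with a small fully connected DNN. The model sizes below are illustrative assumptions, not the configurations evaluated in the article.

def gmm_macs_per_frame(n_components, n_dims):
    # Diagonal-covariance GMM: roughly 2 MACs per dimension per component
    # (squared difference times inverse variance), plus one add per component.
    return n_components * (2 * n_dims + 1)

def dnn_macs_per_frame(layer_sizes):
    # Fully connected DNN: one MAC per weight, layer by layer.
    return sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))

if __name__ == "__main__":
    print("GMM, 32 components x 39 dims:", gmm_macs_per_frame(32, 39), "MACs/frame")
    print("DNN, layers 39-64-64-10     :", dnn_macs_per_frame([39, 64, 64, 10]), "MACs/frame")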
Detection and classification of acoustic scenes and events: an IEEE AASP challenge
Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation
In deep neural networks with convolutional layers, each layer typically has a fixed-size, single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with a small RF capture local details at high resolution. In this work, we introduce novel deep multi-resolution fully convolutional neural networks (MR-FCNN), where each layer has a different RF size so as to extract multi-resolution features that capture both global information and local details from its input features. The proposed MR-FCNN is applied to separate a target audio source from a mixture of many audio sources. Experimental results show that using MR-FCNN improves performance compared to feedforward deep neural networks (DNNs) and single-resolution deep fully convolutional neural networks (FCNNs) on the audio source separation problem.
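A small sketch illustrating the receptive-field point made in this abstract: stacking convolutional layers grows the RF, and the kernel size chosen at each layer controls how quickly, so a stack of layers with different kernel sizes spans both local and global context. The kernel and stride values are illustrative assumptions.

def receptive_field(layers):
    # layers: list of (kernel_size, stride); returns the RF in input samples.
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Single-resolution stack: every layer uses the same small kernel.
print(receptive_field([(5, 1)] * 4))                            # -> 17 samples (local detail only)
# Multi-resolution stack: each layer uses a different kernel size.
print(receptive_field([(5, 1), (25, 1), (125, 1), (625, 1)]))   # -> 777 samples (global context)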
Detection and Classification of Acoustic Scenes and Events
For intelligent systems to make best use of the audio modality, it is important that they can recognize not just speech and music, which have been researched as specific tasks, but also general sounds in everyday environments. To stimulate research in this field we conducted a public research challenge: the IEEE Audio and Acoustic Signal Processing Technical Committee challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). In this paper, we report on the state of the art in automatically classifying audio scenes, and automatically detecting and classifying audio events. We survey prior work as well as the state of the art represented by the submissions to the challenge from various research groups. We also provide detail on the organization of the challenge, so that our experience as challenge hosts may be useful to those organizing challenges in similar domains. We created new audio datasets and baseline systems for the challenge; these, as well as some submitted systems, are publicly available under open licenses, to serve as benchmarks for further research in general-purpose machine listening.
Detection and Classification of Acoustic Scenes and Events: An IEEE AASP Challenge
This work has been partly supported by EPSRC Leadership Fellowship EP/G007144/1, by EPSRC Grant EP/H043101/1 for QMUL, and by ANR-11-JS03-005-01 for IRCAM. D.G. is funded by a Queen Mary University of London CDTA Research Studentship. E.B. is supported by a City University London Research Fellowship.
Big Chord Data Extraction and Mining
Harmonic progression is one of the cornerstones of tonal music composition and is thereby essential to many musical styles and traditions. Previous studies have shown that musical genres and composers could be discriminated based on chord progressions modeled as chord n-grams. These studies were however conducted on small-scale datasets and using symbolic music transcriptions.
In this work, we apply pattern mining techniques to over 200,000 chord progression sequences out of 1,000,000 extracted from the I Like Music (ILM) commercial music audio collection. The ILM collection spans 37 musical genres and includes pieces released between 1907 and 2013. We developed a single program multiple data parallel computing approach whereby audio feature extraction tasks are split up and run simultaneously on multiple cores. An audio-based chord recognition model (Vamp plugin Chordino) was used to extract the chord progressions from the ILM set. To keep feature sets lightweight, the chord data were stored using a compact binary format. We used the CM-SPADE algorithm, which performs a vertical mining of sequential patterns using co-occurrence information, and which is fast and efficient enough to be applied to big data collections like the ILM set. In order to derive key-independent frequent patterns, transitions between chords are modeled by changes of qualities (e.g. major, minor, etc.) and root keys (e.g. fourth, fifth, etc.). The resulting key-independent chord progression patterns vary in length (from 2 to 16) and frequency (from 2 to 19,820) across genres. As illustrated by graphs generated to represent frequent 4-chord progressions, some patterns like circle-of-fifths movements are well represented in most genres but in varying degrees.
These large-scale results offer the opportunity to uncover similarities and discrepancies between sets of musical pieces and therefore to build classifiers for search and recommendation. They also support the empirical testing of music theory. It is however more difficult to derive new hypotheses from such a dataset due to its size. This can be addressed by using pattern detection algorithms or suitable visualisations, which we present in a companion study.
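A minimal sketch of the key-independent encoding described above: a transition between two chords is represented by the change of root (as an interval in semitones) and the pair of chord qualities, so that e.g. C:maj to G:maj and D:maj to A:maj map to the same pattern. The chord label syntax and helper names are illustrative assumptions, not the exact Chordino output format.

PITCH_CLASSES = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
                 "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10,
                 "Bb": 10, "B": 11}

def parse_chord(label):
    # "G:min" -> (7, "min"); a bare root such as "G" defaults to major.
    root, _, quality = label.partition(":")
    return PITCH_CLASSES[root], quality or "maj"

def key_independent_transitions(chords):
    # Encode each consecutive pair as (root interval in semitones, quality, next quality).
    out = []
    for a, b in zip(chords, chords[1:]):
        (ra, qa), (rb, qb) = parse_chord(a), parse_chord(b)
        out.append(((rb - ra) % 12, qa, qb))
    return out

# Circle-of-fifths movement: each step rises a fourth (5 semitones), regardless of key.
print(key_independent_transitions(["C:maj", "F:maj", "Bb:maj", "Eb:maj"]))
# -> [(5, 'maj', 'maj'), (5, 'maj', 'maj'), (5, 'maj', 'maj')]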