Learning sound representations using trainable COPE feature extractors

Author: Petkov Nicolai
Strisciuglio Nicola
Vento Mario
Publication venue: 'Elsevier BV'
Publication date: 22/03/2019

Sound analysis research has mainly been focused on speech and music processing. The deployed methodologies are not suitable for analysis of sounds with varying background noise, in many cases with very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined using a single prototype sound pattern in an automatic configuration process, which is a type of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns. Then we take their responses to build feature vectors that we use in combination with a classifier to detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than the ones obtained by other existing approaches. The COPE feature extractors have high robustness to variations of SNR. Real-time performance is achieved even when the values of a large number of features are computed. Comment: Accepted for publication in Pattern Recognition.

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Big Chord Data Extraction and Mining

Author: Barthet M.
Dykes J.
Kachkaev A.
Plumbley M. D.
Weyde T.
Wolff D.
Publication date: 01/01/2014

Harmonic progression is one of the cornerstones of tonal music composition and is thereby essential to many musical styles and traditions. Previous studies have shown that musical genres and composers could be discriminated based on chord progressions modeled as chord n-grams. These studies were, however, conducted on small-scale datasets and using symbolic music transcriptions. In this work, we apply pattern mining techniques to over 200,000 chord progression sequences out of 1,000,000 extracted from the I Like Music (ILM) commercial music audio collection. The ILM collection spans 37 musical genres and includes pieces released between 1907 and 2013. We developed a single-program, multiple-data parallel computing approach whereby audio feature extraction tasks are split up and run simultaneously on multiple cores. An audio-based chord recognition model (Vamp plugin Chordino) was used to extract the chord progressions from the ILM set. To keep the feature sets lightweight, the chord data were stored using a compact binary format. We used the CM-SPADE algorithm, which performs a vertical mining of sequential patterns using co-occurrence information, and which is fast and efficient enough to be applied to big data collections like the ILM set. In order to derive key-independent frequent patterns, the transitions between chords are modeled by changes of qualities (e.g. major, minor, etc.) and root keys (e.g. fourth, fifth, etc.). The resulting key-independent chord progression patterns vary in length (from 2 to 16) and frequency (from 2 to 19,820) across genres. As illustrated by graphs generated to represent frequent 4-chord progressions, some patterns like circle-of-fifths movements are well represented in most genres but in varying degrees. These large-scale results offer the opportunity to uncover similarities and discrepancies between sets of musical pieces and therefore to build classifiers for search and recommendation. They also support the empirical testing of music theory. It is, however, more difficult to derive new hypotheses from such a dataset due to its size. This can be addressed by using pattern detection algorithms or suitable visualisation, which we present in a companion study.
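
The key-independent encoding described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `root:quality` label syntax, the function names, and the use of semitone intervals are assumptions made for illustration, and the plain n-gram counting below stands in for the CM-SPADE sequential pattern miner, which is not reproduced here.

```python
from collections import Counter

# Pitch classes for root names (sharps and flats); an assumption of this sketch.
PITCH_CLASS = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}


def parse_chord(label):
    """Split a label such as 'A:min' into (root pitch class, quality)."""
    root, _, quality = label.partition(":")
    # A label without an explicit quality is treated as major (assumption).
    return PITCH_CLASS[root], quality or "maj"


def transitions(chords):
    """Encode each chord change as (semitones upward, quality from, quality to).

    Only the root interval and the qualities are kept, so the same
    progression yields the same encoding in every key.
    """
    parsed = [parse_chord(c) for c in chords]
    return [
        ((r2 - r1) % 12, q1, q2)
        for (r1, q1), (r2, q2) in zip(parsed, parsed[1:])
    ]


def count_ngrams(sequences, n):
    """Count runs of n consecutive transitions across all chord sequences."""
    counts = Counter()
    for seq in sequences:
        steps = transitions(seq)
        for i in range(len(steps) - n + 1):
            counts[tuple(steps[i:i + n])] += 1
    return counts
```

Under these assumptions, `transitions(["C:maj", "F:maj", "G:maj", "C:maj"])` and `transitions(["D:maj", "G:maj", "A:maj", "D:maj"])` give the same list, `[(5, "maj", "maj"), (2, "maj", "maj"), (5, "maj", "maj")]`, so a 4-chord progression is counted once per occurrence regardless of key.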

City Research Online

Surrey Research Insight

Representing chord sequences in OWL

Author: Conklin D.
Weyde T.
Wissmann Jens
Publication date: 01/01/2010

Chord symbols and progressions are a common way to describe musical harmony. In this paper we present SEQ, a pattern representation using the Web Ontology Language OWL DL and its application to modelling chord sequences. SEQ provides a logical representation of order information, which is not available directly in OWL DL, together with an intuitive notation. It therefore allows the use of OWL reasoners for tasks such as classification of sequences by patterns and determining subsumption relationships between the patterns. The SEQ representation is used to express distinctive patterns obtained using data mining of multiple viewpoints of chord sequences.

City Research Online

Improving music genre classification using automatically induced harmony rules

Author: Anglade A.
Benetos E.
Dixon S.
Mauch M.
Publication venue: 'Informa UK Limited'
Publication date: 01/12/2010

We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing, for each genre, rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset covering classical, jazz and pop genre classes, in which the degree and chord category are identified for each chord. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 × 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates.
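
The evaluation protocol mentioned in this abstract can be sketched as follows, assuming that "5 × 5-fold" means five repetitions of 5-fold cross-validation with a fresh shuffle per repetition; the abstract does not spell this out. The function name and parameters are hypothetical, and the sketch only produces index splits, not the authors' classifiers or feature selection.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=5, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) tuples.

    Each repetition reshuffles the sample indices and partitions them
    into k folds; every fold serves as the test set exactly once.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Striding by k gives folds whose sizes differ by at most one.
        folds = [indices[i::k] for i in range(k)]
        for f, test in enumerate(folds):
            train = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, f, train, test
```

With the 300-recording GTZAN subset mentioned above, `repeated_kfold(300)` would produce 25 train/test splits of 240 and 60 recordings each; reported rates would then typically be averaged over those 25 runs.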

City Research Online

Analysis of analysis: importance of different musical parameters for Schenkerian analysis

Author: Kirlin Phillip B.
Yust Jason
Publication venue: 'Informa UK Limited'
Publication date: 17/10/2016

While criteria for Schenkerian analysis have been much discussed, such discussions have generally not been informed by data. Kirlin [Kirlin, Phillip B., 2014 “A Probabilistic Model of Hierarchical Music Analysis.” Ph.D. thesis, University of Massachusetts Amherst] has begun to fill this vacuum with a corpus of textbook Schenkerian analyses encoded using data structures suggested by Yust [Yust, Jason, 2006 “Formal Models of Prolongation.” Ph.D. thesis, University of Washington] and a machine learning algorithm based on this dataset that can produce analyses with a reasonable degree of accuracy. In this work, we examine what musical features (scale degree, harmony, metrical weight) are most significant in the performance of Kirlin's algorithm. Accepted manuscript.

Boston University Institutional Repository (OpenBU)