A Deep Representation for Invariance And Music Classification

Author: Evangelopoulos Georgios
Poggio Tomaso
Rosasco Lorenzo
Voinea Stephen
Zhang Chiyuan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Representations in the auditory cortex might be based on mechanisms similar to the visual ventral stream: modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this paper we propose the use of such computational modules for extracting invariant and discriminative audio representations. Building on a theory of invariance in hierarchical architectures, we propose a novel, mid-level representation for acoustical signals, using the empirical distributions of projections on a set of templates and their transformations. Under the assumption that, by construction, this dictionary of templates is composed from similar classes, and samples the orbit of variance-inducing signal transformations (such as shift and scale), the resulting signature is theoretically guaranteed to be unique, invariant to transformations and stable to deformations. Modules of projection and pooling can then constitute layers of deep networks, for learning composite representations. We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification. Comment: 5 pages, CBMM Memo No. 002, (to appear) IEEE 2014 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014)
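The projection-and-pooling idea in this abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes the transformation group is the set of circular shifts, approximates the "empirical distribution" by a fixed-bin histogram, and uses hypothetical helper names (`circular_shift`, `signature`). Integer-valued signals are used so that the invariance check is exact.

```python
from typing import List, Sequence


def circular_shift(x: Sequence[float], k: int) -> List[float]:
    """Shift x to the right by k positions, wrapping around."""
    n = len(x)
    return [x[(i - k) % n] for i in range(n)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Normalised histogram with fixed edges; out-of-range values are clamped."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1
    return [c / len(values) for c in counts]


def signature(x, templates, lo, hi, bins=4) -> List[float]:
    """Project x on every shifted copy of each template (projection step),
    then summarise each set of projections by its histogram (pooling step)."""
    sig: List[float] = []
    for t in templates:
        projections = [dot(x, circular_shift(t, k)) for k in range(len(t))]
        sig.extend(histogram(projections, lo, hi, bins))
    return sig


# Toy data (assumed, not from the paper).
x = [1, 0, 2, 0, 0, 3, 0, 1]
templates = [[1, 1, 0, 0, 0, 0, 0, 0], [1, 0, -1, 0, 0, 0, 0, 0]]
sig = signature(x, templates, lo=-4, hi=4)
sig_shifted = signature(circular_shift(x, 3), templates, lo=-4, hi=4)
```

Shifting the input only permutes the projections onto the template orbit, so the pooled histograms in `sig` and `sig_shifted` coincide. That is the sense of "invariant" this sketch demonstrates; uniqueness and stability to deformations are not addressed here.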

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref


An approach to melodic segmentation and classification based on filtering with the Haar wavelet

Author: Berger J.
Brown M.
Cambouropoulos E.
Daubechies I.
David Meredith
Dreyfus L.
Forte A.
Gissel Velarde
Huron D.
Lerdahl F.
Levitin D. J.
Mallat S.
Nixon M. S.
Ponce de Léon P. J.
Tillman Weyde
Trainor L. J.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013

We present a novel method of classification and segmentation of melodies in symbolic representation. The method is based on filtering pitch as a signal over time with the Haar wavelet, and we evaluate it on two tasks. The filtered signal corresponds to a single-scale signal w_s from the continuous Haar wavelet transform. The melodies are first segmented using local maxima or zero-crossings of w_s. The segments of w_s are then classified using the k-nearest neighbour algorithm with Euclidean and city-block distances. The method proves more effective than using unfiltered pitch signals and Gestalt-based segmentation when used to recognize the parent works of segments from Bach’s Two-Part Inventions (BWV 772–786). When used to classify 360 Dutch folk tunes into 26 tune families, the performance of the method is comparable to the use of pitch signals, but not as good as that of string-matching methods based on multiple features.
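The filtering and segmentation steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code. The kernel sign convention, the 1/sqrt(scale) normalisation, the edge handling (repeating the first and last pitch) and all function names are assumptions made for the example.

```python
import math
from typing import List, Sequence


def haar_kernel(scale: int) -> List[float]:
    """Discrete Haar wavelet of even length `scale`: +1 then -1, normalised."""
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even integer >= 2")
    half = scale // 2
    norm = 1.0 / math.sqrt(scale)
    return [norm] * half + [-norm] * half


def haar_filter(pitch: Sequence[float], scale: int) -> List[float]:
    """Single-scale filtered signal: correlate pitch with the Haar kernel.
    Indices outside the melody are clamped to the nearest valid sample."""
    kernel = haar_kernel(scale)
    half = scale // 2
    n = len(pitch)
    out = []
    for u in range(n):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(u + j - half, 0), n - 1)
            acc += pitch[idx] * k
        out.append(acc)
    return out


def local_maxima(w: Sequence[float]) -> List[int]:
    """Interior indices where w rises to a peak."""
    return [i for i in range(1, len(w) - 1) if w[i] > w[i - 1] and w[i] >= w[i + 1]]


def zero_crossings(w: Sequence[float]) -> List[int]:
    """Indices i where w changes sign between i-1 and i."""
    return [i for i in range(1, len(w)) if w[i - 1] * w[i] < 0]


def split_at(signal: Sequence[float], boundaries: Sequence[int]) -> List[List[float]]:
    """Cut the signal into segments that start at each boundary index."""
    edges = [0] + list(boundaries) + [len(signal)]
    return [list(signal[a:b]) for a, b in zip(edges, edges[1:]) if b > a]


# Toy melody as MIDI pitch numbers sampled at a fixed rate (assumed data).
pitch = [60, 60, 60, 60, 72, 72, 72, 72, 60, 60, 60, 60]
w = haar_filter(pitch, scale=4)
boundaries = local_maxima(w)
segments = split_at(w, boundaries)
```

With this sign convention the filtered signal is positive where the pitch falls, so the single local maximum in the toy melody sits at the downward leap from 72 to 60. Using `zero_crossings(w)` instead of `local_maxima(w)` gives the alternative segmentation rule the abstract mentions.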

City Research Online

Crossref

VBN

Wavelet-filtering of symbolic music representations for folk tune segmentation and classification

Author: Meredith David
Velarde Gissel
Weyde Tillman
Publication venue: Meertens Institute; Department of Information and Computing Sciences; Utrecht University
Publication date: 05/06/2013

The aim of this study is to evaluate a machine-learning method in which symbolic representations of folk songs are segmented and classified into tune families with Haar-wavelet filtering. The method is compared with a previously proposed Gestalt-based method. Melodies are represented as discrete symbolic pitch-time signals. We apply the continuous wavelet transform (CWT) with the Haar wavelet at specific scales, obtaining filtered versions of melodies emphasizing their information at particular time-scales. We use the filtered signal for representation and segmentation, using the wavelet coefficients' local maxima to indicate local boundaries and classify segments by means of k-nearest neighbours based on standard vector-metrics (Euclidean, city-block), and compare the results to a Gestalt-based segmentation method and metrics applied directly to the pitch signal. We found that the wavelet-based segmentation and wavelet-filtering of the pitch signal lead to better classification accuracy in cross-validated evaluation when the time-scale and other parameters are optimized.
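The classification step named in this abstract (k-nearest neighbours with Euclidean or city-block distance) can be illustrated with a short sketch. It assumes all segments have already been brought to equal length, which the abstract does not specify, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def city_block(a: Vector, b: Vector) -> float:
    return sum(abs(p - q) for p, q in zip(a, b))


def knn_predict(
    query: Vector,
    examples: List[Tuple[Vector, str]],
    k: int = 1,
    distance: Callable[[Vector, Vector], float] = euclidean,
) -> str:
    """Return the majority label among the k labelled vectors closest to query."""
    ranked = sorted(examples, key=lambda ex: distance(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy labelled segments (assumed data; labels stand in for tune families).
examples = [
    ([0, 0, 0], "A"),
    ([1, 0, 0], "A"),
    ([5, 5, 5], "B"),
    ([6, 5, 5], "B"),
]
label_euclidean = knn_predict([0.5, 0, 0], examples, k=3)
label_city_block = knn_predict([5, 5, 6], examples, k=1, distance=city_block)
```

Passing the distance as a parameter makes it easy to compare the two metrics on the same segments, which is the comparison the abstract reports.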

CiteSeerX

VBN

Convolutional Methods for Music Analysis

Author: Velarde Gissel
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2017

VBN