780 research outputs found

Non-Negative Tensor Factorization Applied to Music Genre Classification

Author: Benetos E.
Kotropoulos C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2010

Music genre classification techniques are typically applied to the data matrix whose columns are the feature vectors extracted from music recordings. In this paper, a feature vector is extracted using a texture window of one second, which enables the representation of any 30-second music recording as a time sequence of feature vectors, thus yielding a feature matrix. Consequently, by stacking the feature matrices associated with the recordings of a dataset, a tensor is created, a fact which necessitates studying music genre classification using tensors. First, a novel algorithm for non-negative tensor factorization (NTF) is derived that extends non-negative matrix factorization. Several variants of the NTF algorithm emerge by employing different cost functions from the class of Bregman divergences. Second, a novel supervised NTF classifier is proposed, which trains a basis for each class separately and employs basis orthogonalization. A variety of spectral, temporal, perceptual, energy, and pitch descriptors is extracted from 1000 recordings of the GTZAN dataset, which are distributed across 10 genre classes. The NTF classifier performance is compared against that of the multilayer perceptron and the support vector machines by applying a stratified 10-fold cross validation. A genre classification accuracy of 78.9% is reported for the NTF classifier, demonstrating the superiority of the aforementioned multilinear classifier over several data matrix-based state-of-the-art classifiers.

City Research Online

Music genre classification via Topology Preserving Non-Negative Tensor Factorization and sparse representations

Author: Constantine Kotropoulos
Yannis Panagakis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Motivated by the rich, psycho-physiologically grounded properties of auditory cortical representations and the power of sparse representation-based classifiers, we propose a robust music genre classification framework. Its first pillar is a novel multilinear subspace analysis method that reduces the dimensionality of cortical representations of music signals, while preserving the topology of the cortical representations. Its second pillar is the sparse representation-based classification, which models any test cortical representation as a sparse weighted sum of dictionary atoms, which stem from training cortical representations of known genre, by assuming that the representations of music recordings of the same genre are close enough in the tensor space in which they lie. Accordingly, the dimensionality reduction is made in a manner compatible with the working principle of the sparse representation-based classification. Music genre classification accuracy of 93.7% and 94.93% is reported on the GTZAN and the ISMIR2004 Genre datasets, respectively. Both accuracies outperform any accuracy ever reported for state-of-the-art music genre classification algorithms applied to the aforementioned datasets. Index Terms: Music genre classification, topology preserving, non-negative tensor factorization, sparse representations

CiteSeerX

Crossref

The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use

Author: Sturm Bob L.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013

The GTZAN dataset appears in at least 100 published works, and is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). Our recent work, however, shows GTZAN has several faults (repetitions, mislabelings, and distortions), which challenge the interpretability of any result derived using it. In this article, we disprove the claims that all MGR systems are affected in the same ways by these faults, and that the performances of MGR systems in GTZAN are still meaningfully comparable since they all face the same faults. We identify and analyze the contents of GTZAN, and provide a catalog of its faults. We review how GTZAN has been used in MGR research, and find few indications that its faults have been known and considered. Finally, we rigorously study the effects of its faults on evaluating five different MGR systems. The lesson is not to banish GTZAN, but to use it with consideration of its contents. Comment: 29 pages, 7 figures, 6 tables, 128 references.

arXiv.org e-Print Archive

VBN

A tensor-based approach for automatic music genre classification

Author: Benetos E.
Kotropoulos C.
Publication date: 01/01/2008

Most music genre classification techniques employ pattern recognition algorithms to classify feature vectors extracted from recordings into genres. An automatic music genre classification system using tensor representations is proposed, where each recording is represented by a feature matrix over time. Thus, a feature tensor is created by concatenating the feature matrices associated with the recordings. A novel algorithm for non-negative tensor factorization (NTF), which employs the Frobenius norm between an n-dimensional raw feature tensor and its decomposition into a sum of elementary rank-1 tensors, is developed. Moreover, a supervised NTF classifier is proposed. A variety of sound description features are extracted from recordings from the GTZAN dataset, covering 10 genre classes. NTF classifier performance is compared against multilayer perceptrons, support vector machines, and non-negative matrix factorization classifiers. On average, a genre classification accuracy of 75% with a standard deviation of 1% is achieved. It is demonstrated that NTF classifiers outperform matrix-based ones.
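
The idea of approximating a non-negative feature tensor by a sum of rank-1 tensors under a Frobenius-norm cost can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is a generic multiplicative-update scheme for a third-order tensor, and the function names `ntf_cp` and `cp_reconstruct`, the rank, and the iteration count are assumptions chosen for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of (I, R) and (J, R), giving (I*J, R)."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])


def cp_reconstruct(factors):
    """Sum of rank-1 tensors built from three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def ntf_cp(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Illustrative non-negative factorization of a third-order tensor.

    Minimizes the Frobenius norm between x and a sum of `rank` rank-1
    tensors with Lee-Seung style multiplicative updates, one mode at a time.
    x must be non-negative.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive initialization keeps every update well defined.
    factors = [rng.random((dim, rank)) + 0.1 for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Mode unfolding; remaining axes keep their original order.
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            kr = khatri_rao(others[0], others[1])
            numer = unfolded @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom  # stays non-negative
    return factors
```

Because the updates only multiply by non-negative ratios, the factors remain non-negative throughout; a supervised classifier in the spirit of the abstract would then train such a basis per genre, which this sketch does not attempt.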

CiteSeerX

City Research Online

Music genre classification: a multilinear approach

Author: Benetos E.
Kotropoulos C.
Panagakis I.
Publication date: 01/01/2008

In this paper, music genre classification is addressed from a multilinear perspective. Inspired by a model of auditory cortical processing, multiscale spectro-temporal modulation features are extracted. Such spectro-temporal modulation features have recently been used successfully in various content-based audio classification tasks, but not yet in music genre classification. Each recording is represented by a third-order feature tensor generated by the auditory model. Thus, the ensemble of recordings is represented by a fourth-order data tensor created by stacking the third-order feature tensors associated with the recordings. To handle large data tensors and derive compact feature vectors suitable for classification, three multilinear subspace techniques are examined, namely the Non-Negative Tensor Factorization (NTF), the High-Order Singular Value Decomposition (HOSVD), and the Multilinear Principal Component Analysis (MPCA). Classification is performed by a Support Vector Machine. Stratified cross-validation tests on the GTZAN dataset and the ISMIR 2004 Genre one demonstrate the advantages of NTF and HOSVD versus MPCA. The best accuracies obtained by the proposed multilinear approach are comparable with those achieved by state-of-the-art music genre classification algorithms.
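
Of the three subspace techniques named above, HOSVD is the simplest to sketch: each mode of the tensor is unfolded into a matrix, the leading left singular vectors are kept, and the tensor is projected onto them to obtain a smaller core. The following is a minimal illustration under assumed names (`unfold`, `mode_product`, `hosvd`, `tucker_reconstruct`) and assumed target ranks; it is not the feature-extraction pipeline used in the paper.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a tensor so that `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply a (J, I_mode) matrix along the given mode of the tensor."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD: one orthonormal factor per mode plus a core tensor."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])  # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each subspace
    return core, factors


def tucker_reconstruct(core, factors):
    """Map a core tensor back to the original space."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

With full ranks the reconstruction is exact; with smaller ranks the core acts as a compact representation, which could be flattened into a feature vector for a classifier such as an SVM.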

CiteSeerX

City Research Online

Type-Constrained Representation Learning in Knowledge Graphs

Author: A Swartz
C Bizer
D Krompaß
GA Miller
N Lao
Publication date: 28/08/2015

Large knowledge graphs increasingly add value to various applications that require machines to recognize and understand queries and their semantics, as in search or question answering systems. Latent variable models have increasingly gained attention for the statistical modeling of knowledge graphs, showing promising results in tasks related to knowledge graph completion and cleaning. Besides storing facts about the world, schema-based knowledge graphs are backed by rich semantic descriptions of entities and relation-types that allow machines to understand the notion of things and their semantic relationships. In this work, we study how type-constraints can generally support the statistical modeling with latent variable models. More precisely, we integrated prior knowledge in the form of type-constraints into various state-of-the-art latent variable approaches. Our experimental results show that prior knowledge on relation-types significantly improves these models, by up to 77% in link-prediction tasks. The achieved improvements are especially prominent when a low model complexity is enforced, a crucial requirement when these models are applied to very large datasets. Unfortunately, type-constraints are neither always available nor always complete; for example, they can become fuzzy when entities lack proper typing. We show that in these cases, it can be beneficial to apply a local closed-world assumption that approximates the semantics of relation-types based on observations made in the data.

arXiv.org e-Print Archive

Crossref

INSTRUMENTATION-BASED MUSIC SIMILARITY USING SPARSE REPRESENTATIONS

Author: Fujihara H
Klapuri A
Plumbley MD
Publication date: 01/01/2012

Queen Mary Research Online

Context based multimedia information retrieval

Author: Mølgaard Lasse Lohilahti
Publication venue: Technical University of Denmark
Publication date: 01/12/2009

Online Research Database In Technology

Classification of music genres using sparse representations in overcomplete dictionaries

Author: Rusu Cristian
Publication date: 01/01/2011

This paper presents a simple but efficient and robust method for music genre classification that utilizes sparse representations in overcomplete dictionaries. The training step involves creating dictionaries, using the K-SVD algorithm, in which data corresponding to a particular music genre has a sparse representation. In the classification step, the Orthogonal Matching Pursuit (OMP) algorithm is used to separate feature vectors that consist only of Linear Predictive Coding (LPC) coefficients. The paper analyses in detail a popular case study from the literature, the ISMIR 2004 database. Using the presented method, the correct classification percentage for the 6 music genres is 85.59%, a result that is comparable with the best results published so far.
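
A rough sketch of how OMP with one dictionary per genre could separate feature vectors is given below. It assumes the dictionaries are already available (random unit-norm atoms stand in for K-SVD output here), and it assumes that a vector is assigned to the genre whose dictionary yields the smallest reconstruction residual; the abstract does not spell out the decision rule, so that rule, the genre labels, the dimensions, and the function names `omp` and `classify` are all illustrative assumptions.

```python
import numpy as np


def omp(dictionary, signal, n_nonzero, tol=1e-10):
    """Orthogonal Matching Pursuit: greedy sparse coding of `signal`.

    Returns the chosen atom indices, their coefficients, and the residual.
    """
    residual = signal.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < tol:
            break
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(dictionary.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coef
    return support, coef, residual


def classify(dictionaries, signal, n_nonzero=3):
    """Assign the label whose dictionary leaves the smallest residual."""
    errors = {
        label: np.linalg.norm(omp(d, signal, n_nonzero)[2])
        for label, d in dictionaries.items()
    }
    return min(errors, key=errors.get)


# Hypothetical stand-ins for learned dictionaries: 12-dim features, 20 atoms.
rng = np.random.default_rng(0)
dictionaries = {}
for label in ("rock", "jazz"):
    atoms = rng.standard_normal((12, 20))
    dictionaries[label] = atoms / np.linalg.norm(atoms, axis=0)

# A test vector that is exactly one scaled "rock" atom.
test_vector = 2.0 * dictionaries["rock"][:, 4]
predicted = classify(dictionaries, test_vector)
```

In a real system the feature vectors would be LPC coefficients and the dictionaries would be learned from training data, so the residuals would differ by degree instead of one being essentially zero.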

TRAP

IMT Institutional Repository