
8 research outputs found

Exploring new features for music classification

Author: Essid Slim
Foucard Rémi
Lagrange Mathieu
Richard Gael
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/07/2013

Automatic music classification aims at grouping unknown songs into predefined categories such as music genre or induced emotion. To obtain perceptually relevant results, appropriate features that carry important information for semantic inference must be designed. In this paper, we explore novel features and evaluate them on an automatic music tagging task. The proposed features span various aspects of the music: timbre, textual metadata, visual descriptors of cover art, and features characterizing the lyrics of sung music. The merit of these novel features is then evaluated using a classification system based on a boosting algorithm on binary decision trees. Their effectiveness for the task at hand is discussed with reference to the very common Mel Frequency Cepstral Coefficients features. We show that some of these features alone bring useful information, and that the classification system takes great advantage of a description covering such diverse aspects of songs.
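The abstract above describes a classifier built by boosting binary decision trees over heterogeneous features. As a rough illustration only, the following sketch implements plain AdaBoost over one-split trees (decision stumps) in pure Python. It is not the authors' system: the two-column toy data (a hypothetical "timbre" value and a hypothetical "metadata" value), the function names, and the choice of AdaBoost are all assumptions made for the example.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] > thr else -pol for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    """Apply a single threshold test; labels are +1 / -1."""
    _, j, thr, pol = stump
    return pol if row[j] > thr else -pol

def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps with the standard AdaBoost update."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the log below stays finite on separable data.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified examples, down-weight correct ones.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: column 0 = "timbre" value, column 1 = "metadata" value.
# The tag (+1 / -1) depends only on column 1 here.
X = [[0.2, 0.1], [0.7, 0.3], [0.4, 0.8], [0.9, 0.9]]
y = [-1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
train_predictions = [predict(ensemble, row) for row in X]
```

In this toy case the first stump already splits on column 1, which loosely mirrors the abstract's point that a non-timbre feature can carry useful information on its own.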

Recommended from our members

Combining Sources of Description for Approximating Music Similarity Ratings

Author: D.R. Turnbull
I. Tsochantaridis
J.V. Davis
S. Stober
T. Joachims
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

In this paper, we compare the effectiveness of basic acoustic features and genre annotations when adapting a music similarity model to user ratings. We use the Metric Learning to Rank algorithm to learn a Mahalanobis metric from comparative similarity ratings in the MagnaTagATune database. Using common formats for feature data, our approach can easily be transferred to other existing databases. Our results show that genre data allow more effective learning of a metric than simple audio features, but a combination of both feature sets clearly outperforms either individual set.
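To make the notion of a Mahalanobis metric concrete, the sketch below computes d(x, y) = sqrt((x - y)^T M (x - y)) for a symmetric positive semidefinite matrix M and checks one comparative constraint of the form "a is more similar to the query than b". It does not implement Metric Learning to Rank; the matrices and the two-dimensional feature vectors (one acoustic value, one genre indicator) are hypothetical values chosen for illustration.

```python
import math

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m_ij * d_j for m_ij, d_j in zip(row, d)) for row in M]
    return math.sqrt(sum(d_i * md_i for d_i, md_i in zip(d, Md)))

def satisfies_constraint(query, closer, farther, M):
    """True if the metric ranks `closer` nearer to `query` than `farther`."""
    return mahalanobis(query, closer, M) < mahalanobis(query, farther, M)

# Hypothetical feature layout: [acoustic value, genre indicator].
identity = [[1.0, 0.0], [0.0, 1.0]]        # plain Euclidean distance
genre_weighted = [[0.1, 0.0], [0.0, 4.0]]  # assumed "learned" metric favoring genre

query = [0.0, 1.0]
a = [2.0, 1.0]   # same genre as the query, acoustically far
b = [0.5, 0.0]   # different genre, acoustically near

euclidean_ok = satisfies_constraint(query, a, b, identity)
weighted_ok = satisfies_constraint(query, a, b, genre_weighted)
```

Under the identity matrix the constraint is violated, while the genre-weighted matrix satisfies it; a metric learner searches for an M that satisfies as many such rating-derived constraints as possible.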

City Research Online

Crossref

Learning Combinations of Multiple Feature Representations for Music Emotion Prediction

Author: Aucouturier J.-J.
Barrington L.
Barthet M.
Fu Z.
Jensen J. H.
Kim Y. E.
Madsen J.
Mathieu B.
Meng A.
Moore B. C.
Müller M.
Nickisch H.
Rasmussen C. E.
Su L.
Thurstone L. L.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Music consists of several structures and patterns evolving through time, which greatly influence the human decoding of higher-level cognitive aspects of music such as the emotions it expresses. For tasks such as genre, tag, and emotion recognition, these structures have often been identified and used as individual, non-temporal features and representations. In this work, we test the hypothesis that using multiple temporal and non-temporal representations of different features is beneficial for modeling music structure, with the aim of predicting the emotions expressed in music. We test this hypothesis by representing temporal and non-temporal structures using generative models of multiple audio features. The representations are used in a discriminative setting via the Product Probability Kernel and the Gaussian Process model enabling Multiple Kernel Learning, finding optimized combinations of both features and temporal/non-temporal representations. We show the increased predictive performance using the combination of different features and representations, along with the great interpretive prospects of this approach.
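A kernel between generative models can be illustrated with the simplest case: two univariate Gaussians and the product of their densities integrated over x, which has the closed form N(m1; m2, v1 + v2). This is the exponent-1 special case of a probability product kernel (sometimes called the expected likelihood kernel). The sketch is an assumption-laden toy and not the paper's setup, which uses richer generative models and a Gaussian Process on top; the function name and the numeric values are hypothetical.

```python
import math

def expected_likelihood_kernel(m1, v1, m2, v2):
    """Integral of N(x; m1, v1) * N(x; m2, v2) over x, for variances v1, v2 > 0.

    Closed form: a Gaussian density in (m1 - m2) with variance v1 + v2.
    """
    v = v1 + v2
    return math.exp(-0.5 * (m1 - m2) ** 2 / v) / math.sqrt(2.0 * math.pi * v)

# Hypothetical per-track models of one audio feature (mean, variance).
track_a = (0.0, 1.0)
track_b = (0.5, 1.0)
track_c = (3.0, 1.0)

k_ab = expected_likelihood_kernel(*track_a, *track_b)
k_ac = expected_likelihood_kernel(*track_a, *track_c)
k_aa = expected_likelihood_kernel(*track_a, *track_a)
```

Tracks whose fitted distributions overlap more receive a larger kernel value, so `k_ab` exceeds `k_ac`; a matrix of such values is what a kernel method would consume.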

Crossref

Enlighten

Online Research Database In Technology

The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use

Author: Sturm Bob L.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013

The GTZAN dataset appears in at least 100 published works, and is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). Our recent work, however, shows GTZAN has several faults (repetitions, mislabelings, and distortions), which challenge the interpretability of any result derived using it. In this article, we disprove the claims that all MGR systems are affected in the same ways by these faults, and that the performances of MGR systems in GTZAN are still meaningfully comparable since they all face the same faults. We identify and analyze the contents of GTZAN, and provide a catalog of its faults. We review how GTZAN has been used in MGR research, and find few indications that its faults have been known and considered. Finally, we rigorously study the effects of its faults on evaluating five different MGR systems. The lesson is not to banish GTZAN, but to use it with consideration of its contents. Comment: 29 pages, 7 figures, 6 tables, 128 references

arXiv.org e-Print Archive

Crossref

VBN

Integration of top-down and bottom-up information for audio organization and retrieval

Author: Jensen Bjørn Sand
Publication venue: Technical University of Denmark
Publication date: 01/01/2012

Online Research Database In Technology