Statistical Learning for Music Tagging and Recommendation
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal
Large-Scale Pattern Discovery in Music
This work focuses on extracting patterns in musical data from very large collections. The problem is split into two parts. First, we build such a large collection, the Million Song Dataset, to give researchers access to commercial-size datasets. Second, we use this collection to study cover song recognition, which involves finding harmonic patterns from audio features. Regarding the Million Song Dataset, we detail how we built the original collection from an online API and how we encouraged other organizations to participate in the project. The result is the largest research dataset with heterogeneous sources of data available to music technology researchers. We demonstrate some of its potential and discuss the impact it already has on the field. On cover song recognition, we must revisit the existing literature, since there are no publicly available results on a dataset of more than a few thousand entries. We present two solutions to tackle the problem, one using a hashing method and one using a higher-level feature computed from the chromagram (dubbed the 2DFTM). We further investigate the 2DFTM since it has the potential to be a relevant representation for any task involving audio harmonic content. Finally, we discuss the future of the dataset and the hope of seeing more work making use of the different sources of data that are linked in the Million Song Dataset. Regarding cover songs, we explain how this might be a first step towards defining a harmonic manifold of music, a space where harmonic similarities between songs would be more apparent.
Mining Large-Scale Music Data Sets
Large collections of music audio are now common and present an interesting research opportunity: what statistical patterns and structure can be discovered across thousands or millions of examples? Unfortunately, copyright restrictions can interfere with access to such collections, so we have developed the Million Song Dataset, including derived features but not the original audio, to support commercial-scale music analysis on a common research database. The audio features are augmented by a wide range of metadata including lyrics, tags, and listener playcounts. Now that the database is ready, we have begun analyzing its content, including tasks such as identifying cover songs -- significantly harder for such a large collection.
Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude
Large-scale cover song recognition involves calculating item-to-item similarities that can accommodate differences in timing and tempo, rendering simple Euclidean measures unsuitable. Expensive solutions such as dynamic time warping do not scale to millions of instances, making them inappropriate for commercial-scale applications. In this work, we transform a beat-synchronous chroma matrix with a 2D Fourier transform and show that the resulting representation has properties that fit the cover song recognition task. We can also apply PCA to efficiently scale comparisons. We report the best results to date on the largest available dataset of around 18,000 cover songs amid one million tracks, giving a mean average precision of 3.0%.
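For concreteness, here is a minimal sketch of the representation described above, in Python with NumPy. The patch length, the zero-padding, and the final Euclidean comparison are illustrative choices, not the paper's exact recipe; only the core idea, taking the magnitude of a 2D Fourier transform of a beat-synchronous chroma patch so the result is invariant to circular shifts in key and time, comes from the abstract.

```python
import numpy as np

def two_dftm(beat_chroma, n_beats=75):
    """Compute a 2D Fourier transform magnitude (2DFTM) feature from a
    beat-synchronous chroma matrix of shape (12, total_beats).

    Padding/truncating to a fixed n_beats is an illustrative choice so
    every track yields a fixed-length vector.
    """
    patch = np.zeros((12, n_beats))
    t = min(beat_chroma.shape[1], n_beats)
    patch[:, :t] = beat_chroma[:, :t]
    # The FFT magnitude discards phase, so circular shifts along the
    # pitch axis (key changes) and time axis no longer matter.
    return np.abs(np.fft.fft2(patch)).flatten()

# With 2DFTM features, plain Euclidean distance can stand in for
# expensive alignment methods such as dynamic time warping.
a = two_dftm(np.random.rand(12, 120))
b = two_dftm(np.random.rand(12, 90))
print(np.linalg.norm(a - b))
```

In a full system, PCA (e.g. sklearn.decomposition.PCA) would reduce these vectors before large-scale nearest-neighbor comparison, as the abstract suggests.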
Scalable k-Means Clustering via Lightweight Coresets
Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive data sets. While existing approaches generally only allow for multiplicative approximation errors, we propose a novel notion of lightweight coresets that allows for both multiplicative and additive errors. We provide a single algorithm to construct lightweight coresets for k-means clustering as well as soft and hard Bregman clustering. The algorithm is substantially faster than existing constructions, embarrassingly parallel, and the resulting coresets are smaller. We further show that the proposed approach naturally generalizes to statistical k-means clustering and that, compared to existing results, it can be used to compute smaller summaries for empirical risk minimization. In extensive experiments, we demonstrate that the proposed algorithm outperforms existing data summarization strategies in practice. (To appear in the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018.)
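A minimal sketch of the lightweight coreset construction for k-means, following the sampling distribution the paper describes: half uniform, half proportional to squared distance from the data mean, with standard importance weights. The scikit-learn call at the end is one illustrative way to consume the weights.

```python
import numpy as np
from sklearn.cluster import KMeans

def lightweight_coreset(X, m, rng=None):
    """Sample a lightweight coreset of m weighted points from X.

    The sampling probability mixes a uniform term (for additive error)
    with a term proportional to squared distance from the dataset mean
    (for multiplicative error); weights are the usual 1/(m*q) importance
    weights.
    """
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    dist_sq = ((X - X.mean(axis=0)) ** 2).sum(axis=1)
    q = 0.5 / n + 0.5 * dist_sq / dist_sq.sum()
    idx = rng.choice(n, size=m, p=q)
    return X[idx], 1.0 / (m * q[idx])

# Train on the small weighted coreset instead of the full data set.
X = np.random.rand(100_000, 10)
C, w = lightweight_coreset(X, m=1_000)
km = KMeans(n_clusters=8, n_init=10).fit(C, sample_weight=w)
```

The construction is embarrassingly parallel because the only global quantities needed are the data mean and the normalizer of the distance term, both computable in one pass.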
Automatic generation of social tags for music recommendation
Social tags are user-generated keywords associated with some resource on the Web. In the case of music, social tags have become an important component of "Web 2.0" recommender systems, allowing users to generate playlists based on use-dependent terms such as chill or jogging that have been applied to particular songs. In this paper, we propose a method for predicting these social tags directly from MP3 files. Using a set of boosted classifiers, we map audio features onto social tags collected from the Web. The resulting automatic tags (or autotags) furnish information about music that is otherwise untagged or poorly tagged, allowing for insertion of previously unheard music into a social recommender. This avoids the "cold-start problem" common in such systems. Autotags can also be used to smooth the tag space from which similarities and recommendations are made by providing a set of comparable baseline tags for all tracks in a recommender system.
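The abstract describes mapping audio features onto Web-collected tags with boosted classifiers. The sketch below is a hedged stand-in: one scikit-learn GradientBoostingClassifier per tag on synthetic features, not the paper's actual booster or feature set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# X: one feature vector per track (e.g. pooled spectral statistics);
# tag_matrix[i, j] = 1 if tag j was applied to track i on the Web.
# Both are synthetic placeholders here.
X = np.random.rand(500, 40)
tags = ["chill", "jogging", "rock"]
tag_matrix = np.random.rand(500, len(tags)) > 0.7

# One boosted binary classifier per social tag (an "autotagger").
models = {tag: GradientBoostingClassifier().fit(X, tag_matrix[:, j])
          for j, tag in enumerate(tags)}

# Autotag a new, otherwise untagged track: the predicted tag scores
# can seed a social recommender, avoiding the cold-start problem.
new_track = np.random.rand(1, 40)
autotags = {t: m.predict_proba(new_track)[0, 1] for t, m in models.items()}
print(sorted(autotags.items(), key=lambda kv: -kv[1]))
```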
Clustering beat-chroma patterns in a large music database
A musical style or genre implies a set of common conventions and patterns combined and deployed in different ways to make individual musical pieces; for instance, most would agree that contemporary pop music is assembled from a relatively small palette of harmonic and melodic patterns. The purpose of this paper is to use a database of tens of thousands of songs in combination with a compact representation of melodic-harmonic content (the beat-synchronous chromagram) and data-mining tools (clustering) to attempt to explicitly catalog this palette, at least within the limitations of the beat-chroma representation. We use online k-means clustering to summarize 3.7 million 4-beat bars in a codebook of a few hundred prototypes. By measuring how accurately such a quantized codebook can reconstruct the original data, we can quantify the degree of diversity (distortion as a function of codebook size) and temporal structure (i.e. the advantage gained by jointly quantizing multiple frames) in this music. The most popular codewords themselves reveal the common chords used in the music. Finally, the quantized representation of music can be used for music retrieval tasks such as artist and genre classification, and identifying songs that are similar in terms of their melodic-harmonic content.
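As a hedged illustration of this pipeline, the sketch below clusters flattened 4-beat beat-chroma patches with scikit-learn's MiniBatchKMeans (standing in for the paper's online k-means) and measures quantization distortion; the data here is a synthetic placeholder.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Each sample is one 4-beat bar of beat-synchronous chroma,
# flattened from shape (12, 4) into a 48-dimensional vector.
bars = np.random.rand(100_000, 12 * 4)

# Mini-batch (online) k-means summarizes millions of bars into a
# codebook of a few hundred prototypes.
codebook = MiniBatchKMeans(n_clusters=200, batch_size=1024).fit(bars)

# Distortion as a function of codebook size quantifies diversity:
# mean squared distance between each bar and its nearest codeword.
assignments = codebook.predict(bars)
distortion = np.mean(
    ((bars - codebook.cluster_centers_[assignments]) ** 2).sum(axis=1))
print(f"mean quantization distortion: {distortion:.4f}")
```

Rerunning this with growing n_clusters traces the distortion-vs-codebook-size curve the abstract uses as its diversity measure.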
The Million Song Dataset
We introduce the Million Song Dataset, a freely available collection of audio features and metadata for a million contemporary popular music tracks. We describe its creation process, its content, and its possible uses. Attractive features of the Million Song Dataset include the range of existing resources to which it is linked and the fact that it is the largest current research dataset in our field. As an illustration, we present year prediction as an example application, a task that has, until now, been difficult to study owing to the absence of a large set of suitable data. We show positive results on year prediction, and discuss more generally the future development of the dataset.
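A hedged sketch of the year-prediction application mentioned above: ridge regression on per-track timbre summary features. The 90-dimensional features and the randomly generated data are illustrative assumptions, not the dataset's actual benchmark setup.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Stand-in for per-track features (e.g. means and covariances of
# segment timbre vectors); real features would come from the dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(5_000, 90))
years = rng.integers(1950, 2011, size=5_000).astype(float)

# Treat the release year as a regression target.
X_tr, X_te, y_tr, y_te = train_test_split(X, years, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("mean absolute error (years):", np.abs(pred - y_te).mean())
```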
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems
We propose two new methods to address the weak scaling problems of kernel ridge regression (KRR): Balanced KRR (BKRR) and K-means KRR (KKRR). These methods consider alternative ways to partition the input dataset into p different parts, generating p different models, and then selecting the best model among them. Compared to a conventional implementation, KKRR2 (an optimized version of KKRR) improves the weak scaling efficiency from 0.32% to 38% and achieves a 591× speedup for reaching the same accuracy using the same data and the same hardware (1,536 processors). BKRR2 (an optimized version of BKRR) achieves higher accuracy than the current fastest method using less training time for a variety of datasets. For applications requiring only approximate solutions, BKRR2 improves the weak scaling efficiency to 92% and achieves a 3,505× speedup (theoretical speedup: 4,096×). (Accepted by the ACM International Conference on Supercomputing, ICS 2018.)
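The following is a hedged reading of the abstract's partition-and-select scheme, sketched for KKRR: partition the inputs with k-means into p parts, train one kernel ridge model per part, and keep the model that validates best. The details here (selection on a holdout set, scikit-learn's KernelRidge, synthetic data) are assumptions; BKRR would use balanced partitions instead of k-means ones.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error

def kkrr_fit(X, y, X_val, y_val, p=4):
    """Partition data into p parts via k-means, train one KRR model
    per part, and return the model with the lowest validation error
    (the 'select the best model' step from the abstract)."""
    parts = KMeans(n_clusters=p, n_init=10).fit_predict(X)
    best, best_err = None, np.inf
    for i in range(p):
        m = KernelRidge(kernel="rbf", alpha=1.0)
        m.fit(X[parts == i], y[parts == i])
        err = mean_squared_error(y_val, m.predict(X_val))
        if err < best_err:
            best, best_err = m, err
    return best, best_err

rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 5)); y = np.sin(X).sum(axis=1)
Xv = rng.normal(size=(200, 5)); yv = np.sin(Xv).sum(axis=1)
model, err = kkrr_fit(X, y, Xv, yv)
print("validation MSE:", err)
```

Each per-part model trains on roughly n/p points, which is where the weak-scaling gains come from: the cubic cost of kernel ridge regression drops sharply as the partitions shrink.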
Mining oral history collections using music information retrieval methods
Recent work at the Sussex Humanities Lab, a digital humanities research program at the University of Sussex, has sought to address an identified gap in the provision and use of audio feature analysis for spoken word collections. Traditionally, oral history methodologies and practices have placed emphasis on working with transcribed textual surrogates rather than the digital audio files created during the interview process. This provides pragmatic access to the basic semantic content but forecloses access to other potentially meaningful aural information; our work addresses the potential for methods to explore this extra-semantic information by working with the audio directly. Audio analysis tools, such as those developed within the established field of Music Information Retrieval (MIR), provide this opportunity. This paper describes the application of audio analysis techniques and methods to spoken word collections. We demonstrate an approach using freely available audio and data analysis tools, which have been explored and evaluated in two workshops. We hope to inspire new forms of content analysis that complement semantic analysis with investigation into the more nuanced properties carried in audio signals.
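As one concrete example of this kind of analysis (not necessarily the tools used in the workshops), the sketch below uses librosa, a freely available MIR library, to pull a few extra-semantic features from a spoken-word recording; the file path is a placeholder.

```python
import librosa
import numpy as np

# Placeholder path for a digitized oral history interview.
y, sr = librosa.load("interview.wav", sr=22050, mono=True)

# MIR-style features carrying extra-semantic information:
# timbre (MFCCs), delivery (energy/pauses), and intonation (pitch).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
rms = librosa.feature.rms(y=y)[0]
f0 = librosa.yin(y, fmin=65, fmax=300, sr=sr)  # rough speech pitch range

print("MFCC means per coefficient:", mfcc.mean(axis=1))
print("fraction of low-energy frames (pauses):",
      float((rms < 0.1 * rms.max()).mean()))
print("median fundamental frequency (Hz):", float(np.median(f0)))
```

Summaries like these, aggregated over an interview, can then feed the same data analysis tools the paper applies, without ever touching a transcript.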