Mining Large-Scale Music Data Sets

Bertin-Mahieux, Thierry; Ellis, Daniel P. W.

Authors: Thierry Bertin-Mahieux, Daniel P. W. Ellis
Publication date: 1 January 2012
Publisher: Columbia University Libraries/Information Services

Abstract

Large collections of music audio are now common and present an interesting research opportunity: what statistical patterns and structure can be discovered across thousands or millions of examples? Unfortunately, copyright restrictions can interfere with access to such collections, so we have developed the Million Song Dataset, which includes derived features but not the original audio, to support commercial-scale music analysis on a common research database. The audio features are augmented by a wide range of metadata, including lyrics, tags, and listener playcounts. Now that the database is ready, we have begun analyzing its content, including tasks such as identifying cover songs, which is significantly harder for such a large collection.
