
    Automatic Classification of Digital Music by Genre

    Presented as a research poster at the Grace Hopper Celebration of Women in Computing (GHC '12), Baltimore, MD, USA, and at the Women in Machine Learning Workshop (WiML '12), Lake Tahoe, Nevada, USA. Over the past two decades, advances in the digital music industry have resulted in an exponential growth in music data sets. This growth has in turn spurred great interest in music information retrieval (MIR) problems, in organizing large music collections, and in content-based search methods for digital music libraries. Equally important are related music classification problems such as genre classification, music mood analysis, and artist identification. Music genre classification is a well-studied problem in the music information retrieval community and has a wide range of applications. In this project we address the problem of genre classification by representing the MFCC feature vectors in an extended semantic space. We combine this audio representation with machine learning techniques to perform genre classification with the goal of obtaining higher classification accuracy.
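
    A minimal sketch of the general MFCC-plus-classifier pipeline described above, assuming standard Python MIR tooling (librosa for feature extraction, scikit-learn for the classifier). It does not implement the paper's extended semantic space representation, and the (audio_path, genre) input pairs are hypothetical.

        # Baseline genre classification from MFCC summary statistics (sketch, not the authors' code).
        import numpy as np
        import librosa
        from sklearn.svm import SVC
        from sklearn.model_selection import train_test_split

        def mfcc_features(path, n_mfcc=13, sr=22050):
            """Summarise a track by the mean and std of its MFCC frames."""
            y, sr = librosa.load(path, sr=sr, mono=True)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, n_frames)
            return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

        def train_genre_classifier(tracks):
            """tracks: iterable of (audio_path, genre_label) pairs (hypothetical input)."""
            X = np.stack([mfcc_features(p) for p, _ in tracks])
            y = np.array([g for _, g in tracks])
            X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
            clf = SVC(kernel="rbf").fit(X_tr, y_tr)
            print("held-out accuracy:", clf.score(X_te, y_te))
            return clf

    Richer representations (such as the semantic space the abstract refers to) would replace the simple mean/std summary; the surrounding training loop stays the same.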

    PiJAMA: Piano Jazz with Automatic MIDI Annotations

    Recent advances in automatic piano transcription have enabled large-scale analysis of piano music in the symbolic domain. However, the research has largely focused on classical piano music. We present PiJAMA (Piano Jazz with Automatic MIDI Annotations): a dataset of over 200 hours of solo jazz piano performances with automatically transcribed MIDI. In total there are 2,777 unique performances by 120 different pianists across 244 recorded albums. The dataset contains a mixture of studio recordings and live performances. We use automatic audio tagging to identify applause, spoken introductions, and other non-piano audio to facilitate downstream music information retrieval tasks. We explore descriptive statistics of the MIDI data, including pitch histograms and chromaticism. We then demonstrate two experimental benchmarks on the data: performer identification and generative modeling. The dataset, including a link to the associated source code, is available at https://almostimplemented.github.io/PiJAMA/
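
    A minimal sketch, not the PiJAMA tooling itself, of one of the descriptive statistics mentioned above: a duration-weighted pitch-class histogram computed from a transcribed MIDI file. It assumes the pretty_midi library, and the file path is hypothetical.

        # Pitch-class histogram from an automatically transcribed MIDI performance (sketch).
        import numpy as np
        import pretty_midi

        def pitch_class_histogram(midi_path):
            """Return a normalised, duration-weighted histogram over the 12 pitch classes."""
            pm = pretty_midi.PrettyMIDI(midi_path)
            hist = np.zeros(12)
            for inst in pm.instruments:
                for note in inst.notes:
                    hist[note.pitch % 12] += note.end - note.start
            total = hist.sum()
            return hist / total if total > 0 else hist

        # Example usage (hypothetical path into the dataset):
        # print(pitch_class_histogram("pijama/performance_0001.mid"))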

    How Low Can You Go? Reducing Frequency and Time Resolution in Current CNN Architectures for Music Auto-tagging

    Automatic tagging of music is an important research topic in Music Information Retrieval, and audio analysis algorithms proposed for this task have improved with advances in deep learning. In particular, many state-of-the-art systems use Convolutional Neural Networks and operate on mel-spectrogram representations of the audio. In this paper, we compare commonly used mel-spectrogram representations and evaluate the model performance that can be achieved when the input size is reduced, both by using fewer frequency bands and by lowering the time resolution. We use the MagnaTagaTune dataset for comprehensive performance comparisons and then compare selected configurations on the larger Million Song Dataset. The results of this study can serve researchers and practitioners in their trade-off decisions between model accuracy, data storage size, and training and inference times. Comment: The 28th European Signal Processing Conference (EUSIPCO).
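
    A minimal sketch of how such mel-spectrogram inputs at different frequency and time resolutions can be produced, assuming librosa; this is not the paper's exact pipeline, and the parameter values and file path are illustrative only.

        # Mel-spectrogram input at a chosen frequency/time resolution (sketch).
        import numpy as np
        import librosa

        def mel_input(path, n_mels=96, hop_length=512, sr=16000, n_fft=1024):
            """Compute a log-compressed mel spectrogram to feed a CNN auto-tagger."""
            y, _ = librosa.load(path, sr=sr, mono=True)
            mel = librosa.feature.melspectrogram(
                y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
            return librosa.power_to_db(mel, ref=np.max)

        # Coarser variants trade accuracy for storage and compute, e.g.:
        # mel_input("track.mp3", n_mels=32, hop_length=2048)  # fewer bands, lower time resolution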