Enhancing timbre model using MFCC and its time derivatives for music similarity estimation

Abstract

One popular method for content-based music similarity estimation models timbre by fitting a single multivariate Gaussian with a full covariance matrix to a track's MFCCs, then compares tracks using the symmetric Kullback-Leibler divergence. Borrowing from the field of speech recognition, we propose applying the same approach to the MFCCs' time derivatives to enhance the timbre model. Gaussian models of the delta and acceleration coefficients are each used to create a distance matrix. The distance matrices are then combined linearly to form a full distance matrix for music similarity estimation. In our experiments on two datasets, our novel approach performs better than using MFCCs alone. Moreover, genre classification using k-NN shows that the resulting accuracies are already close to the state of the art.
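The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each track is already available as an array of MFCC frames. It approximates the delta and acceleration coefficients with `np.gradient`, a stand-in for whatever regression formula the paper actually uses. The function names and the weights passed to `combine_distances` are hypothetical. The symmetric divergence is taken as the sum of the two directed Kullback-Leibler divergences, for which the log-determinant terms cancel.

```python
import numpy as np


def fit_gaussian(frames):
    """Fit one full-covariance Gaussian to a (n_frames, n_coeffs) array."""
    mean = frames.mean(axis=0)
    cov = np.cov(frames, rowvar=False)
    return mean, cov


def symmetric_kl(g1, g2):
    """Sum of KL(g1||g2) and KL(g2||g1) for two Gaussians (closed form)."""
    m1, c1 = g1
    m2, c2 = g2
    d = m1.shape[0]
    ic1 = np.linalg.inv(c1)
    ic2 = np.linalg.inv(c2)
    diff = m1 - m2
    trace_term = np.trace(ic2 @ c1) + np.trace(ic1 @ c2)
    mean_term = diff @ (ic1 + ic2) @ diff
    return 0.5 * (trace_term + mean_term - 2 * d)


def timbre_models(mfcc):
    """Gaussians for the static, delta, and acceleration coefficients.

    Assumption: np.gradient along the time axis approximates the
    derivatives; the paper's exact delta computation is not given here.
    """
    delta = np.gradient(mfcc, axis=0)
    accel = np.gradient(delta, axis=0)
    return fit_gaussian(mfcc), fit_gaussian(delta), fit_gaussian(accel)


def distance_matrix(models):
    """Pairwise symmetric KL divergence between a list of Gaussians."""
    n = len(models)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(models[i], models[j])
    return dist


def combine_distances(matrices, weights):
    """Linear combination of distance matrices (weights are placeholders)."""
    return sum(w * m for w, m in zip(weights, matrices))
```

With one `timbre_models` result per track, the three per-feature distance matrices can be built by passing the static, delta, and acceleration Gaussians separately to `distance_matrix`, then merged with `combine_distances`. How the weights are chosen, and whether the matrices are normalized before combination, is not stated in the abstract and is left open in this sketch.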

    Last time updated on 04/01/2018