    Extended pipeline for content-based feature engineering in music genre recognition

    We present a feature engineering pipeline for the construction of musical signal characteristics, to be used in the design of a supervised model for musical genre identification. The key idea is to extend the traditional two-step process of extraction and classification with additional stand-alone phases that are no longer organized in a strict waterfall scheme: the system allows backtracking and cycles between the various stages. To obtain a compact and effective representation of the features, standard early temporal integration is combined with further selection and extraction phases: on the one hand, the selection of the most meaningful features based on information gain; on the other, the inclusion of nonlinear correlations within this subset, captured by an autoencoder. Experiments conducted on the GTZAN dataset show that this methodology noticeably improves the model's classification performance.
    Comment: ICASSP 201
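
    A minimal sketch of the selection-plus-autoencoder idea described above, not the authors' code: features are ranked by information gain (mutual information with the genre label), and a small autoencoder then captures nonlinear correlations within the retained subset. Function and parameter names are illustrative.

    # Assumes X (n_clips x n_features) and genre labels y already exist.
    import numpy as np
    from sklearn.feature_selection import mutual_info_classif
    from sklearn.neural_network import MLPRegressor

    def select_and_compress(X, y, n_selected=40, n_latent=10):
        # Rank features by information gain with respect to the label.
        gain = mutual_info_classif(X, y)
        top = np.argsort(gain)[::-1][:n_selected]
        X_sel = X[:, top]
        # Train a small autoencoder (an MLP reconstructing its own input)
        # to capture nonlinear correlations among the selected features.
        ae = MLPRegressor(hidden_layer_sizes=(n_latent,), max_iter=2000)
        ae.fit(X_sel, X_sel)
        # The ReLU hidden activations serve as compact nonlinear features.
        hidden = np.maximum(0, X_sel @ ae.coefs_[0] + ae.intercepts_[0])
        return np.hstack([X_sel, hidden])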

    Multiscale approaches to music audio feature learning

    Content-based music information retrieval tasks are typically solved with a two-stage approach: features are extracted from music audio signals and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has recently been increased interest in multiscale representations of music audio. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset.
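
    A sketch of the core spherical K-means step under stated assumptions (not the authors' implementation): data points and centroids are kept on the unit sphere, so assignment reduces to cosine similarity. In the multiscale setting, the same procedure would be run on patches drawn from spectrograms at several timescales and the resulting features concatenated.

    import numpy as np

    def spherical_kmeans(X, k=200, n_iter=20, seed=0):
        # X: (n_patches, dim) array of, e.g., whitened spectrogram patches.
        rng = np.random.default_rng(seed)
        X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
        D = X[rng.choice(len(X), size=k, replace=False)]  # unit-norm init
        for _ in range(n_iter):
            assign = (X @ D.T).argmax(axis=1)  # nearest centroid by cosine
            for j in range(k):
                members = X[assign == j]
                if len(members):
                    c = members.sum(axis=0)
                    D[j] = c / (np.linalg.norm(c) + 1e-8)  # re-project
        return D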

    Mining User Personality from Music Listening Behavior in Online Platforms Using Audio Attributes

    Music and emotions are inherently intertwined. Humans leave hints of their personality everywhere, and their music listening behavior in particular reflects both conscious and unconscious tendencies and influences. It is therefore natural to ask whether the attributes of a piece of music reveal, or at least resonate with, the underlying character of its listener. This thesis focuses on audio attributes, i.e. latent song features, to determine human personality. Using unsupervised learning, we cluster several large music datasets with multiple clustering techniques. This analysis lets us classify song genres based on audio attributes, which can be deemed a novel contribution at the intersection of Music Information Retrieval (MIR) and human psychology. Existing research has found relationships between Myers-Briggs personality types and music genres. Our goal was to correlate audio attributes with music genre, which ultimately helps determine user personality from listening behavior on online music platforms. We achieved this by deriving users' personality traits from the audio feature values of the songs they listen to online, and we verified the decision process with a customized Music Recommendation System (MRS). Our model performs genre classification and personality detection with 78% and 74% accuracy, respectively. The results are promising compared to competing approaches, as they are explainable via statistics and visualizations. Furthermore, the MRS validates our approach with 81.3% accurate song suggestions. We believe the outcome of this thesis will serve as inspiration and assistance for fellow researchers in this area in producing more personalized song suggestions. As music preferences shape specific personality parameters, we expect more such elements to surface that portray individuals' daily activities and underlying mentality.
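
    An illustrative sketch of the clustering step, assuming Spotify-style audio attributes; the file name and column names are hypothetical, not the thesis's actual schema.

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    tracks = pd.read_csv("tracks.csv")  # hypothetical input file
    attrs = ["danceability", "energy", "valence", "tempo", "acousticness"]
    X = StandardScaler().fit_transform(tracks[attrs])
    tracks["cluster"] = KMeans(n_clusters=8, n_init=10).fit_predict(X)
    # Clusters can then be mapped to genres and, via genre-personality
    # correlations, to likely listener traits.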

    Learning feature hierarchies for musical audio signals

    Harmonic Change Detection from Musical Audio

    In this dissertation, we advance an enhanced method for computing Harte et al.'s [31] Harmonic Change Detection Function (HCDF). HCDF aims to detect harmonic transitions in musical audio signals and is crucial both for chord recognition in Music Information Retrieval (MIR) and for a wide range of creative applications. In light of recent advances in harmonic description and transformation, we depart from the original architecture of Harte et al.'s HCDF and revisit each of its component blocks, evaluating them with an exhaustive grid search that identifies optimal parameters across four large style-specific musical datasets. Our results show that the newly proposed methods and parameter optimization improve the detection of harmonic changes by 5.57% (f-score) over previous methods. Furthermore, while guaranteeing recall above 99%, our method improves precision by 6.28%. To leverage novel strategies for real-time harmonic-content audio processing, the optimized HCDF is made available for JavaScript and for the Max and Pure Data multimedia programming environments. Moreover, all the data, as well as the Python code used to generate them, are made available.
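
    A minimal sketch of the classic HCDF pipeline of Harte et al. (chroma, 6-D tonal centroid, temporal smoothing, frame-to-frame distance), for orientation only; the dissertation's optimized variant swaps in different component choices. The input file name and peak-picking parameters are assumptions.

    import numpy as np
    import librosa
    from scipy.ndimage import gaussian_filter1d

    y, sr = librosa.load("track.wav")            # hypothetical input
    tc = librosa.feature.tonnetz(y=y, sr=sr)     # 6 x n_frames tonal centroids
    tc = gaussian_filter1d(tc, sigma=8, axis=1)  # smooth along time
    hcdf = np.linalg.norm(np.diff(tc, axis=1), axis=0)  # change per frame
    peaks = librosa.util.peak_pick(hcdf, pre_max=3, post_max=3, pre_avg=3,
                                   post_avg=3, delta=0.1, wait=10)
    # 'peaks' indexes candidate harmonic-change frames.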