30,017 research outputs found

    Learning content-based metrics for music similarity

    In this abstract, we propose a method to learn application-specific content-based metrics for music similarity using unsupervised feature learning and neighborhood components analysis. Multiple-timescale features extracted from music audio are embedded into a Euclidean metric space, so that the distance between songs reflects their similarity. We evaluated the method on the GTZAN and Magnatagatune datasets.
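The embedding step described above can be sketched with scikit-learn's Neighborhood Components Analysis. The feature vectors and genre-like labels below are synthetic stand-ins for the multiple-timescale audio features, not the paper's actual data or pipeline.

```python
# Hedged sketch: embedding toy "song" feature vectors with Neighborhood
# Components Analysis (NCA) so that Euclidean distance in the learned
# space reflects class structure. Features and labels are synthetic.
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))       # 100 "songs", 20-dim features
y = rng.integers(0, 4, size=100)     # 4 genre-like classes
X += y[:, None] * 0.8                # inject class structure to learn

nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0)
Z = nca.fit(X, y).transform(X)       # songs embedded in a metric space
print(Z.shape)
```

After fitting, nearest-neighbour queries in `Z` use plain Euclidean distance, which is what makes the learned metric "content-based" yet cheap to search.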

    A Systematic Comparison of Music Similarity Adaptation Approaches

    In order to support individual user perspectives and different retrieval tasks, music similarity can no longer be considered a static element of Music Information Retrieval (MIR) systems. Various approaches have been proposed recently that allow dynamic adaptation of music similarity measures. This paper provides a systematic comparison of algorithms for metric learning and higher-level facet distance weighting on the MagnaTagATune dataset. A cross-validation variant taking into account clip availability is presented. Applied to user-generated similarity data, its effect on adaptation performance is analyzed. Special attention is paid to the amount of training data necessary for making similarity predictions on unknown data, the number of model parameters and the amount of information available about the music itself.
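The facet distance weighting idea above can be illustrated with a small sketch: given per-facet distance matrices (e.g. timbre, rhythm, harmony), learn nonnegative facet weights from relative constraints "clip a is closer to b than to c" via a hinge loss. The data and update rule here are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of higher-level facet distance weighting with synthetic
# per-facet distance matrices and relative similarity constraints.
import numpy as np

rng = np.random.default_rng(1)
n_clips, n_facets = 30, 3
D = rng.random((n_facets, n_clips, n_clips))
D = (D + D.transpose(0, 2, 1)) / 2           # symmetrize each facet

# Constraints (a, b, c): keep only triples consistent with facet 0,
# so a useful weighting should emphasize that facet.
triples = [(rng.integers(n_clips), rng.integers(n_clips),
            rng.integers(n_clips)) for _ in range(200)]
constraints = [(a, b, c) for a, b, c in triples if D[0, a, b] < D[0, a, c]]

w = np.ones(n_facets)                        # facet weights, nonnegative
lr = 0.05
for _ in range(100):
    for a, b, c in constraints:
        margin = (w @ D[:, a, c]) - (w @ D[:, a, b])
        if margin < 1.0:                     # violated or inside margin
            w += lr * (D[:, a, c] - D[:, a, b])
            w = np.clip(w, 0.0, None)
print(np.round(w / w.sum(), 2))              # normalized facet weights
```

The number of parameters here is just one weight per facet, which is why such facet-weighting models typically need far less training data than full metric learning.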

    Multiscale approaches to music audio feature learning

    Content-based music information retrieval tasks are typically solved with a two-stage approach: features are extracted from music audio signals, and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has been increased interest in multiscale representations of music audio recently. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset.
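Spherical K-means, the core algorithm named above, differs from standard K-means in that data and centroids are L2-normalized and assignment uses cosine similarity. A minimal sketch, with random vectors standing in for spectrogram patches:

```python
# Hedged sketch of spherical K-means dictionary learning: centroids live
# on the unit sphere and points are assigned by dot product (cosine).
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    C = X[rng.choice(len(X), size=k, replace=False)]  # init from data
    for _ in range(n_iter):
        assign = np.argmax(X @ C.T, axis=1)           # cosine assignment
        for j in range(k):
            members = X[assign == j]
            if len(members):
                c = members.sum(axis=0)
                C[j] = c / np.linalg.norm(c)          # re-project to sphere
    return C

patches = np.random.default_rng(2).normal(size=(500, 64))
dictionary = spherical_kmeans(patches, k=32)
features = np.maximum(patches @ dictionary.T, 0.0)    # simple encoding
print(dictionary.shape, features.shape)
```

A multiscale variant would run this same procedure on patches extracted at several timescales (e.g. different window lengths or pooling levels) and concatenate the resulting feature vectors.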

    Perceptual musical similarity metric learning with graph neural networks

    Sound retrieval for assisted music composition depends on evaluating similarity between musical instrument sounds, which is partly influenced by playing techniques. Previous methods utilizing Euclidean nearest neighbours over acoustic features show some limitations in retrieving sounds sharing equivalent timbral properties, but potentially generated using a different instrument, playing technique, pitch or dynamic. In this paper, we present a metric learning system designed to approximate human similarity judgments between extended musical playing techniques using graph neural networks. Such structures are a natural candidate for solving similarity retrieval tasks, yet have seen little application in modelling perceptual music similarity. We optimize a Graph Convolutional Network (GCN) over acoustic features via a proxy metric learning loss to learn embeddings that reflect perceptual similarities. Specifically, we construct the graph's adjacency matrix from the acoustic data manifold with an example-wise adaptive k-nearest neighbourhood graph: Adaptive Neighbourhood Graph Neural Network (AN-GNN). Our approach achieves 96.4% retrieval accuracy compared to 38.5% with a Euclidean metric and 86.0% with a multilayer perceptron (MLP), while effectively retrieving sounds whose playing techniques differ from those of the query example.
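The adjacency construction step can be sketched as follows. The per-example threshold rule used here (mean distance to the k nearest neighbours) is an assumption for illustration, not necessarily the paper's exact adaptive criterion.

```python
# Hedged sketch of an example-wise adaptive k-nearest-neighbour adjacency
# matrix over acoustic features, in the spirit of the AN-GNN description.
import numpy as np

def adaptive_knn_adjacency(X, k=5):
    # Pairwise Euclidean distances between feature vectors
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # ignore self-distances
    # Adaptive per-example threshold: mean of the k nearest distances,
    # so each example keeps a different number of neighbours.
    knn_d = np.sort(d, axis=1)[:, :k]
    thresh = knn_d.mean(axis=1, keepdims=True)
    A = (d <= thresh).astype(float)
    return np.maximum(A, A.T)                    # symmetrize (undirected)

X = np.random.default_rng(3).normal(size=(40, 16))
A = adaptive_knn_adjacency(X, k=5)
print(A.shape)
```

This adjacency matrix would then feed a GCN layer (e.g. normalized `A @ X @ W`), with the proxy metric learning loss applied to the resulting node embeddings.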

    Training of Tonal Similarity Ratings in Non-Musicians: A “Rapid Learning” Approach

    Although cognitive music psychology has a long tradition of expert–novice comparisons, experimental training studies are rare. Studies on the learning progress of trained novices in hearing harmonic relationships are still largely lacking. This paper presents a simple training concept using the example of tone/triad similarity ratings, demonstrating the gradual progress of non-musicians compared to musical experts: In a feedback-based “rapid learning” paradigm, participants had to decide for single tones and chords whether paired sounds matched each other well. Before and after the training sessions, they provided similarity judgments for a complete set of sound pairs. From these similarity matrices, individual relational sound maps, intended to display mental representations, were calculated by means of non-metric multidimensional scaling (NMDS), and were compared to an expert model through Procrustes transformation. Approximately half of the novices showed substantial learning success, with some participants even reaching the level of professional musicians. Results speak for a fundamental ability to quickly train an understanding of harmony, show inter-individual differences in learning success, and demonstrate the suitability of the scaling method used for learning research in music and other domains. Results are discussed in the context of the “giftedness” debate.
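The NMDS-plus-Procrustes analysis pipeline described above can be sketched with standard tooling. The dissimilarity data below are synthetic, and scikit-learn's non-metric MDS and scipy's Procrustes routine are stand-ins for the authors' exact software.

```python
# Hedged sketch: non-metric MDS on a dissimilarity matrix, then a
# Procrustes comparison of the resulting 2-D map to a reference
# ("expert") configuration. All data here are synthetic.
import numpy as np
from sklearn.manifold import MDS
from scipy.spatial import procrustes

rng = np.random.default_rng(4)
n = 12                                   # e.g. 12 tones/triads
D = rng.random((n, n))
D = (D + D.T) / 2                        # symmetric dissimilarities
np.fill_diagonal(D, 0.0)

nmds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
           random_state=0)
novice_map = nmds.fit_transform(D)       # relational sound map

expert_map = rng.normal(size=(n, 2))     # stand-in for the expert model
m1, m2, disparity = procrustes(expert_map, novice_map)
print(round(disparity, 3))               # lower = closer to the expert map
```

The Procrustes disparity removes translation, scaling and rotation before comparing configurations, so it measures only the residual difference in the relational structure of the two maps.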