Combining Sources of Description for Approximating Music Similarity Ratings
In this paper, we compare the effectiveness of basic acoustic features and genre annotations when adapting a music similarity model to user ratings. We use the Metric Learning to Rank algorithm to learn a Mahalanobis metric from comparative similarity ratings in the MagnaTagATune database. Using common formats for feature data, our approach can easily be transferred to other existing databases. Our results show that genre data allow more effective learning of a metric than simple audio features, but a combination of both feature sets clearly outperforms either individual set.
Learning music similarity from relative user ratings
Computational modelling of music similarity is an increasingly important part of personalisation and optimisation in music information retrieval and research in music perception and cognition. The use of relative similarity ratings is a new and promising approach to modelling similarity that avoids well-known problems with absolute ratings. In this article, we use relative ratings from the MagnaTagATune dataset with new and existing variants of state-of-the-art algorithms and provide the first comprehensive and rigorous evaluation of this approach. We compare metric learning based on support vector machines (SVMs) and metric-learning-to-rank (MLR), including a diagonal variant (DMLR) and a novel weighted variant, and relative distance learning with neural networks (RDNN). We further evaluate the effectiveness of different high- and low-level audio features and genre data, as well as dimensionality reduction methods, weighting of similarity ratings, and different sampling methods. Our results show that music similarity measures learnt on relative ratings can be significantly better than a standard Euclidean metric, depending on the choice of learning algorithm, feature sets and application scenario. MLR and SVM outperform DMLR and RDNN, while MLR with weighted ratings leads to no further performance gain. Timbral and music-structural features are most effective, and all features jointly are significantly better than any other combination of feature sets. Sharing audio clips (but not the similarity ratings) between test and training sets improves performance, in particular for the SVM-based methods, which is useful for some application scenarios. A testing framework has been implemented in Matlab and made publicly available at http://mi.soi.city.ac.uk/datasets/ir2012framework so that these results are reproducible.
Current Challenges and Visions in Music Recommender Systems Research
Music recommender systems (MRS) have experienced a boom in recent years,
thanks to the emergence and success of online streaming services, which
nowadays make almost all music in the world available at the user's fingertips.
While today's MRS considerably help users to find interesting music in these
huge catalogs, MRS research is still facing substantial challenges. In
particular, when it comes to building, incorporating, and evaluating recommendation
strategies that integrate information beyond simple user-item interactions or
content-based descriptors, and instead dig deep into the very essence of listener
needs, preferences, and intentions, MRS research becomes a big endeavor and
related publications are quite sparse.
The purpose of this trends and survey article is twofold. We first identify
and shed light on what we believe are the most pressing challenges MRS research
is facing, from both academic and industry perspectives. We review the state of
the art towards solving these challenges and discuss its limitations. Second,
we detail possible future directions and visions we contemplate for the further
evolution of the field. The article should therefore serve two purposes: giving
the interested reader an overview of current challenges in MRS research and
providing guidance for young researchers by identifying interesting, yet
under-researched, directions in the field.
Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research for recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in the future
developments. In addition to algorithms, physical aspects are described to
illustrate macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has a great
scientific depth and combines diverse research fields which makes it of
interest to physicists as well as interdisciplinary researchers.
Comment: 97 pages, 20 figures (to appear in Physics Reports)
Evaluating Recommender Systems Qualitatively: A survey and Comparative Analysis
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
Recommender systems have improved users' online quality of life by helping them find interesting and
valuable items within a large item set. Most recommender system validation research has focused on
accuracy metrics, studying the differences between the predicted and actual user ratings. However,
recent research has found accuracy to underperform when systems go live, mainly due to accuracy’s
inability to validate recommendation lists as a single entity, and has shifted to evaluating recommender
systems using "beyond-accuracy" metrics, like novelty and diversity.
In this dissertation, we summarize and organize the leading research regarding the definitions and
objectives of the beyond-accuracy metrics. Such metrics include coverage, diversity, novelty,
serendipity, unexpectedness, utility, and fairness. The behaviors and relationships of these metrics are
analyzed using four different models, two concerning the item characteristics (item-based) and two
regarding the user behaviors (user-based). Furthermore, a new metric is proposed that allows the
comparison of different models considering their overall beyond-accuracy performance. Using this
metric, a reranking approach is designed to improve the performance of a system, aiming to achieve
better recommendations. The impact of the reranking technique on each metric and algorithm is
studied, and the accuracy and non-accuracy performance of each system is compared. We realized
that, although the reranking technique can increase most beyond-accuracy metrics, the accuracy of
that system starts to worsen due to the negative correlation between these two dimensions. We also
found that item-based models tend to achieve much lower values of coverage and diversity than user-based models.
Learning a feature space for similarity in world music
In this study we investigate computational methods for assessing music similarity in world music styles. We use state-of-the-art audio features to describe musical content in world music recordings. Our music collection is a subset of the Smithsonian Folkways Recordings with audio examples from 31 countries from around the world. Using supervised and unsupervised dimensionality reduction techniques we learn feature representations for music similarity. We evaluate how well music styles separate in this learned space with a classification experiment. We obtained moderate performance classifying the recordings by country. Analysis of misclassifications revealed cases of geographical or cultural proximity. We further evaluate the learned space by detecting outliers, i.e. identifying recordings that stand out in the collection. We use a data mining technique based on Mahalanobis distances to detect outliers and perform a listening experiment in the ‘odd one out’ style to evaluate our findings. We are able to detect, amongst others, recordings of non-musical content as outliers as well as music with distinct timbral and harmonic content. The listening experiment reveals moderate agreement between subjects’ ratings and our outlier estimation.
A Standardised Procedure for Evaluating Creative Systems: Computational Creativity Evaluation Based on What it is to be Creative
Computational creativity is a flourishing research area, with a variety of creative systems being produced and developed. Creativity evaluation has not kept pace with system development, with an evident lack of systematic evaluation of the creativity of these systems in the literature. This is partially due to difficulties in defining what it means for a computer to be creative; indeed, there is no consensus on this for human creativity, let alone its computational equivalent. This paper proposes a Standardised Procedure for Evaluating Creative Systems (SPECS). SPECS is a three-step process: stating what it means for a particular computational system to be creative, deriving and performing tests based on these statements. To assist this process, the paper offers a collection of key components of creativity, identified empirically from discussions of human and computational creativity. Using this approach, the SPECS methodology is demonstrated through a comparative case study evaluating computational creativity systems that improvise music.