A Scalable Framework for Automatic Playlist Continuation on Music Streaming Services
Music streaming services often aim to recommend songs for users to extend the
playlists they have created on these services. However, extending playlists
while preserving their musical characteristics and matching user preferences
remains a challenging task, commonly referred to as Automatic Playlist
Continuation (APC). Besides, while these services often need to select the best
songs to recommend in real-time and among large catalogs with millions of
candidates, recent research on APC mainly focused on models with few
scalability guarantees and evaluated on relatively small datasets. In this
paper, we introduce a general framework to build scalable yet effective APC
models for large-scale applications. Based on a represent-then-aggregate
strategy, it ensures scalability by design while remaining flexible enough to
incorporate a wide range of representation learning and sequence modeling
techniques, e.g., based on Transformers. We demonstrate the relevance of this
framework through in-depth experimental validation on Spotify's Million
Playlist Dataset (MPD), the largest public dataset for APC. We also describe
how, in 2022, we successfully leveraged this framework to improve APC in
production on Deezer. We report results from a large-scale online A/B test on
this service, emphasizing the practical impact of our approach in such a
real-world application.
Comment: Accepted as a Full Paper at the SIGIR 2023 conference
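As a rough illustration of the represent-then-aggregate idea described in this abstract, the sketch below represents each song as a vector, aggregates the songs of a playlist into a single playlist vector, and scores candidates by dot product. The plain averaging step, the toy embeddings, and the names `aggregate` and `recommend` are assumptions made for this example; the abstract states that the framework can accommodate richer sequence models such as Transformers, and this is not the authors' implementation.

```python
from typing import Dict, List, Sequence

# Toy 2-dimensional song embeddings (hypothetical values for illustration).
EMBEDDINGS: Dict[str, Sequence[float]] = {
    "rock_a": [1.0, 0.0],
    "rock_b": [0.9, 0.1],
    "rock_c": [0.8, 0.2],
    "jazz_a": [0.0, 1.0],
    "jazz_b": [0.1, 0.9],
}


def aggregate(playlist: List[str], embeddings: Dict[str, Sequence[float]]) -> List[float]:
    """Aggregate step: average the embeddings of the songs in the playlist."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for song in playlist:
        for i, value in enumerate(embeddings[song]):
            total[i] += value
    return [t / len(playlist) for t in total]


def recommend(playlist: List[str], embeddings: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Score every unseen song by dot product with the playlist vector."""
    playlist_vector = aggregate(playlist, embeddings)
    seen = set(playlist)
    scores = {
        song: sum(a * b for a, b in zip(playlist_vector, vector))
        for song, vector in embeddings.items()
        if song not in seen
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))[:k]


if __name__ == "__main__":
    print(recommend(["rock_a", "rock_b"], EMBEDDINGS, k=2))
```

Because song vectors can be precomputed and scoring reduces to a dot product, the candidate search in a design like this can be delegated to a nearest-neighbor index, which is one way such a strategy can stay scalable.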
A collaborative filtering method for music recommendation
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.
The present dissertation focuses on proposing and describing a collaborative filtering approach for
Music Recommender Systems. Music Recommender Systems, which are part of a broader class of
Recommender Systems, refer to the task of automatically filtering data to predict the songs that are
more likely to match a particular profile.
So far, academic researchers have proposed a variety of machine learning approaches for determining
which tracks to recommend to users. The most sophisticated among them often consist of complex
learning techniques, which can also require considerable computational resources. However, recent
research studies have shown that simpler approaches based on nearest neighbors can lead to
good results, often at much lower computational costs, representing a viable alternative solution to
the Music Recommender System problem.
Throughout this thesis, we conduct offline experiments on a freely-available collection of listening
histories from real users, each one containing several different music tracks. We extract a subset of
10,000 songs to assess the performance of the proposed system, comparing it with a Popularity-based
model approach. Furthermore, we provide a conceptual overview of the recommendation problem,
describing the state-of-the-art methods, and presenting its current challenges. Finally, the last section
is dedicated to summarizing the essential conclusions and presenting possible future improvements.
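The comparison mentioned in this abstract, a nearest-neighbor recommender against a popularity baseline, can be sketched on toy listening histories as follows. The item-based formulation, the cosine similarity over co-occurrence counts, the toy data, and all function names are assumptions chosen for illustration; the dissertation's actual method and dataset may differ.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt
from typing import Dict, List

# Hypothetical listening histories, one list of track ids per user.
HISTORIES: List[List[str]] = [
    ["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"], ["c", "e"], ["c", "a"],
]


def item_similarities(histories: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Cosine similarity between tracks based on how often they co-occur."""
    item_count: Counter = Counter()
    co_count: Dict[str, Counter] = defaultdict(Counter)
    for history in histories:
        items = set(history)
        item_count.update(items)
        for a, b in combinations(sorted(items), 2):
            co_count[a][b] += 1
            co_count[b][a] += 1
    return {
        a: {b: c / sqrt(item_count[a] * item_count[b]) for b, c in neighbors.items()}
        for a, neighbors in co_count.items()
    }


def recommend_knn(history: List[str], sims: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Sum the similarities of unseen tracks to the tracks already listened to."""
    seen = set(history)
    scores: Counter = Counter()
    for item in seen:
        for other, sim in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def recommend_popular(history: List[str], histories: List[List[str]], k: int) -> List[str]:
    """Baseline: recommend the globally most listened-to unseen tracks."""
    seen = set(history)
    counts = Counter(i for h in histories for i in set(h) if i not in seen)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


if __name__ == "__main__":
    sims = item_similarities(HISTORIES)
    print(recommend_knn(["a"], sims, k=1), recommend_popular(["a"], HISTORIES, k=1))
```

On this toy data the two approaches disagree for a user who has only listened to track `a`: the neighbor-based model favors the frequently co-listened track `b`, while the baseline returns the globally popular track `c`.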
Large-Scale User Modeling with Recurrent Neural Networks for Music Discovery on Multiple Time Scales
The amount of content on online music streaming platforms is immense, and
most users only access a tiny fraction of this content. Recommender systems are
the application of choice to open up the collection to these users.
Collaborative filtering has the disadvantage that it relies on explicit
ratings, which are often unavailable, and generally disregards the temporal
nature of music consumption. On the other hand, item co-occurrence algorithms,
such as the recently introduced word2vec-based recommenders, are typically left
without an effective user representation. In this paper, we present a new
approach to model users through recurrent neural networks by sequentially
processing consumed items, represented by any type of embeddings and other
context features. This way we obtain semantically rich user representations,
which capture a user's musical taste over time. Our experimental analysis on
large-scale user data shows that our model can be used to predict future songs
a user will likely listen to, both in the short and long term.
Comment: Author pre-print version, 20 pages, 6 figures, 4 tables
ENSA dataset: a dataset of songs by non-superstar artists tested with an emotional analysis based on time-series
This paper presents a novel dataset of songs by non-superstar artists in which a set of musical data is collected, identifying for each song its musical structure, and the emotional perception of the artist through a categorical emotional labeling process. The generation of this preliminary dataset is motivated by the existence of biases that have been detected in the analysis of the most used datasets in the field of emotion-based music recommendation. This new dataset contains 234 min of audio and 60 complete and labeled songs. In addition, an emotional analysis is carried out based on the representation of dynamic emotional perception through a time-series approach, in which the similarity values generated by the dynamic time warping (DTW) algorithm are analyzed and then used to implement a clustering process with the K-means algorithm. In the same way, clustering is also implemented with a Uniform Manifold Approximation and Projection (UMAP) technique, which is a manifold learning and dimension reduction algorithm. The algorithm HDBSCAN is applied for determining the optimal number of clusters. The results obtained from the different clustering strategies are compared and, in a preliminary analysis, a significant consistency is found between them. With the findings and experimental results obtained, a discussion is presented highlighting the importance of working with complete songs, preferably with a well-defined musical structure, considering the emotional variation that characterizes a song during the listening experience, in which the intensity of the emotion usually changes between verse, bridge, and chorus
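The dynamic time warping (DTW) step mentioned in this abstract can be sketched with the classic dynamic-programming recurrence. The absolute-difference local cost, the function name `dtw_distance`, and the toy emotion-intensity curves are assumptions for illustration; the paper's exact DTW configuration and the subsequent K-means, UMAP, and HDBSCAN steps are not reproduced here.

```python
from typing import List, Sequence


def dtw_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Classic DTW with absolute difference as the local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of x[:i] against y[:j].
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical emotion-intensity curves sampled over two songs.
    verse_chorus = [0.0, 1.0, 2.0]
    stretched = [0.0, 1.0, 1.0, 2.0]
    print(dtw_distance(verse_chorus, stretched))
```

Two curves with the same shape but different pacing get a distance of zero under this measure, which is why DTW suits songs whose emotional changes occur at different moments. A matrix of pairwise distances computed this way could then feed a clustering step.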
Text-based Sentiment Analysis and Music Emotion Recognition
Nowadays, with the expansion of social media, large amounts of user-generated
texts like tweets, blog posts or product reviews are shared online. Sentiment polarity
analysis of such texts has become highly attractive and is utilized in recommender
systems, market predictions, business intelligence and more. We also witness deep
learning techniques becoming top performers on those types of tasks. There are
however several problems that need to be solved for efficient use of deep neural
networks on text mining and text polarity analysis.
First of all, deep neural networks are data hungry. They need to be fed with
datasets that are big in size, cleaned and preprocessed as well as properly labeled.
Second, the modern natural language processing concept of word embeddings as a
dense and distributed text feature representation solves sparsity and dimensionality
problems of the traditional bag-of-words model. Still, there are various uncertainties
regarding the use of word vectors: should they be generated from the same dataset
that is used to train the model, or is it better to source them from big and popular
collections that work as generic text feature representations? Third, it is not easy for
practitioners to find a simple and highly effective deep learning setup for various
document lengths and types. Recurrent neural networks are weak with longer texts
and optimal convolution-pooling combinations are not easily conceived. It is thus
convenient to have generic neural network architectures that are effective and can
adapt to various texts, encapsulating much of design complexity.
This thesis addresses the above problems to provide methodological and practical
insights for utilizing neural networks on sentiment analysis of texts and achieving
state-of-the-art results. Regarding the first problem, the effectiveness of various
crowdsourcing alternatives is explored and two medium-sized and emotion-labeled
song datasets are created utilizing social tags. One of the research interests of Telecom
Italia was the exploration of relations between music emotional stimulation and
driving style. Consequently, a context-aware music recommender system that aims
to enhance driving comfort and safety was also designed. To address the second
problem, a series of experiments with large text collections of various contents and
domains were conducted. Word embeddings of different parameters were exercised
and results revealed that their quality is influenced (mostly but not only) by the
size of texts they were created from. When working with small text datasets, it is
thus important to source word features from popular and generic word embedding
collections. Regarding the third problem, a series of experiments involving convolutional
and max-pooling neural layers were conducted. Various patterns relating
text properties and network parameters with optimal classification accuracy were
observed. Combining convolutions of words, bigrams, and trigrams with regional
max-pooling layers in a couple of stacks produced the best results. The derived
architecture achieves competitive performance on sentiment polarity analysis of
movie, business and product reviews.
Given that labeled data are becoming the bottleneck of the current deep learning
systems, a future research direction could be the exploration of various data programming
possibilities for constructing even bigger labeled datasets. Investigation
of feature-level or decision-level ensemble techniques in the context of deep neural
networks could also be fruitful. Different feature types do usually represent complementary
characteristics of data. Combining word embedding and traditional text
features or utilizing recurrent networks on document splits and then aggregating the
predictions could further increase prediction accuracy of such models
Exploiting sparsity to build efficient kernel based collaborative filtering for top-N item recommendation
The increasing availability of implicit feedback datasets has raised interest in developing effective collaborative filtering techniques able to deal asymmetrically with unambiguous positive feedback and ambiguous negative feedback. In this paper, we propose a principled kernel-based collaborative filtering method for top-N item recommendation with implicit feedback. We present an efficient implementation using the linear kernel, and we show how to generalize it to kernels of the dot product family while preserving the efficiency. We also investigate the elements that influence the sparsity of a standard cosine kernel. This analysis shows that the sparsity of the kernel strongly depends on the properties of the dataset, in particular on the long tail distribution. We compare our method with state-of-the-art algorithms, achieving good results both in terms of efficiency and effectiveness.
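To make the link between a long-tailed dataset and kernel sparsity more concrete, the sketch below computes a cosine kernel between items from binary implicit feedback and measures the fraction of item pairs whose kernel value is zero. Treating the kernel as item-to-item, the toy data, and the names `cosine_kernel` and `kernel_sparsity` are assumptions for this example; this is not the method proposed in the paper.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Set

# Hypothetical implicit feedback: item -> set of users who interacted with it.
# One popular item and three long-tail items with a single user each.
ITEM_USERS: Dict[str, Set[int]] = {
    "hit": {1, 2, 3, 4},
    "tail_1": {1},
    "tail_2": {2},
    "tail_3": {3},
}


def cosine_kernel(a: Set[int], b: Set[int]) -> float:
    """Cosine similarity between two binary vectors stored as user sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def kernel_sparsity(item_users: Dict[str, Set[int]]) -> float:
    """Fraction of distinct item pairs whose cosine kernel value is zero."""
    pairs = list(combinations(sorted(item_users), 2))
    if not pairs:
        return 0.0
    zeros = sum(
        1 for i, j in pairs if cosine_kernel(item_users[i], item_users[j]) == 0.0
    )
    return zeros / len(pairs)


if __name__ == "__main__":
    print(kernel_sparsity(ITEM_USERS))
```

In this toy setting the long-tail items share no users with one another, so half of the off-diagonal kernel entries are zero and only the pairs involving the popular item are non-zero.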