Quick Lists: Enriched Playlist Embeddings for Future Playlist Recommendation
Recommending playlists to users in the context of a digital music service is
a difficult task because a playlist is often more than the mere sum of its
parts. We present a novel method for generating playlist embeddings that are
invariant to playlist length and sensitive to local and global track ordering.
The embeddings also capture information about playlist sequencing, and are
enriched with side information about the playlist user. We show that these
embeddings are useful for generating next-best playlist recommendations, and
that side information can be used to address the cold start problem.
Analysis of nutrition data by means of a matrix factorization method
We present a factorization framework to analyze the data of a regression learning task with two peculiarities. First, inputs can be split into two parts that represent semantically significant entities. Second, the performance of regressors is very low. The basic idea of the approach presented here is to try to learn the ordering relations of the target variable instead of its exact value. Each part of the input is mapped into a common Euclidean space in such a way that the distance in the common space is the representation of the interaction of both parts of the input. The factorization approach obtains reliable models from which it is possible to compute a ranking of the features according to their responsibility in the variation of the target variable. Additionally, the Euclidean representation of data provides a visualization where metric properties have a clear semantics. We illustrate the approach with a case study: the analysis of a dataset about the variations of Body Mass Index for Age of children after a Food Aid Program deployed in poor rural communities in Southern México. In this case, the two parts of inputs are the vectorial representation of children and their diets. In addition to discovering latent information, the mapping of inputs allows us to visualize children and diets in a common metric space.
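The idea of mapping both parts of an input into a common Euclidean space and learning order rather than exact values can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the linear maps, the hinge-style pairwise loss, the margin, and the convention that a larger distance means a larger target value are all assumptions, as are the names `linear_map`, `interaction`, and `pairwise_hinge`.

```python
import math

def linear_map(matrix, vector):
    """Project a vector into the common space (assumed: a plain linear map)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def interaction(w_first, w_second, first_part, second_part):
    """Euclidean distance between the two projected input parts."""
    u = linear_map(w_first, first_part)
    v = linear_map(w_second, second_part)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pairwise_hinge(score_higher, score_lower, margin=1.0):
    """Penalty when the example with the higher target is not scored
    at least `margin` above the other one (assumed sign convention)."""
    return max(0.0, margin - (score_higher - score_lower))

# Toy data: identity maps and made-up 2-D vectors, purely hypothetical.
identity = [[1.0, 0.0], [0.0, 1.0]]
child = [1.0, 0.0]
diet_a = [1.0, 0.0]
diet_b = [4.0, 4.0]

score_a = interaction(identity, identity, child, diet_a)
score_b = interaction(identity, identity, child, diet_b)

# No penalty if (child, diet_b) has the higher target; a penalty otherwise.
loss_correct = pairwise_hinge(score_b, score_a)
loss_wrong = pairwise_hinge(score_a, score_b)
```

In a real model the two matrices would be learned by minimizing this kind of loss over many ordered pairs; the sketch only shows how a distance in the common space can play the role of the interaction score.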
Word2vec embeddings for playlist recommendation
Bachelor's thesis (Treballs Finals de Grau) in Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2018, Supervisors: Santi Seguí Mesquida and Guillem Pascual i Guinovart. We present a machine learning approach to music playlist recommendation. Using Word2Vec, a shallow two-layer neural network trained to reconstruct the linguistic context of words, we created several embeddings using tracks and playlist titles as words of an artificial vocabulary. Experiments with different trade-offs between the diversity and the popularity of songs in playlists are analyzed and discussed. By combining a track embedding and a title embedding, our recommender reached an accuracy of 19 percent. Our model was created and trained using the Million Playlist Dataset (MPD) provided by Spotify as part of the RecSys Challenge 2018.
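To make the "tracks as words" idea concrete, the sketch below treats each playlist as a sentence and generates the (target, context) pairs that a skip-gram style Word2Vec model would be trained on. It is a minimal illustration, not the thesis code: the function name `skipgram_pairs`, the window size, and the track identifiers are hypothetical, and the abstract does not say whether skip-gram or CBOW was used.

```python
from typing import List, Tuple

def skipgram_pairs(playlists: List[List[str]], window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs, treating each playlist as a sentence."""
    pairs = []
    for playlist in playlists:
        for i, target in enumerate(playlist):
            # Context = neighbours within `window` positions of the target track.
            start = max(0, i - window)
            stop = min(len(playlist), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pairs.append((target, playlist[j]))
    return pairs

# Made-up playlists; the track IDs are placeholders, not real data.
playlists = [
    ["track_a", "track_b", "track_c"],
    ["track_b", "track_d"],
]
pairs = skipgram_pairs(playlists, window=1)
```

Tracks that often appear near each other in playlists produce many shared pairs, which is what pushes their learned vectors close together; playlist titles could be tokenized and fed through the same procedure to obtain a separate title embedding.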
Support the Underground: Characteristics of Beyond-Mainstream Music Listeners
Music recommender systems have become an integral part of music streaming
services such as Spotify and Last.fm to assist users navigating the extensive
music collections offered by them. However, while music listeners interested in
mainstream music are traditionally served well by music recommender systems,
users interested in music beyond the mainstream (i.e., non-popular music)
rarely receive relevant recommendations. In this paper, we study the
characteristics of beyond-mainstream music and music listeners and analyze to
what extent these characteristics impact the quality of music recommendations
provided. Therefore, we create a novel dataset consisting of Last.fm listening
histories of several thousand beyond-mainstream music listeners, which we
enrich with additional metadata describing music tracks and music listeners.
Our analysis of this dataset shows four subgroups within the group of
beyond-mainstream music listeners that differ not only with respect to their
preferred music but also with respect to their demographic characteristics. Furthermore,
we evaluate the quality of the music recommendations that four different
recommendation algorithms provide to these subgroups, and we find
significant differences between the groups. Specifically, our results show a
positive correlation between a subgroup's openness towards music listened to by
members of other subgroups and recommendation accuracy. We believe that our
findings provide valuable insights for developing improved user models and
recommendation approaches to better serve beyond-mainstream music listeners.
Comment: Accepted for publication in EPJ Data Science - link to published
version will be added.
Learning Contextualized Semantics from Co-occurring Terms via a Siamese Architecture
One of the biggest challenges in Multimedia information retrieval and
understanding is to bridge the semantic gap by properly modeling concept
semantics in context. The presence of out of vocabulary (OOV) concepts
exacerbates this difficulty. To address the semantic gap issues, we formulate a
problem on learning contextualized semantics from descriptive terms and propose
a novel Siamese architecture to model the contextualized semantics from
descriptive terms. By means of pattern aggregation and probabilistic topic
models, our Siamese architecture captures contextualized semantics from the
co-occurring descriptive terms via unsupervised learning, which leads to a
concept embedding space of the terms in context. Furthermore, the co-occurring
OOV concepts can be easily represented in the learnt concept embedding space.
The main properties of the concept embedding space are demonstrated via
visualization. Using various settings in semantic priming, we have carried out
a thorough evaluation by comparing our approach to a number of state-of-the-art
methods on six annotation corpora in different domains, i.e., MagTag5K, CAL500
and Million Song Dataset in the music domain as well as Corel5K, LabelMe and
SUNDatabase in the image domain. Experimental results on semantic priming
suggest that our approach outperforms those state-of-the-art methods
considerably in various aspects.