The Music Streaming Sessions Dataset
At the core of many important machine learning problems faced by online
streaming services is a need to model how users interact with the content.
These problems can often be reduced to a combination of 1) sequentially
recommending items to the user, and 2) exploiting the user's interactions with
the items as feedback for the machine learning model. Unfortunately, there are
no public datasets currently available that enable researchers to explore this
topic. In order to spur that research, we release the Music Streaming Sessions
Dataset (MSSD), which consists of approximately 150 million listening sessions
and associated user actions. Furthermore, we provide audio features and
metadata for the approximately 3.7 million unique tracks referred to in the
logs. This is the largest collection of such track metadata currently available
to the public. This dataset enables research on important problems including
how to model user listening and interaction behaviour in streaming, as well as
Music Information Retrieval (MIR), and session-based sequential
recommendations.
Comment: 3 pages, introducing a new large-scale dataset
Investigating bias in Music Recommender Systems
Music Recommender Systems (MRS) are software applications that provide personalized music recommendations based on user preferences and listening history. They analyze data to suggest music that aligns with individual tastes, enhancing the music discovery experience. This thesis investigates the influence of record labels across different music recommendation datasets and evaluates their impact on recommender systems. It also expands the scope and experimentation of prior research on bias within the feedback loops of MRS. To study the labels' effect, the datasets are preprocessed and fed into a multi-stage web crawler that retrieves record label information for individual albums, together with an assignment of each label to a major record company (Universal, Sony, Warner) or to the independent category. This crawler is used to enrich our dataset collection. Based on the additional information, we show the differing characteristics of the datasets and identify particular biases in their user-generated music collections, such as playlists and listening profiles. Moreover, recommender system experiments are conducted, presenting results of feedback loop simulations in which the stability of the record label distribution in longitudinal recommendations is studied. All findings and gathered record label information are made publicly available to the research community.
Leveraging Microblogs for Spatiotemporal Music Information Retrieval
We present results of text data mining experiments for music retrieval, analyzing microblogs gathered from November 2011 to September 2012 to infer music listening patterns around the world. We assess relationships between particular music preferences and spatiotemporal properties, such as month, weekday, and country, as well as the temporal stability of listening activities. The findings of our study will help improve music retrieval and recommendation systems by allowing geospatial and cultural information to be incorporated into models for music retrieval, which has not been looked into before.