119 research outputs found

    Revisit Behavior in Social Media: The Phoenix-R Model and Discoveries

    How many listens will an artist receive on an online radio? How about plays on a YouTube video? How many of these visits are from new or returning users? Modeling and mining popularity dynamics of social activity has important implications for researchers, content creators and providers. We here investigate the effect of revisits (successive visits from a single user) on content popularity. Using four datasets of social activity, with up to tens of millions of media objects (e.g., YouTube videos, Twitter hashtags or LastFM artists), we show the effect of revisits on the popularity evolution of such objects. We then propose the Phoenix-R model, which captures the popularity dynamics of individual objects. Phoenix-R has the desired properties of being: (1) parsimonious, being based on the minimum description length principle, and achieving lower root mean squared error than state-of-the-art baselines; (2) applicable, being effective for predicting future popularity values of objects.
    Comment: To appear at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 201
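To make the notion of a revisit concrete, here is a minimal sketch that splits one object's visit log into first-time visits and revisits. The function name `split_visits` and the toy log are illustrative assumptions, not part of Phoenix-R, and the paper's exact operational definition of a revisit may differ.

```python
def split_visits(user_ids):
    """Count first-time visits and revisits for a single media object.

    `user_ids` is a hypothetical list holding one user id per visit,
    in chronological order. A visit counts as a revisit when the same
    user has already visited the object earlier in the log.
    """
    seen = set()
    new_visits = 0
    revisits = 0
    for uid in user_ids:
        if uid in seen:
            revisits += 1
        else:
            seen.add(uid)
            new_visits += 1
    return new_visits, revisits


# Toy visit log for one video (made-up user ids).
log = ["ana", "bo", "ana", "cy", "ana", "bo"]
new_visits, revisits = split_visits(log)
print(f"new: {new_visits}, revisits: {revisits}")
```

Under this reading, total popularity is the sum of the two counts, so an object can keep gaining plays from a stable audience without reaching any new users.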

    Investigating bias in Music Recommender Systems

    Music Recommender Systems (MRS) are software applications that provide personalized music recommendations based on user preferences and listening history. They analyze data to suggest music that aligns with individual tastes, enhancing the music discovery experience. This thesis aims to investigate the influence of record labels across different music recommendation datasets and evaluate their impact on recommender systems. Additionally, it seeks to expand the scope and experimentation of prior research on bias within feedback loops of MRS. To study their effect, the datasets are preprocessed and fed into a multi-stage web crawler that retrieves record label information for individual albums as well as an assignment to a major record company (Universal, Sony, Warner) or independent. This crawler is used to enrich our dataset collection. Based on the additional information, we can show different characteristics and identify particular biases in user-generated music collections such as playlists and listening profiles. Moreover, recommender system experiments are conducted, presenting results of feedback loop simulations in which the stability of the record label distribution in longitudinal recommendations is studied. All findings and gathered record label information are made publicly available to the research community.

    User geospatial context for music recommendation in microblogs

    Music information retrieval and music recommendation are seeing a paradigm shift towards methods that incorporate user context aspects. However, structured experiments on a standardized music dataset to investigate the effects of doing so are scarce. In this paper, we compare the performance of various combinations of collaborative filtering and geospatial as well as cultural user models for the task of music recommendation. To this end, we propose a geospatial model that uses GPS coordinates and a cultural model that uses semantic locations (continent, country, and state of the user). We conduct experiments on a novel standardized music collection, the “Million Musical Tweets Dataset” of listening events extracted from microblogs. Overall, we find that modeling listeners’ locations via Gaussian mixture models and computing similarities from these outperforms both cultural user models and collaborative filtering.
    Categories and Subject Descriptors: Information systems [Information retrieval]: Music recommendation; Human-centered computing [Collaborative and social computing]: Social media

    Localized Events in Social Media Streams: Detection, Tracking, and Recommendation

    With the recent proliferation of social media channels and the immense amount of user-generated content, interest in social media mining is increasing. Messages continuously posted via these channels report a broad range of topics from daily life to global and local events. As a consequence, this has opened new opportunities for mining event information crucial in many application domains, especially in increasing the situational awareness in critical scenarios. Interestingly, many of these messages are enriched with location information, due to the wide spread of mobile devices and the recent advancements of today’s location acquisition techniques. This enables location-aware event mining, i.e., the detection and tracking of localized events. In this thesis, we propose novel frameworks and models that digest social media content for localized event detection, tracking, and recommendation. We first develop KeyPicker, a framework to extract and score event-related keywords in an online fashion, accounting for high levels of noise, temporal heterogeneity and outliers in the data. Then, LocEvent is proposed to incrementally detect and track events using a 4-stage procedure. That is, LocEvent receives the keywords extracted by KeyPicker, identifies local keywords, spatially clusters them, and finally scores the generated clusters. For each detected event, a set of descriptive keywords, a location, and a time interval are estimated at a fine-grained resolution. In addition to the sparsity of geo-tagged messages, people sometimes post about events far away from an event’s location. Such spatial problems are handled by novel spatial regularization techniques, namely, graph- and gazetteer-based regularization. To ensure scalability, we utilize a hierarchical spatial index in addition to a multi-stage filtering procedure that gradually suppresses noisy words and considers only event-related ones for complex spatial computations.
    As for recommendation applications, we propose an event recommender system built upon model-based collaborative filtering. Our model is able to suggest events to users, taking into account a number of contextual features including the social links between users, the topical similarities of events, and the spatio-temporal proximity between users and events. To realize this model, we employ and adapt matrix factorization, which allows for uncovering latent user-event patterns. Our proposed features contribute to directing the learning process towards recommendations that better suit the taste of users, in particular when new users have very sparse (or even no) event attendance history. To evaluate the effectiveness and efficiency of our proposed approaches, extensive comparative experiments are conducted using datasets collected from social media channels. Our analysis of the experimental results reveals the superiority and advantages of our frameworks over existing methods in terms of the relevancy and precision of the obtained results.
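As a rough illustration of the matrix factorization idea mentioned in the abstract above, the following sketch learns latent user and event factors from a toy attendance table with plain stochastic gradient descent. It is a generic baseline under stated assumptions, not the thesis's model: the contextual features (social links, topical similarity, spatio-temporal proximity) are omitted, and all names, hyperparameters, and data are hypothetical.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=500, lr=0.05, reg=0.02, seed=0):
    """Learn latent factors P (users) and Q (items) with regularized SGD.

    `ratings` is a list of (user, item, value) triples; all arguments are
    illustrative assumptions rather than settings from the thesis.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted affinity of user u for item i (dot product of factors)."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def mse(ratings, P, Q):
    """Mean squared error of the factor model on the given triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


# Toy attendance data: 1.0 = attended, 0.0 = did not attend (made up).
observed = [
    (0, 0, 1.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 1.0),
    (2, 2, 1.0), (3, 2, 1.0), (0, 2, 0.0), (2, 0, 0.0), (3, 1, 0.0),
]
P, Q = factorize(observed, n_users=4, n_items=3)
print("training MSE:", round(mse(observed, P, Q), 4))
print("user 1 -> event 2 score:", round(predict(P, Q, 1, 2), 3))
```

The last line scores a user-event pair that never appears in the training triples, which is the basic mechanism a factorization-based recommender uses to rank unseen events.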

    STREAMCUBE: Hierarchical spatio-temporal hashtag clustering for event exploration over the Twitter stream


    Investigating redundancy in emoji use: study on a Twitter-based corpus

    In this paper we present an annotated corpus created with the aim of analyzing the informative behaviour of emoji, an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets, each containing at least one emoji and each annotated with one of three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected, and describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus, considering also possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements. Peer-reviewed.

    Platform pop: disentangling Spotify’s intermediary role in the music industry

    It has been widely recognized that platforms utilize their editorial capacity to transform the industries they intermediate. In this paper, we examine the intermediary role of the leading audio streaming platform, Spotify, in the recorded music industry. Spotify is often called the ‘new radio’ for the influence it has on breaking songs and artists, and for the role it plays in music discovery and consumption. Our purpose is to determine whether Spotify is leveling the playing field or entrenching hierarchies between major labels and independent labels. We attempt to answer this question through a longitudinal analysis of content owners (major labels or ‘indies’) and formats (albums, tracks, or playlists) promoted by Spotify through its global Twitter account: @Spotify. As a carefully curated venue for corporate speech, @Spotify provides a window into continuities and changes in Spotify’s corporate strategy. By using @Spotify as a proxy through which to track patterns of promotion between the years 2012 and 2018, this paper offers a novel empirical examination of how Spotify is shaping the consumption of music, and in turn the structure of the recording industry. In doing so, we provide evidence for speculating about the future of the recorded music industry in a platform era.