3,286 research outputs found

    The Music Streaming Sessions Dataset

    At the core of many important machine learning problems faced by online streaming services is a need to model how users interact with the content. These problems can often be reduced to a combination of 1) sequentially recommending items to the user, and 2) exploiting the user's interactions with the items as feedback for the machine learning model. Unfortunately, there are no public datasets currently available that enable researchers to explore this topic. To spur that research, we release the Music Streaming Sessions Dataset (MSSD), which consists of approximately 150 million listening sessions and associated user actions. Furthermore, we provide audio features and metadata for the approximately 3.7 million unique tracks referred to in the logs. This is the largest collection of such track metadata currently available to the public. This dataset enables research on important problems including how to model user listening and interaction behaviour in streaming, as well as Music Information Retrieval (MIR) and session-based sequential recommendations. Comment: 3 pages, introducing a new large scale dataset

    Personalized Video Recommendation Using Rich Contents from Videos

    Video recommendation has become an essential way of helping people explore massive video collections and discover the ones that may be of interest to them. In existing video recommender systems, the models make recommendations based on user-video interactions and a single specific content feature. When that specific content feature is unavailable, the performance of the existing models deteriorates seriously. Inspired by the fact that rich contents (e.g., text, audio, motion, and so on) exist in videos, in this paper we explore how to use these rich contents to overcome the limitations caused by the unavailability of the specific ones. Specifically, we propose a novel general framework that incorporates an arbitrary single content feature with user-video interactions, named the collaborative embedding regression (CER) model, to make effective video recommendations in both in-matrix and out-of-matrix scenarios. Our extensive experiments on two real-world large-scale datasets show that CER beats the existing recommender models with any single content feature and is more time efficient. In addition, we propose a priority-based late fusion (PRI) method to gain the benefit of integrating multiple content features. The corresponding experiment shows that PRI brings real performance improvement over the baseline and outperforms the existing fusion methods.

    How to Retrain Recommender System? A Sequential Meta-Learning Method

    Practical recommender systems need to be periodically retrained to refresh the model with new interaction data. To pursue high model fidelity, it is usually desirable to retrain the model on both historical and new data, since this accounts for both long-term and short-term user preference. However, a full model retraining can be very time-consuming and memory-costly, especially when the scale of historical data is large. In this work, we study the model retraining mechanism for recommender systems, a topic of high practical value that has been relatively little explored in the research community. Our first belief is that retraining the model on historical data is unnecessary, since the model has been trained on it before. Nevertheless, normal training on new data only may easily cause overfitting and forgetting issues, since the new data is of a smaller scale and contains less information on long-term user preference. To address this dilemma, we propose a new training method that aims to abandon the historical data during retraining by learning to transfer the past training experience. Specifically, we design a neural network-based transfer component, which transforms the old model into a new model that is tailored for future recommendations. To learn the transfer component well, we optimize the "future performance" -- i.e., the recommendation accuracy evaluated in the next time period. Our Sequential Meta-Learning (SML) method offers a general training paradigm that is applicable to any differentiable model. We demonstrate SML on matrix factorization and conduct experiments on two real-world datasets. Empirical results show that SML not only achieves significant speed-up, but also outperforms full model retraining in recommendation accuracy, validating the effectiveness of our proposals. We release our code at: https://github.com/zyang1580/SML. Comment: Appear in SIGIR 202

    Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks

    Session-based recommendations are highly relevant in many modern on-line services (e.g. e-commerce, video streaming) and recommendation settings. Recently, Recurrent Neural Networks have been shown to perform very well in session-based settings. While in many session-based recommendation domains user identifiers are hard to come by, there are also domains in which user profiles are readily available. We propose a seamless way to personalize RNN models with cross-session information transfer and devise a Hierarchical RNN model that relays and evolves latent hidden states of the RNNs across user sessions. Results on two industry datasets show large improvements over the session-only RNNs.

    Diverse personalized recommendations with uncertainty from implicit preference data with the Bayesian Mallows Model

    Clicking data, which exists in abundance and contains objective user preference information, is widely used to produce personalized recommendations in web-based applications. Current popular recommendation algorithms, typically based on matrix factorizations, often have high accuracy and achieve good clickthrough rates. However, diversity of the recommended items, which can greatly enhance user experiences, is often overlooked. Moreover, most algorithms do not produce interpretable uncertainty quantifications of the recommendations. In this work, we propose the Bayesian Mallows for Clicking Data (BMCD) method, which augments clicking data into compatible full ranking vectors by enforcing all the clicked items to be top-ranked. User preferences are learned using a Mallows ranking model. Bayesian inference leads to interpretable uncertainties of each individual recommendation, and we also propose a method to make personalized recommendations based on such uncertainties. With a simulation study and a real-life data example, we demonstrate that, compared to state-of-the-art matrix factorization, BMCD makes personalized recommendations with similar accuracy, while achieving a much higher level of diversity and producing interpretable and actionable uncertainty estimation. Comment: 27 pages
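
    The augmentation step described in this abstract, turning a user's clicks into a full ranking in which every clicked item is top-ranked, can be illustrated with a minimal sketch. The function name, the integer item ids, and the uniform random ordering within each group are assumptions made here for illustration; the abstract does not say how BMCD draws the augmented rankings, and this is not the authors' implementation.

```python
import random


def augment_clicks_to_ranking(clicked, n_items, rng=None):
    """Return a dict item -> rank (1 = best) compatible with the clicks.

    Hypothetical helper: clicked items occupy ranks 1..c and unclicked
    items occupy ranks c+1..n. The order inside each group is drawn
    uniformly at random, which is an assumption of this sketch.
    """
    rng = rng or random.Random()
    clicked_set = set(clicked)
    top = sorted(clicked_set)
    rest = [i for i in range(n_items) if i not in clicked_set]
    rng.shuffle(top)
    rng.shuffle(rest)
    order = top + rest
    return {item: rank for rank, item in enumerate(order, start=1)}


# Six items with ids 0..5; the user clicked items 2 and 5.
ranking = augment_clicks_to_ranking([2, 5], n_items=6, rng=random.Random(0))
```

    Whatever the random draw, items 2 and 5 receive ranks 1 and 2 in some order and the remaining items fill ranks 3 to 6, so the result is a full ranking vector that a ranking model could consume.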