312 research outputs found

    Deep Exploration for Recommendation Systems

    Modern recommendation systems ought to benefit by probing for and learning from delayed feedback. Research has tended to focus on learning from a user's response to a single recommendation. Such work, which leverages methods of supervised and bandit learning, forgoes learning from the user's subsequent behavior. Where past work has aimed to learn from subsequent behavior, there has been a lack of effective methods for probing to elicit informative delayed feedback. Effective exploration through probing for delayed feedback becomes particularly challenging when rewards are sparse. To address this, we develop deep exploration methods for recommendation systems. In particular, we formulate recommendation as a sequential decision problem and demonstrate benefits of deep exploration over single-step exploration. Our experiments are carried out with high-fidelity industrial-grade simulators and establish large improvements over existing algorithms.

    How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

    Recommendation systems are ubiquitous and impact many domains; they have the potential to influence product consumption, individuals' perceptions of the world, and life-altering decisions. These systems are often evaluated or trained with data from users already exposed to algorithmic recommendations; this creates a pernicious feedback loop. Using simulations, we demonstrate how using data confounded in this way homogenizes user behavior without increasing utility.

    Multi-List Recommendations for Personalizing Streaming Content

    Choosing a recommender system that yields accurate recommendations yet lets users explore more content has been a research topic in recent decades. This work attempts to find a recommender system for TV 2 Play, a movie streaming platform, that would perform well on implicit feedback data and provide multi-lists as recommendations. Several approaches are examined for suitability, and Collaborative Filtering and Multi-Armed Bandits are selected. The models for each approach are built using the pipeline utilized by TV 2 Play. The models are then compared on several evaluation metrics in the first stage of offline testing, yielding Alternating Least Squares and Bayesian Personalized Ranking as the best-performing models. The second stage of offline testing compares the two models and their BM25-weighted variants against each other. The unweighted Bayesian Personalized Ranking model achieved the highest user-centric metrics while maintaining relatively high recommendation-centric metrics, which led to that model being tested in online settings against the algorithm currently used by the TV 2 Play team. The online testing revealed that our model underperforms compared to the TV 2 Play model when used on the kids' page but produces equally good results on the movies page. The results can be attributed to the differences in behavioral content consumption patterns between users.
    Master's Thesis in Informatics. INF399, MAMN-PROG, MAMN-IN