69,291 research outputs found

    How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

    Full text link
    Recommendation systems are ubiquitous and impact many domains; they have the potential to influence product consumption, individuals' perceptions of the world, and life-altering decisions. These systems are often evaluated or trained with data from users already exposed to algorithmic recommendations; this creates a pernicious feedback loop. Using simulations, we demonstrate how using data confounded in this way homogenizes user behavior without increasing utility

    GPU Based Path Integral Control with Learned Dynamics

    Full text link
    We present an algorithm which combines recent advances in model based path integral control with machine learning approaches to learning forward dynamics models. We take advantage of the parallel computing power of a GPU to quickly take a massive number of samples from a learned probabilistic dynamics model, which we use to approximate the path integral form of the optimal control. The resulting algorithm runs in a receding-horizon fashion in realtime, and is subject to no restrictive assumptions about costs, constraints, or dynamics. A simple change to the path integral control formulation allows the algorithm to take model uncertainty into account during planning, and we demonstrate its performance on a quadrotor navigation task. In addition to this novel adaptation of path integral control, this is the first time that a receding-horizon implementation of iterative path integral control has been run on a real system.Comment: 6 pages, NIPS 2014 - Autonomously Learning Robots Worksho

    Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

    Full text link
    We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allows to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers having little data themselves, but that can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy.Comment: Accepted by WSDM 201

    A Probabilistic Model for the Cold-Start Problem in Rating Prediction using Click Data

    Full text link
    One of the most efficient methods in collaborative filtering is matrix factorization, which finds the latent vector representations of users and items based on the ratings of users to items. However, a matrix factorization based algorithm suffers from the cold-start problem: it cannot find latent vectors for items to which previous ratings are not available. This paper utilizes click data, which can be collected in abundance, to address the cold-start problem. We propose a probabilistic item embedding model that learns item representations from click data, and a model named EMB-MF, that connects it with a probabilistic matrix factorization for rating prediction. The experiments on three real-world datasets demonstrate that the proposed model is not only effective in recommending items with no previous ratings, but also outperforms competing methods, especially when the data is very sparse.Comment: ICONIP 201

    Reduced perplexity: Uncertainty measures without entropy

    Full text link
    Conference paper presented at Recent Advances in Info-Metrics, Washington, DC, 2014. Under review for a book chapter in "Recent innovations in info-metrics: a cross-disciplinary perspective on information and information processing" by Oxford University Press.A simple, intuitive approach to the assessment of probabilistic inferences is introduced. The Shannon information metrics are translated to the probability domain. The translation shows that the negative logarithmic score and the geometric mean are equivalent measures of the accuracy of a probabilistic inference. Thus there is both a quantitative reduction in perplexity as good inference algorithms reduce the uncertainty and a qualitative reduction due to the increased clarity between the original set of inferences and their average, the geometric mean. Further insight is provided by showing that the Renyi and Tsallis entropy functions translated to the probability domain are both the weighted generalized mean of the distribution. The generalized mean of probabilistic inferences forms a Risk Profile of the performance. The arithmetic mean is used to measure the decisiveness, while the -2/3 mean is used to measure the robustness

    MBMF: Model-Based Priors for Model-Free Reinforcement Learning

    Full text link
    Reinforcement Learning is divided in two main paradigms: model-free and model-based. Each of these two paradigms has strengths and limitations, and has been successfully applied to real world domains that are appropriate to its corresponding strengths. In this paper, we present a new approach aimed at bridging the gap between these two paradigms. We aim to take the best of the two paradigms and combine them in an approach that is at the same time data-efficient and cost-savvy. We do so by learning a probabilistic dynamics model and leveraging it as a prior for the intertwined model-free optimization. As a result, our approach can exploit the generality and structure of the dynamics model, but is also capable of ignoring its inevitable inaccuracies, by directly incorporating the evidence provided by the direct observation of the cost. Preliminary results demonstrate that our approach outperforms purely model-based and model-free approaches, as well as the approach of simply switching from a model-based to a model-free setting.Comment: After we submitted the paper for consideration in CoRL 2017 we found a paper published in the recent past with a similar method (see related work for a discussion). Considering the similarities between the two papers, we have decided to retract our paper from CoRL 201
    • …
    corecore