
    Counterfactual Estimation and Optimization of Click Metrics for Search Engines

    Optimizing an interactive system against a predefined online metric is particularly challenging when the metric is computed from user feedback such as clicks and payments. The key challenge is the counterfactual nature of the problem: in Web search, any change to a component of the search engine may produce a different result page for the same query, but we normally cannot reliably infer from the search log how users would react to the new page. Consequently, it appears impossible to accurately estimate online metrics that depend on user feedback unless the new engine is run to serve users and compared with a baseline in an A/B test. This approach, while valid and successful, is expensive and time-consuming. In this paper, we propose to address this problem using causal inference techniques under the contextual-bandit framework. This approach effectively allows one to run (potentially infinitely) many A/B tests offline from the search log, making it possible to estimate and optimize online metrics quickly and inexpensively. Focusing on an important component in a commercial search engine, we show how these ideas can be instantiated and applied, and obtain very promising results that suggest the wide applicability of these techniques.
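
The abstract does not say which estimator the paper uses. As a minimal sketch of what "running an A/B test offline from the search log" can look like, the Python below applies inverse propensity scoring (IPS), a standard off-policy estimator in the contextual-bandit setting. It assumes each log record stores the probability with which the logging engine chose the action it showed. The names `ips_estimate`, `simulate_logs`, and `always_show`, and the simulated click rates, are hypothetical and chosen only for illustration.

```python
import random


def ips_estimate(logs, target_policy):
    """Estimate the average reward a new policy would have obtained.

    logs: list of (context, action, reward, logging_prob) tuples, where
          logging_prob is the probability the logging policy gave to `action`.
    target_policy(context, action): probability the new policy picks `action`.
    """
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = target_policy(context, action) / logging_prob
        total += weight * reward
    return total / len(logs)


def simulate_logs(n, click_rates, seed=0):
    """Hypothetical log: a uniform-random logging policy over result layouts."""
    rng = random.Random(seed)
    n_actions = len(click_rates)
    logs = []
    for _ in range(n):
        action = rng.randrange(n_actions)
        reward = 1.0 if rng.random() < click_rates[action] else 0.0
        logs.append((None, action, reward, 1.0 / n_actions))
    return logs


def always_show(action_to_show):
    """Deterministic candidate policy that always picks one action."""
    def policy(context, action):
        return 1.0 if action == action_to_show else 0.0
    return policy


if __name__ == "__main__":
    logs = simulate_logs(20000, click_rates=[0.2, 0.6], seed=1)
    for candidate in (0, 1):
        print(candidate, ips_estimate(logs, always_show(candidate)))
```

Because the weights only need the logged probabilities, any number of candidate policies can be scored against the same log. The estimate is unbiased only if the logging policy gives nonzero probability to every action the candidate policy might take, and its variance grows as the two policies diverge.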

    A Theoretical Analysis of Two-Stage Recommendation for Cold-Start Collaborative Filtering

    In this paper, we present a theoretical framework for tackling the cold-start collaborative filtering problem, where unknown targets (items or users) keep coming to the system, and there is a limited number of resources (users or items) that can be allocated and related to them. The solution requires a trade-off between exploitation and exploration: with limited recommendation opportunities, we need, on the one hand, to allocate the most relevant resources right away, and, on the other hand, to allocate resources that are useful for learning the target's properties so that more relevant ones can be recommended in the future. We study a simple two-stage recommendation that combines a sequential and a batch solution. We first model the problem as a partially observable Markov decision process (POMDP) and provide an exact solution. Then, through an in-depth analysis of the POMDP value iteration solution, we identify that an exact solution can be abstracted as selecting resources that are not only highly relevant to the target according to the initial-stage information, but also highly correlated, either positively or negatively, with other potential resources for the next stage. With this finding, we propose an approximate solution to ease the intractability of the exact solution. Our initial results on synthetic data and the MovieLens 100K dataset confirm the performance gains of our theoretical development and analysis.
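
The finding about correlated resources can be illustrated with a toy model. The sketch below is not the paper's POMDP formulation or its approximate solution. It assumes, purely for illustration, that a new target's ratings follow a known multivariate Gaussian prior, that the first-stage rating is observed, and that the second stage recommends the item with the highest posterior mean. All names and numbers (`two_stage_value`, `EXAMPLE_MU`, `EXAMPLE_COV`) are hypothetical.

```python
import random


def posterior_means(mu, cov, i, observed, noise_var=0.0):
    """Gaussian conditioning: mean of every item after observing item i."""
    denom = cov[i][i] + noise_var
    return [mu[j] + cov[j][i] / denom * (observed - mu[i])
            for j in range(len(mu))]


def two_stage_value(mu, cov, i, noise_var=0.0, n_samples=20000, seed=0):
    """Monte Carlo estimate of expected total reward when item i goes first.

    Stage 1 earns mu[i] in expectation; stage 2 earns the largest posterior
    mean among the remaining items, averaged over the possible observations.
    """
    rng = random.Random(seed)
    std = (cov[i][i] + noise_var) ** 0.5
    total = 0.0
    for _ in range(n_samples):
        observed = rng.gauss(mu[i], std)
        post = posterior_means(mu, cov, i, observed, noise_var)
        total += max(post[j] for j in range(len(mu)) if j != i)
    return mu[i] + total / n_samples


def choose_first_item(mu, cov):
    """Pick the first-stage item with the highest estimated two-stage value."""
    values = [two_stage_value(mu, cov, i) for i in range(len(mu))]
    return max(range(len(mu)), key=lambda i: values[i])


# Hypothetical prior: item 0 looks best but is uncorrelated with the rest;
# item 1 is slightly worse but strongly correlated (+0.9 / -0.9) with 2 and 3.
EXAMPLE_MU = [0.5, 0.4, 0.4, 0.4]
EXAMPLE_COV = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.9, -0.9],
    [0.0, 0.9, 1.0, -0.81],
    [0.0, -0.9, -0.81, 1.0],
]

if __name__ == "__main__":
    greedy = max(range(len(EXAMPLE_MU)), key=lambda i: EXAMPLE_MU[i])
    print("greedy first item:", greedy)
    print("two-stage first item:", choose_first_item(EXAMPLE_MU, EXAMPLE_COV))
```

In this toy prior, a purely greedy rule recommends item 0 first, while the two-stage criterion prefers item 1: its rating is informative about items 2 and 3 in either direction, so the second recommendation can be much better targeted.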