Counterfactual Estimation and Optimization of Click Metrics for Search Engines
Optimizing an interactive system against a predefined online metric is
particularly challenging when the metric is computed from user feedback such
as clicks and payments. The key challenge is the counterfactual nature: in the
case of Web search, any change to a component of the search engine may result
in a different search result page for the same query, but we normally cannot
infer reliably from search logs how users would react to the new result page.
Consequently, it appears impossible to accurately estimate online metrics that
depend on user feedback, unless the new engine is run to serve users and
compared with a baseline in an A/B test. This approach, while valid and
successful, is unfortunately expensive and time-consuming. In this paper, we
propose to address this problem using causal inference techniques, under the
contextual-bandit framework. This approach effectively allows one to run
(potentially infinitely) many A/B tests offline from search logs, making it
possible to estimate and optimize online metrics quickly and inexpensively.
Focusing on an important component in a commercial search engine, we show how
these ideas can be instantiated and applied, and obtain very promising results
that suggest the wide applicability of these techniques.
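
The abstract above does not say which estimator the paper uses. As a minimal sketch of how an offline "A/B test" can be run from logged data in a contextual-bandit setting, the following Python shows inverse propensity scoring (IPS), one standard estimator for this purpose. It assumes each log record stores the probability with which the logging policy chose the shown action. The names `ips_estimate`, `demo_logs`, `always_a`, and `query_aware` are hypothetical and the log is toy data.

```python
def ips_estimate(logs, target_policy):
    """Estimate the average reward a target policy would have earned.

    logs: list of (context, action, reward, propensity) tuples, where
          propensity is the probability (> 0) that the logging policy
          chose `action` for `context`. This record layout is an assumption.
    target_policy: function mapping a context to the action it would choose.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # Only records where the target policy agrees with the logged action
        # contribute; reweighting by 1/propensity corrects for how often the
        # logging policy showed that action.
        if target_policy(context) == action:
            total += reward / propensity
    return total / len(logs)


# Toy log: two queries, two candidate actions, uniform random logging policy.
demo_logs = [
    ("q1", "a", 1.0, 0.5),  # action "a" was clicked for query q1
    ("q1", "b", 0.0, 0.5),
    ("q2", "a", 0.0, 0.5),
    ("q2", "b", 1.0, 0.5),  # action "b" was clicked for query q2
]


def always_a(context):
    return "a"


def query_aware(context):
    return "a" if context == "q1" else "b"


print(ips_estimate(demo_logs, always_a))
print(ips_estimate(demo_logs, query_aware))
```

Any number of candidate policies can be scored against the same log in this way, which is the sense in which many offline comparisons become cheap. The estimate is only meaningful when the logging policy gives nonzero probability to every action a candidate policy might choose.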
A Theoretical Analysis of Two-Stage Recommendation for Cold-Start Collaborative Filtering
In this paper, we present a theoretical framework for tackling the cold-start
collaborative filtering problem, where unknown targets (items or users) keep
coming to the system, and there is a limited number of resources (users or
items) that can be allocated and related to them. The solution requires a
trade-off between exploitation and exploration: with limited recommendation
opportunities, we need, on the one hand, to allocate the most relevant
resources right away and, on the other hand, to allocate resources that are
useful for learning the target's properties so that more relevant ones can be
recommended in the future. In this paper, we study a
simple two-stage recommendation scheme that combines a sequential and a batch
solution. We first model the problem as a partially observable Markov
decision process (POMDP) and provide an exact solution. Then, through an
in-depth analysis of the POMDP value iteration solution, we find that an
exact solution can be abstracted as selecting resources that are not only
highly relevant to the target according to the initial-stage information, but
also highly correlated, either positively or negatively, with other potential
resources for the next stage. With this finding, we propose an approximate
solution to ease the intractability of the exact solution. Our initial results
on synthetic data and the MovieLens 100K dataset confirm the performance gains
suggested by our theoretical development and analysis.
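
The finding stated in the abstract above (prefer first-stage resources that are both relevant and strongly correlated, positively or negatively, with other candidates) can be illustrated with a simple scoring heuristic. This is not the paper's POMDP solution or its approximate solution. The additive form, the `tradeoff` weight, and the names `pearson`, `first_stage_scores`, `demo_ratings`, and `demo_relevance` are assumptions made for illustration, and the ratings are toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def first_stage_scores(relevance, item_ratings, tradeoff=0.5):
    """Score each item by relevance plus its mean absolute correlation
    with every other item (a hypothetical proxy for how much feedback on
    this item reveals about the remaining candidates)."""
    n = len(item_ratings)
    scores = []
    for i in range(n):
        # Absolute value: a strong negative correlation is as informative
        # as a strong positive one.
        abs_corr = [
            abs(pearson(item_ratings[i], item_ratings[j]))
            for j in range(n)
            if j != i
        ]
        scores.append(relevance[i] + tradeoff * sum(abs_corr) / len(abs_corr))
    return scores


# Toy data: ratings of three items by the same four known users.
demo_ratings = [
    [1, 2, 3, 4],  # item 0
    [4, 3, 2, 1],  # item 1: perfectly anti-correlated with item 0
    [1, 3, 1, 3],  # item 2: weakly correlated with the others
]
demo_relevance = [0.5, 0.4, 0.6]  # initial-stage relevance estimates

exploit_only = first_stage_scores(demo_relevance, demo_ratings, tradeoff=0.0)
balanced = first_stage_scores(demo_relevance, demo_ratings, tradeoff=0.5)
print(exploit_only.index(max(exploit_only)))
print(balanced.index(max(balanced)))
```

With `tradeoff=0.0` the choice is driven by relevance alone, while a positive weight can shift the first-stage choice toward an item whose feedback says more about the other candidates.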