BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback
In this paper, we study the problem of safe online learning to re-rank, where
user feedback is used to improve the quality of displayed lists. Learning to
rank has traditionally been studied in two settings. In the offline setting,
rankers are typically learned from relevance labels created by judges. This
approach has generally become standard in industrial applications of ranking,
such as search. However, this approach lacks exploration and thus is limited by
the information content of the offline training data. In the online setting, an
algorithm can experiment with lists and learn from feedback on them in a
sequential fashion. Bandit algorithms are well-suited for this setting but they
tend to learn user preferences from scratch, which results in a high initial
cost of exploration. This poses an additional challenge of safe exploration in
ranked lists. We propose BubbleRank, a bandit algorithm for safe re-ranking
that combines the strengths of both the offline and online settings. The
algorithm starts with an initial base list and improves it online by gradually
exchanging higher-ranked, less attractive items for lower-ranked, more attractive
items. We prove an upper bound on the n-step regret of BubbleRank that degrades
gracefully with the quality of the initial base list. Our theoretical findings
are supported by extensive experiments on a large-scale real-world click
dataset.
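The re-ranking loop sketched in the abstract — randomly perturbing adjacent pairs of a base list, crediting the clicked item of each pair, and promoting an item once the click evidence is strong — can be illustrated as follows. The function names, the simple click-credit rule, and the fixed `threshold` (standing in for the paper's confidence-interval test) are illustrative assumptions, not the authors' reference implementation.

```python
import random
from collections import defaultdict

def display_list(base_list, rng):
    """Randomly swap disjoint adjacent pairs of the base list for exploration."""
    displayed = list(base_list)
    offset = rng.randint(0, 1)                 # pair slots (0,1),(2,3),... or (1,2),(3,4),...
    for k in range(offset, len(displayed) - 1, 2):
        if rng.random() < 0.5:                 # explore: show this pair in swapped order
            displayed[k], displayed[k + 1] = displayed[k + 1], displayed[k]
    return displayed

def update_and_rerank(base_list, displayed, clicks, s, n, threshold=10):
    """Credit the clicked item of each displayed adjacent pair, then bubble a
    lower-ranked item up in the base list once its net click advantage over its
    neighbour exceeds `threshold` (a stand-in for a confidence-interval test)."""
    for k in range(len(displayed) - 1):
        i, j = displayed[k], displayed[k + 1]
        ci, cj = clicks.get(i, 0), clicks.get(j, 0)
        if ci != cj:                           # informative only when exactly one was clicked
            n[(i, j)] += 1
            s[(i, j)] += ci - cj               # >0: upper item i out-clicked lower item j
    for k in range(len(base_list) - 1):
        i, j = base_list[k], base_list[k + 1]
        if s[(j, i)] - s[(i, j)] > threshold:  # j has clearly out-clicked i
            base_list[k], base_list[k + 1] = j, i
    return base_list
```

Because an item moves only one position per confident comparison, a good base list is disturbed little, which is the safety property the regret bound formalizes.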
Position Bias Estimation for Unbiased Learning-to-Rank in eCommerce Search
The Unbiased Learning-to-Rank framework has been recently proposed as a
general approach to systematically remove biases, such as position bias, from
learning-to-rank models. The method takes two steps: estimating click
propensities and using them to train unbiased models. Most common methods
proposed in the literature for estimating propensities involve some degree of
intervention in the live search engine. An alternative approach proposed
recently uses an Expectation Maximization (EM) algorithm to estimate
propensities by using ranking features for estimating relevances. In this work
we propose a novel method to directly estimate propensities which does not use
any intervention in live search or rely on modeling relevance. Rather, we take
advantage of the fact that the same query-document pair may naturally change
ranks over time. This typically occurs in eCommerce search because of changes in item popularity over time, time-dependent ranking features, or the addition or removal of items from the index (an item getting sold or a new item being listed). However, our method is general and can be applied to any
search engine for which the rank of the same document may naturally change over
time for the same query. We derive a simple likelihood function that depends on
propensities only, and by maximizing the likelihood we are able to get
estimates of the propensities. We apply this method to eBay search data to
estimate click propensities for web and mobile search and compare these with
estimates using the EM method. We also use simulated data to show that the
method gives reliable estimates of the "true" simulated propensities. Finally,
we train an unbiased learning-to-rank model for eBay search using the estimated
propensities and show that it outperforms both baselines: one without position bias correction and one with position bias correction using the EM method.
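The likelihood idea can be made concrete. When the same query-document pair is shown at two different ranks and receives exactly one click, the probability that the click fell at rank a rather than rank b is p_a/(p_a + p_b), which depends on propensities only — relevance cancels. Maximizing this likelihood (here by plain gradient ascent on log-propensities, with rank 1 pinned to propensity 1) recovers the propensity curve. This is a sketch of that idea under those assumptions, not the paper's exact estimator.

```python
import math
from collections import defaultdict

def estimate_propensities(events, n_ranks, lr=0.5, iters=2000):
    """events: (clicked_rank, other_rank) pairs, one per query-document pair that
    appeared at both ranks and drew exactly one click. Returns propensity
    estimates normalised so that rank 1 has propensity 1."""
    w = defaultdict(int)                      # w[(a, b)] = # events shown at a and b, clicked at a
    for a, b in events:
        w[(a, b)] += 1
    theta = [0.0] * (n_ranks + 1)             # theta[k] = log propensity of rank k
    for _ in range(iters):
        grad = [0.0] * (n_ranks + 1)
        for (a, b), cnt in w.items():
            pa, pb = math.exp(theta[a]), math.exp(theta[b])
            g = pb / (pa + pb)                # d/d theta_a of log(pa / (pa + pb))
            grad[a] += cnt * g
            grad[b] -= cnt * g
        for k in range(2, n_ranks + 1):       # theta[1] stays 0 for identifiability
            theta[k] += lr * grad[k] / len(events)
    return [math.exp(theta[k]) for k in range(1, n_ranks + 1)]
```

In the paper's setting the rank changes arise naturally from index churn; the estimator itself needs only the (clicked rank, other rank) pairs, with no live intervention and no relevance model.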
Implications of Computational Cognitive Models for Information Retrieval
This dissertation explores the implications of computational cognitive modeling for information retrieval. The parallel between information retrieval and human memory is that the goal of an information retrieval system is to find the set of documents most relevant to the query, whereas the goal of the human memory system is to assess the relevance of items stored in memory given a memory probe (Steyvers & Griffiths, 2010).
The two major topics of this dissertation are desirability and information scent. Desirability is the context-independent probability of an item receiving attention (Recker & Pitkow, 1996). Desirability has been widely used to model the probability that a given memory item will be retrieved (Anderson, 2007). Information scent is a context-dependent measure defined as the utility of an information item (Pirolli & Card, 1996b). Information scent has been widely used to predict which memory item will be retrieved given a probe (Anderson, 2007) and to predict the browsing behavior of humans (Pirolli & Card, 1996b).
In this dissertation, I proposed the theory that the desirability observed in human memory is caused by preferential attachment in networks. Additionally, I showed that documents accessed in large repositories mirror the statistical properties observed in human memory and that these properties can be used to improve document ranking. Finally, I showed that the combination of information scent and desirability improves document ranking over existing well-established approaches.
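The combination of the two signals can be illustrated with a toy scorer: desirability enters as a context-independent log-prior built from past access counts, and information scent as a context-dependent query-document match. The scoring form, the add-one smoothing, and the mixing weight are hypothetical choices for illustration, not the model evaluated in the dissertation.

```python
import math

def rank_documents(query_terms, docs, access_counts, scent_weight=5.0):
    """docs: doc_id -> list of terms; access_counts: doc_id -> past access count.
    Score = desirability (smoothed log access prior, context independent)
          + information scent (query-term overlap, context dependent)."""
    total = sum(access_counts.values())
    scores = {}
    for doc_id, terms in docs.items():
        prior = (access_counts.get(doc_id, 0) + 1) / (total + len(docs))
        desirability = math.log(prior)                     # add-one smoothed prior
        overlap = len(set(query_terms) & set(terms))
        scent = overlap / max(len(query_terms), 1)         # fraction of query matched
        scores[doc_id] = desirability + scent_weight * scent
    return sorted(scores, key=scores.get, reverse=True)
```

When the query gives no scent, the desirability prior alone orders the documents — mirroring the claim that access statistics by themselves already improve ranking.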