Query Chains: Learning to Rank from Implicit Feedback
This paper presents a novel approach for using clickthrough data to learn
ranked retrieval functions for web search results. We observe that users
searching the web often perform a sequence, or chain, of queries with a similar
information need. Using query chains, we generate new types of preference
judgments from search engine logs, thus taking advantage of user intelligence
in reformulating queries. To validate our method we perform a controlled user
study comparing generated preference judgments to explicit relevance judgments.
We also implemented a real-world search engine to test our approach, using a
modified ranking SVM to learn an improved ranking function from preference
data. Our results demonstrate significant improvements in the ranking given by
the search engine. The learned rankings outperform both a static ranking
function and one trained without considering query chains.
Comment: 10 page
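As a rough illustration of how preference judgments might be derived from clickthrough logs across a query chain, here is a minimal Python sketch. It assumes one common heuristic (a clicked result is preferred over unclicked results ranked above it) and assumes that such pairs are also credited to earlier queries in the same chain. The function names and the log layout are hypothetical, and the rules actually used in the paper may differ.

```python
from typing import List, Set, Tuple

Session = Tuple[str, List[str], Set[str]]  # (query, ranking shown, clicked docs)


def click_skip_preferences(ranking: List[str], clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) pairs for one result list.

    Assumed heuristic: each clicked document is preferred over every
    unclicked document that was ranked above it.
    """
    prefs = []
    for i, doc in enumerate(ranking):
        if doc in clicked:
            for above in ranking[:i]:
                if above not in clicked:
                    prefs.append((doc, above))
    return prefs


def chain_preferences(chain: List[Session]) -> List[Tuple[str, str, str]]:
    """Return (query, preferred, less_preferred) triples for a query chain.

    Assumption for illustration: pairs found for a query are attached to
    that query and to every earlier query in the same chain.
    """
    out = []
    for j, (_, ranking, clicked) in enumerate(chain):
        pairs = click_skip_preferences(ranking, clicked)
        for earlier_query, _, _ in chain[: j + 1]:
            out.extend((earlier_query, a, b) for a, b in pairs)
    return out


# Hypothetical chain: the user reformulates "jaguar" as "jaguar car".
example_chain = [
    ("jaguar", ["d1", "d2"], set()),
    ("jaguar car", ["d3", "d4", "d5"], {"d5"}),
]
example_prefs = chain_preferences(example_chain)
```

Triples of this kind could then serve as pairwise training constraints for a ranking learner such as a ranking SVM.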
Use of implicit graph for recommending relevant videos: a simulated evaluation
In this paper, we propose a model for exploiting community-based usage information for video retrieval. Implicit usage information from a pool of past users could be a valuable source for addressing the difficulties caused by the semantic gap. We propose a graph-based implicit feedback model in which all the usage information can be represented. A number of recommendation algorithms were proposed and tested. A simulated user evaluation was conducted on the TREC VID collection and the results are presented. Analyzing the results, we found some characteristics common to the best-performing algorithms, which could indicate the best way of exploiting this type of usage information.
Estimating Position Bias without Intrusive Interventions
Presentation bias is one of the key challenges when learning from implicit
feedback in search engines, as it confounds the relevance signal. While it was
recently shown how counterfactual learning-to-rank (LTR) approaches
(Joachims et al., 2017) can provably overcome presentation bias when
observation propensities are known, it remains to show how to effectively
estimate these propensities. In this paper, we propose the first method for
producing consistent propensity estimates without manual relevance judgments,
disruptive interventions, or restrictive relevance modeling assumptions. First,
we show how to harvest a specific type of intervention data from historic
feedback logs of multiple different ranking functions, and show that this data
is sufficient for consistent propensity estimation in the position-based model.
Second, we propose a new extremum estimator that makes effective use of this
data. In an empirical evaluation, we find that the new estimator provides
superior propensity estimates in two real-world systems -- Arxiv Full-text
Search and Google Drive Search. Beyond these two points, we find that the
method is robust to a wide range of settings in simulation studies.
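To make the position-based model concrete: it assumes the click probability factorizes as P(click | item, rank k) = p_k * rel(item), where p_k is the observation propensity at rank k. If different rankers placed the same item at two different ranks, the relevance term cancels in the ratio of click rates. The Python sketch below uses that idea with a plain ratio estimator. It is not the extremum estimator proposed in the paper, and the function name and log layout are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def relative_propensity(
    log: Iterable[Tuple[str, int, bool]], rank: int, reference_rank: int = 1
) -> float:
    """Estimate p_rank / p_reference_rank under the position-based model.

    log holds (item_id, rank_shown, clicked) records. Only items shown at
    both ranks contribute, so the per-item relevance cancels in the ratio.
    """
    shows = defaultdict(int)   # (item, rank) -> impressions
    clicks = defaultdict(int)  # (item, rank) -> clicks
    for item, shown_rank, clicked in log:
        shows[(item, shown_rank)] += 1
        clicks[(item, shown_rank)] += int(clicked)

    numerator = denominator = 0.0
    for item in {item for item, _ in shows}:
        if (item, rank) in shows and (item, reference_rank) in shows:
            numerator += clicks[(item, rank)] / shows[(item, rank)]
            denominator += clicks[(item, reference_rank)] / shows[(item, reference_rank)]

    if denominator == 0:
        raise ValueError("no clicks at the reference rank for shared items")
    return numerator / denominator


# Made-up log: items "a" and "b" were each shown at ranks 1 and 3.
toy_log = (
    [("a", 1, True)] * 8 + [("a", 1, False)] * 2
    + [("a", 3, True)] * 4 + [("a", 3, False)] * 6
    + [("b", 1, True)] * 4 + [("b", 1, False)] * 6
    + [("b", 3, True)] * 2 + [("b", 3, False)] * 8
)
ratio_3_vs_1 = relative_propensity(toy_log, rank=3)
```

In this toy log each item is clicked half as often at rank 3 as at rank 1, so the estimated ratio comes out close to 0.5. Estimates of this kind are what a counterfactual LTR method would use to reweight clicks by inverse propensity.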
The Economics of Internet Search
This lecture provides an introduction to the economics of Internet search engines. After a brief review of the historical development of the technology and the industry, I describe some of the economic features of the auction system used for displaying ads. It turns out that some relatively simple economic models provide significant insight into the operation of these auctions. In particular, the classical theory of two-sided matching markets turns out to be very useful in this context.
Sensitive and Scalable Online Evaluation with Theoretical Guarantees
Multileaved comparison methods generalize interleaved comparison methods to
provide a scalable approach for comparing ranking systems based on regular user
interactions. Such methods enable the increasingly rapid research and
development of search engines. However, existing multileaved comparison methods
that provide reliable outcomes do so by degrading the user experience during
evaluation. Conversely, current multileaved comparison methods that maintain
the user experience cannot guarantee correctness. Our contribution is two-fold.
First, we propose a theoretical framework for systematically comparing
multileaved comparison methods using the notions of considerateness, which
concerns maintaining the user experience, and fidelity, which concerns reliably
correct outcomes. Second, we introduce a novel multileaved comparison method,
Pairwise Preference Multileaving (PPM), that performs comparisons based on
document-pair preferences, and prove that it is considerate and has fidelity.
We show empirically that, compared to previous multileaved comparison methods,
PPM is more sensitive to user preferences and scalable with the number of
rankers being compared.
Comment: CIKM 2017, Proceedings of the 2017 ACM on Conference on Information
and Knowledge Managemen
Reliability and effectiveness of clickthrough data for automatic image annotation
Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the expensive manual annotation effort. We investigate and evaluate this approach using a collection of 97,628 photographic images. The results indicate that the contribution of search-log-based training data is positive despite its inherent noise; in particular, the combination of manual and automatically generated training data outperforms the use of manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance, using only the automatically generated training data. An extensive presentation of the experimental results and the accompanying data can be accessed at http://olympus.ee.auth.gr/~diou/civr2009/