Article thumbnail
Location of Repository

General Terms Algorithms, Experimentation

By R. Agrawal, A. Halverson, K. Kenthapadi, N. Mishra and P. Tsaparas


The ranking function used by search engines to order results is learned from labeled training data. Each training point is a (query, URL) pair that is labeled by a human judge who assigns a score of Perfect, Excellent, etc., depending on how well the URL matches the query. In this paper, we study whether clicks can be used to automatically generate good labels. Intuitively, documents that are clicked (resp., skipped) in aggregate can indicate relevance (resp., lack of relevance). We give a novel way of transforming clicks into weighted, directed graphs inspired by eye-tracking studies and then devise an objective function for finding cuts in these graphs that induce a good labeling. In its full generality, the problem is NP-hard, but we show that, in the case of two labels, an optimum labeling can be found in linear time. For the more general case, we propose heuristic solutions. Experiments on real click logs show that clickbased labels align with the opinion of a panel of judges, especially as the consensus of the panel grows stronger

Topics: Graph partitioning
Year: 2012
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • (external link)
  • (external link)
  • Suggested articles

    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.