Algorithms, Experimentation

By Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-min Wang, Christos Faloutsos and Fan Guo

Abstract

Given a terabyte click log, can we build an efficient and effective click model? It is commonly believed that web search click logs are a gold mine for the search business, because they reflect users' preferences over the web documents presented by the search engine. Click models provide a principled approach to inferring the user-perceived relevance of web documents, which can be leveraged in numerous applications in search businesses. Due to the huge volume of click data, scalability is a must. We present the click chain model (CCM), which is based on a solid Bayesian framework. It is both scalable and incremental, meeting the computational challenges imposed by voluminous click logs that constantly grow. We conduct an extensive experimental study on a data set containing 8.8 million query sessions obtained in July 2008 from a commercial search engine. CCM consistently outperforms two state-of-the-art competitors on a number of metrics, with over 9.7% better log-likelihood, over 6.2% better click perplexity, and much more robust (up to 30%) prediction of the first and the last clicked positions.

Year: 2010
OAI identifier: oai:CiteSeerX.psu:10.1.1.153.2216
Provided by: CiteSeerX