Ranking using multiple document types in desktop search

By Jinyoung Kim and W. Bruce Croft

Abstract

A typical desktop environment contains many document types (email, presentations, web pages, PDFs, etc.), each with different metadata. Predicting which types of documents a user is looking for in the context of a given query is a crucial part of providing effective desktop search. The problem is similar to selecting resources in distributed IR, but there are some important differences. In this paper, we quantify the impact of type prediction in producing a merged ranking for desktop search and introduce a new prediction method that exploits type-specific metadata. In addition, we show that type prediction performance and search effectiveness can be further enhanced by combining existing methods of type prediction using discriminative learning models. Our experiments employ pseudo-desktop collections and a human computation game for acquiring realistic and reusable queries.

Topics: Desktop Search, Semi-structured Document Retrieval, Type Prediction
Publisher: ACM
Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.359.5235
Provided by: CiteSeerX