Search CORE

2 research outputs found

Deriving a domain specific test collection from a query log

Author: Avi Arampatzis
Jaap Kamps
Marijn Koolen Nir Nussbaum
Publication venue
Publication date: 01/01/2007
Field of study

Cultural heritage, and other special domains, pose a particular problem for information retrieval: evaluation requires a dedicated test collection that takes the particular documents and information requests into account, but building such a test collection requires substantial human effort. This paper investigates methods of generating a document retrieval test collection from a search engine’s transaction log, based on submitted queries and user-click data. We test our methods on a museum’s search log file, and compare the quality of the generated test collections against a collection with manually generated and judged known-item topics. Our main findings are the following. First, the test collection derived from a transaction log corresponds well to the actual search experience of real users. Second, the ranking of systems based on the derived judgments corresponds well to the ranking based on the manual topics. Third, deriving pseudo-relevance judgments from a transaction log file is an attractive option in domains where dedicated test collections are not readily available.

CiteSeerX

Deriving a Domain Specific Test Collection from a Query Log

Author: Avi Arampatzis
Jaap Kamps
Marijn Koolen Nir Nussbaum
Publication venue
Publication date: 04/11/2009
Field of study

CiteSeerX