MIREX: MapReduce Information Retrieval Experiments

Hauff, Claudia; Hiemstra, Djoerd

Authors: Claudia Hauff, Djoerd Hiemstra
Publication date: 1 January 2010
Abstract

We propose using MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://mirex.sourceforge.ne
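
The sequential-scanning idea in the abstract can be illustrated with a minimal, single-process sketch. It is not the MIREX code: the function names (`map_document`, `reduce_query`, `run`), the raw term-count scoring, and the in-memory grouping that stands in for the MapReduce shuffle are all assumptions made for illustration. The map step scores every document against every query without an index, and the reduce step keeps the top-k documents per query.

```python
import heapq
from collections import defaultdict


def map_document(doc_id, text, queries):
    """Map step (hypothetical): score one document against every query.

    Scoring is a plain term-frequency sum, an assumption chosen for brevity,
    not the retrieval model used in the paper.
    """
    tokens = text.lower().split()
    for query_id, terms in queries.items():
        score = sum(tokens.count(term) for term in terms)
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(query_id, scored_docs, k=10):
    """Reduce step (hypothetical): keep the k highest-scoring documents."""
    return query_id, heapq.nlargest(k, scored_docs)


def run(docs, queries, k=10):
    """Scan all documents sequentially, then rank per query.

    The dictionary below stands in for the shuffle phase that a real
    MapReduce framework would perform across machines.
    """
    grouped = defaultdict(list)
    for doc_id, text in docs:
        for query_id, pair in map_document(doc_id, text, queries):
            grouped[query_id].append(pair)
    return dict(reduce_query(qid, pairs, k) for qid, pairs in grouped.items())
```

Calling `run(docs, queries)` with `docs` as a list of `(doc_id, text)` pairs and `queries` as a dictionary from query id to a list of lowercase terms returns, for each query, a list of `(score, doc_id)` pairs in descending score order. Because every document is scored independently, the map step could in principle be split across machines without any shared index.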
