MIREX: MapReduce Information Retrieval Experiments


Hiemstra, Djoerd and Hauff, Claudia (2010) MIREX: MapReduce Information Retrieval Experiments. [Report]

open access
Abstract: We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://sourceforge.net/projects/mirex/
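The sequential-scanning idea in the abstract can be illustrated with a minimal single-process sketch. It is not the MIREX code. The function names (`map_document`, `reduce_query`, `sequential_scan`), the term-count score, and the in-memory grouping that stands in for the MapReduce shuffle are all assumptions made for illustration. The map step scores every document against every query, and the reduce step keeps the top k documents per query.

```python
import heapq
from collections import defaultdict


def map_document(doc_id, text, queries):
    """Map step: score one document against every query.

    Emits (query_id, (score, doc_id)) for each query the document matches.
    """
    tokens = text.lower().split()
    for query_id, terms in queries.items():
        # Toy score (an assumption): total occurrences of the query terms.
        score = sum(tokens.count(term) for term in terms)
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(pairs, k):
    """Reduce step: keep the k highest-scoring documents for one query."""
    return heapq.nlargest(k, pairs)


def sequential_scan(documents, queries, k=2):
    """Scan all documents once, then rank per query."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for doc_id, text in documents:
        for query_id, pair in map_document(doc_id, text, queries):
            grouped[query_id].append(pair)
    return {q: reduce_query(pairs, k) for q, pairs in grouped.items()}


if __name__ == "__main__":
    docs = [
        ("d1", "mapreduce cluster scan"),
        ("d2", "web crawl pages"),
        ("d3", "scan scan cluster"),
    ]
    qs = {"q1": ["scan"], "q2": ["crawl", "web"]}
    print(sequential_scan(docs, qs))
```

Because no index is built, trying a new retrieval approach in this sketch only means changing the scoring line in `map_document`. On a real cluster the map calls would run in parallel over partitions of the collection.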
Item Type: Report
Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item: http://purl.utwente.nl/publications/71078

