MapReduce for Information Retrieval Evaluation: “Let’s Quickly Test This on 12 TB of Data”



Hiemstra, Djoerd and Hauff, Claudia (2010) MapReduce for Information Retrieval Evaluation: “Let’s Quickly Test This on 12 TB of Data”. In: Multilingual and Multimodal Information Access Evaluation: International Conference of the Cross-Language Evaluation Forum, CLEF 2010, Padua, Italy, September 20-23, 2010, Proceedings. Lecture Notes in Computer Science, 6360. Springer, pp. 64-69. ISBN 978-3-642-15998-5

Abstract: We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://mirex.sourceforge.net
Item Type: Book Section
Copyright: © Springer-Verlag Berlin
Faculty: Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item: http://purl.utwente.nl/publications/72955
Official URL: http://dx.doi.org/10.1007/978-3-642-15998-5_8
Dataset URL: http://mirex.sourceforge.net
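
The sequential-scanning idea in the abstract can be illustrated with a minimal, in-memory sketch. This is not the code published at the URL above: the function names, the toy documents and queries, and the term-count scoring function are all assumptions made for illustration. The map step scans each document once and scores it against every query; the reduce step keeps the top-ranked documents per query.

```python
import heapq
from collections import defaultdict


def map_document(doc_id, text, queries):
    """Map step: scan one document and emit (query_id, (score, doc_id)) pairs."""
    tokens = text.lower().split()
    for query_id, query in queries.items():
        terms = set(query.lower().split())
        # Placeholder score (an assumption): count of tokens matching a query term.
        score = sum(1 for tok in tokens if tok in terms)
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(query_id, scored_docs, k=2):
    """Reduce step: keep the k highest-scoring documents for one query."""
    return query_id, heapq.nlargest(k, scored_docs)


def run(documents, queries, k=2):
    """Simulate the shuffle phase in memory by grouping map output per query."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for query_id, pair in map_document(doc_id, text, queries):
            grouped[query_id].append(pair)
    return dict(reduce_query(q, pairs, k) for q, pairs in grouped.items())


# Toy data, invented for this sketch.
documents = {
    "d1": "mapreduce makes sequential scanning of web pages simple",
    "d2": "inverted index construction for web search",
    "d3": "sequential scanning with mapreduce on a cluster mapreduce",
}
queries = {"q1": "mapreduce scanning", "q2": "inverted index"}

results = run(documents, queries)
print(results)
```

Because every document is scored independently of the others, the map step needs no index and can be spread across machines; trying a different retrieval approach in this sketch only means replacing the scoring line.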