MapReduce for information retrieval evaluation: "Let's quickly test this on 12 TB of data"
Hiemstra, Djoerd and Hauff, Claudia (2010) MapReduce for information retrieval evaluation: "Let's quickly test this on 12 TB of data". In: International Conference of the Cross-Language Evaluation Forum, CLEF: Multilingual and Multimodal Information Access Evaluation, 20-23 September 2010, Padua, Italy (pp. 64-69).
|Abstract:||We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://mirex.sourceforge.net.|
|Item Type:||Conference or Workshop Item|
|Copyright:||© 2010 Springer|
Electrical Engineering, Mathematics and Computer Science (EEMCS)
|Link to this item:||http://purl.utwente.nl/publications/73226|
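The sequential-scanning idea in the abstract can be illustrated with a minimal single-process sketch. This is not the MIREX code: the function names (`map_document`, `reduce_query`, `run_job`), the toy relative-term-frequency score, and the in-memory shuffle are all assumptions made for illustration. The map step scans each document once and scores it against every query; the reduce step keeps the top-k documents per query.

```python
import heapq
from collections import Counter, defaultdict


def map_document(doc_id, text, queries):
    """Map step: scan one document, emit (query_id, (score, doc_id)) pairs."""
    tokens = text.lower().split()
    if not tokens:
        return
    counts = Counter(tokens)
    for query_id, query in queries.items():
        # Toy score (an assumption): query-term occurrences divided by
        # document length. A real experiment would plug in its own model.
        score = sum(counts[term] for term in query.lower().split()) / len(tokens)
        if score > 0:
            yield query_id, (score, doc_id)


def reduce_query(query_id, scored_docs, k=10):
    """Reduce step: keep the k highest-scoring documents for one query."""
    return query_id, heapq.nlargest(k, scored_docs)


def run_job(documents, queries, k=10):
    """Simulate map, shuffle, and reduce in a single process."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for doc_id, text in documents:
        for query_id, value in map_document(doc_id, text, queries):
            grouped[query_id].append(value)
    return dict(reduce_query(q, vals, k) for q, vals in grouped.items())
```

Calling `run_job` with a list of `(doc_id, text)` pairs and a dictionary of queries returns, per query, a ranked list of `(score, doc_id)` tuples. No index is built, so trying a different scoring function only requires changing the map step and rescanning the collection.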