Relating the new language models of information retrieval to the traditional retrieval models
Hiemstra, Djoerd and de Vries, Arjen P. (2000) Relating the new language models of information retrieval to the traditional retrieval models. [Report]
| Abstract: | During the last two years, exciting new approaches to information retrieval were introduced by a number of different research groups that use statistical language models for retrieval. This paper relates the retrieval algorithms suggested by these approaches to widely accepted retrieval algorithms developed within three traditional models of information retrieval: the Boolean model, the vector space model and the probabilistic model. The paper shows the existence of efficient retrieval algorithms that only use the matching terms in their computation. Under these conditions, the language models of information retrieval are surprisingly similar to both tf.idf term weighting as developed for the vector space model and relevance weighting as developed in the traditional probabilistic model. The paper suggests a new method for relevance weighting and a new method to rank documents given Boolean queries. Experimental results on the TREC collection indicate that the language modelling approach outperforms the three traditional approaches. |
| Item Type: | Report |
| Copyright: | © 2000 Centre for Telematics and Information Technology, CTIT |
| Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |
| Link to this item: | http://purl.utwente.nl/publications/18200 |
Metis ID: 118720
