Predicting the cost-quality trade-off for information retrieval queries: Facilitating database design and query optimization



Blok, H.E. and Hiemstra, D. and Choenni, R.S. and Jong, F.M.G. de and Blanken, H.M. and Apers, P.M.G. (2001) Predicting the cost-quality trade-off for information retrieval queries: Facilitating database design and query optimization. In: Proceedings of the tenth international conference on Information and knowledge management (CIKM 2001), 5-10 Nov 2001, Atlanta, Georgia, USA (pp. 207-214).

Abstract:Efficient, flexible, and scalable integration of full-text information retrieval (IR) in a DBMS is not a trivial case. This holds in particular for query optimization in such a context. To facilitate the bulk-oriented behavior of database query processing, a priori knowledge of how to limit the data efficiently prior to query evaluation is very valuable at optimization time. The usually imprecise nature of IR querying provides an extra opportunity to limit the data by a trade-off with the quality of the answer. In this paper we present a mathematically derived model to predict the quality implications of neglecting information before query execution. In particular we investigate the possibility of predicting the retrieval quality for a document collection for which no training information is available, which is usually the case in practice. Instead, we construct a model that can be trained on other document collections for which the necessary quality information is available, or can be obtained quite easily. We validate our model for several document collections and present the experimental results. These results show that our model performs quite well, even for the case where we did not train it on the test collection itself.
Item Type:Conference or Workshop Item
Additional information:Imported from EWI/DB PMS [db-utwente:inpr:0000000020]
Faculty:Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item:http://purl.utwente.nl/publications/63491
Official URL:http://doi.acm.org/10.1145/502585.502621

 


Metis ID: 202660