Deriving a Bilingual Lexicon for Cross-Language Information Retrieval


Hiemstra, D. (1997) Deriving a Bilingual Lexicon for Cross-Language Information Retrieval. In: Fourth Groningen International Information Technology Conference for Students, GRONICS 1997, December 9, 1997, Groningen, the Netherlands (pp. 21-26).

open access
Abstract: In this paper we describe a systematic approach to deriving a bilingual lexicon automatically from parallel corpora. Following this approach, a lexicon was derived from the English and Dutch versions of the Agenda 21 corpus. Using the lexicon and a part of the corpus that was not used to derive it, a bilingual retrieval environment was built. Recall and precision of monolingual (Dutch) retrieval were compared to recall and precision of bilingual (Dutch-to-English) retrieval. An experiment was conducted with the help of eight naive users, who formulated queries and judged the relevance of retrieved fragments. The experiment shows 78% precision and 51% relative recall for monolingual retrieval, against 67% precision and 82% relative recall for bilingual retrieval.
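The abstract reports precision and relative recall but does not define them. The sketch below illustrates one common way to compute both, assuming that relative recall is measured against the pool of relevant fragments found by either retrieval method; the paper itself may use a different definition. The function names and the fragment identifiers are hypothetical and do not reproduce the experiment's data.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved fragments that were judged relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def relative_recall(retrieved, relevant_pool):
    """Fraction of the pooled relevant fragments that this run retrieved.

    Assumption: the pool holds the relevant fragments found by any of the
    compared runs, since the full set of relevant fragments is unknown.
    """
    pool = set(relevant_pool)
    if not pool:
        return 0.0
    return len(set(retrieved) & pool) / len(pool)


# Hypothetical fragment identifiers, for illustration only.
mono_retrieved = ["f1", "f2", "f3", "f4"]
bi_retrieved = ["f2", "f5", "f6", "f7", "f8"]
judged_relevant = {"f1", "f2", "f3", "f5", "f6", "f7"}

# Pool: relevant fragments retrieved by at least one of the two runs.
pool = judged_relevant & (set(mono_retrieved) | set(bi_retrieved))

mono_precision = precision(mono_retrieved, judged_relevant)
mono_rel_recall = relative_recall(mono_retrieved, pool)
bi_precision = precision(bi_retrieved, judged_relevant)
bi_rel_recall = relative_recall(bi_retrieved, pool)
```

Under this definition a run can have lower precision and still reach higher relative recall, which is the pattern the abstract reports for bilingual retrieval.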
Item Type: Conference or Workshop Item
Electrical Engineering, Mathematics and Computer Science (EEMCS)

Metis ID: 122303