A Support Vector Machine Approach to Dutch Part-of-Speech Tagging


Poel, M. and Stegeman, L. and Akker, H.J.A. op den (2007) A Support Vector Machine Approach to Dutch Part-of-Speech Tagging. In: Advances in Intelligent Data Analysis VII. Proceedings of the 7th International Symposium on Intelligent Data Analysis, IDA 2007, 6-8 Sept 2007, Ljubljana, Slovenia (pp. 274-283).

Abstract: Part-of-Speech tagging, the assignment of Parts-of-Speech to the words in a given context of use, is a basic technique in many systems that handle natural languages. This paper describes a method for supervised training of a Part-of-Speech tagger using a committee of Support Vector Machines on a large corpus of annotated transcriptions of spoken Dutch. Special attention is paid to the decomposition of the large data set into parts for common, uncommon and unknown words. This not only solves the space problems caused by the amount of data, it also improves the tagging time. The resulting tagger reaches an accuracy of 97.54%, and its tagging speed is reasonably good.
Item Type: Conference or Workshop Item
Faculty: Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item: http://purl.utwente.nl/publications/61912
Official URL: http://dx.doi.org/10.1007/978-3-540-74825-0_25
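
The abstract mentions decomposing the data set into parts for common, uncommon and unknown words. The sketch below illustrates one way such a split could be made, by routing each word to a group based on how often it occurs in the training data. It is not the authors' implementation: the function names, the `common_min_count` threshold, and the toy tagged sentence (including its tag labels) are all assumptions for illustration, and the abstract does not state the criteria actually used in the paper. In a full tagger, each group would be handled by its own classifier or committee member.

```python
from collections import Counter


def build_partitions(tagged_tokens, common_min_count=2):
    """Split the training vocabulary into 'common' and 'uncommon' word sets.

    common_min_count is a hypothetical threshold, not a value from the paper.
    """
    counts = Counter(word.lower() for word, _tag in tagged_tokens)
    common = {w for w, c in counts.items() if c >= common_min_count}
    uncommon = set(counts) - common
    return common, uncommon


def route(word, common, uncommon):
    """Return which group (and hence which classifier) should handle the word."""
    w = word.lower()
    if w in common:
        return "common"
    if w in uncommon:
        return "uncommon"
    return "unknown"  # never seen during training


# Toy training data; words and tag labels are illustrative placeholders.
training = [
    ("de", "LID"), ("kat", "N"), ("zit", "WW"),
    ("op", "VZ"), ("de", "LID"), ("mat", "N"),
]

common, uncommon = build_partitions(training)
routes = [route(w, common, uncommon) for w in ["De", "kat", "fiets"]]
print(routes)
```

Splitting the data this way means each classifier is trained on only a fraction of the examples, which is consistent with the abstract's remark that the decomposition eases memory use and tagging time.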

Metis ID: 241907