Compound Decomposition in Dutch Large Vocabulary Speech Recognition
Ordelman, Roeland and van Hessen, Arjan and de Jong, Franciska (2003) Compound Decomposition in Dutch Large Vocabulary Speech Recognition. In: Eurospeech 2003, September 1-4, 2003, Geneva, Switzerland.
| Abstract: | This paper addresses compound splitting for Dutch in the context of broadcast news transcription. Language models were created using original text versions and text versions that were decomposed using a data-driven compound splitting algorithm. Language model performances were compared in terms of out-of-vocabulary rates and word error rates in a real-world broadcast news transcription task. It was concluded that compound splitting does improve ASR performance. Best results were obtained when frequent compounds were not decomposed. |
| Item Type: | Conference or Workshop Item |
| Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |
| Link to this item: | http://purl.utwente.nl/publications/63377 |
Metis ID: 217551
