Segmentation, diarization and speech transcription : surprise data unraveled



Huijbregts, Marijn Anthonius Henricus (2008) Segmentation, diarization and speech transcription : surprise data unraveled. thesis.

open access
Abstract: This thesis presents research on large vocabulary continuous speech recognition for unknown audio conditions. For automatic speech recognition systems based on statistical methods, it is important that the conditions of the audio used for training the statistical models match the conditions of the audio to be processed. Any mismatch will decrease the accuracy of the recognition. If it is unpredictable what kind of data can be expected, or in other words if the conditions of the audio to be processed are unknown, it is impossible to tune the models. If the material consists of "surprise data", the output of the system is likely to be poor. In this thesis, methods are presented that require no external training data to train their models. These novel methods have been implemented in a large vocabulary continuous speech recognition system called SHoUT. This system consists of three subsystems: speech/non-speech classification, speaker diarization and automatic speech recognition.
Item Type: Thesis
Research Group:
Link to this item: http://purl.utwente.nl/publications/60130
Official URL: http://dx.doi.org/10.3990/1.9789036527125

Metis ID: 255485