Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content
Huijbregts, Marijn and de Jong, Franciska (2011) Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content. Speech Communication, 53 (2). pp. 143-153. ISSN 0167-6393
| Abstract: | In this paper we present a speech/non-speech classification method that allows high-quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording, and that does not require a single parameter to be tuned on in-domain data. Because no parameter tuning is needed and no training data is required to train models for specific sounds, the classifier is able to process a wide range of audio types with varying conditions and thereby contributes to the development of a more robust automatic speech recognition framework. Our speech/non-speech classification system does not attempt to classify all audible non-speech in a single run. Instead, a bootstrap speech/silence classification is first obtained using a standard speech/non-speech classifier. Next, models for speech, silence and audible non-speech are trained on the target audio using the bootstrap classification. The experiments show that the performance of the proposed system is 83% and 44% (relative) better than that of a common broadcast news speech/non-speech classifier when applied to, respectively, a collection of meetings recorded with table-top microphones and a collection of Dutch television broadcasts used for TRECVID 2007. |
| Item Type: | Article |
| Copyright: | © 2010 Elsevier |
| Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |
| Link to this item: | http://purl.utwente.nl/publications/75066 |
| Official URL: | http://dx.doi.org/10.1016/j.specom.2010.08.008 |
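The two-stage idea in the abstract (bootstrap labels from a generic classifier, then models retrained on the target recording itself) can be illustrated with a toy sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the single feature (per-frame log-energy), the fixed-threshold stand-in for the bootstrap classifier, the one-Gaussian-per-class models, the function names, and the restriction to two classes (the abstract also describes a model for audible non-speech, which is omitted).

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of a list of floats, with a small variance floor."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)


def log_likelihood(x, mean, var):
    """Log density of x under a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def bootstrap_labels(energies, threshold):
    """Hypothetical stand-in for a generic pre-trained classifier:
    frames above a fixed log-energy threshold are called speech."""
    return ["speech" if e > threshold else "silence" for e in energies]


def retrain_and_classify(energies, labels):
    """Fit one Gaussian per class on the target recording's own frames,
    then relabel every frame by maximum likelihood."""
    models = {}
    for cls in set(labels):
        frames = [e for e, lab in zip(energies, labels) if lab == cls]
        models[cls] = fit_gaussian(frames)
    return [
        max(models, key=lambda c: log_likelihood(e, *models[c]))
        for e in energies
    ]


# Made-up log-energy values: five quiet frames followed by five loud ones.
energies = [-8.0, -7.5, -8.5, -7.8, -8.2, -2.0, -1.5, -2.5, -1.0, -3.0]
initial = bootstrap_labels(energies, threshold=-5.0)
labels = retrain_and_classify(energies, initial)
```

The point of the second stage is that the class models are estimated from the recording being processed rather than from separate in-domain training data, which is the property the abstract emphasises.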