Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

Huijbregts, Marijn and Wooters, Chuck and Ordelman, Roeland (2007) Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections. In: Interspeech 2007, 27-31 August 2007, Antwerp, Belgium (pp. FrC.P3-4).

open access
Abstract: In this paper we discuss the speech activity detection system that we used to detect speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, such as music or sound effects, out of the signal without the use of predefined non-speech models. Because the system trains its models on-line, it is robust to out-of-domain data. The speech activity error rate on an out-of-domain test set, recordings of English conference meetings, was 4.4%. The overall error rate on twelve randomly selected five-minute TRECVID fragments was 11.5%.
Item Type: Conference or Workshop Item
Faculty: Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item: http://purl.utwente.nl/publications/64329

Metis ID: 241881