Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections
Huijbregts, Marijn and Wooters, Chuck and Ordelman, Roeland (2007) Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections. In: Interspeech 2007, 27-31 August 2007, Antwerp, Belgium.
| Abstract: | In this paper we discuss the speech activity detection system that we used to detect speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, such as music or sound effects, out of the signal without using predefined non-speech models. Because the system trains its models on-line, it is robust to out-of-domain data. The speech activity error rate on an out-of-domain test set consisting of recordings of English conference meetings was 4.4%. The overall error rate on twelve randomly selected five-minute TRECVID fragments was 11.5%. |
| Item Type: | Conference or Workshop Item |
| Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |
| Link to this item: | http://purl.utwente.nl/publications/64329 |
Metis ID: 241881
