The 2005 AMI system for the transcription of speech in meetings


Hain, Thomas and Burget, Lukas and Dines, John and Garau, Giulia and Karafiat, Martin and Lincoln, Mike and McCowan, Iain and Moore, Darren and Wan, Vincent and Ordelman, Roeland and Renals, Steve (2005) The 2005 AMI system for the transcription of speech in meetings. In: 2nd International Workshop on Machine Learning for Multimodal Interaction, MLMI 2005, July 11-13, 2005, Edinburgh, UK (pp. 450-462).

Abstract: In this paper we describe the 2005 AMI system for the transcription
of speech in meetings used for participation in the 2005 NIST
RT evaluations. The system was designed for participation in the speech
to text part of the evaluations, in particular for transcription of speech
recorded with multiple distant microphones and independent headset
microphones. System performance was tested on both conference room
and lecture style meetings. Although input sources are processed using
different front-ends, the recognition process is based on a unified system
architecture. The system operates in multiple passes and makes use
of state-of-the-art technologies such as discriminative training, vocal
tract length normalisation, heteroscedastic linear discriminant analysis,
speaker adaptation with maximum likelihood linear regression and minimum
word error rate decoding. In this paper we describe the system performance
on the official development and test sets for the NIST RT05s
evaluations. The system was jointly developed in less than 10 months
by a multi-site team and was shown to achieve very competitive performance.
Item Type: Conference or Workshop Item
Additional information: Imported from HMI
Electrical Engineering, Mathematics and Computer Science (EEMCS)

Metis ID: 227319