Progress in the AMIDA speaker diarization system for meeting data

Leeuwen, D.A. van; Konečný, M.

conference paper

2008

In this paper we describe the AMIDA speaker diarization system as it was submitted to the NIST Rich Transcription evaluation 2007 for conference room data. This is done in the context of the history of this system and other speaker diarization systems. One of the goals of our system is to have as few tunable parameters as possible while maintaining performance. The system consists of a BIC segmentation/clustering initialization, followed by a combined re-segmentation/cluster merging algorithm. The Diarization Error Rate (DER) of our best system is 17.0%, accounting for overlapping speech. However, we find that slightly altering the Speech Activity Detection models has a large impact on the speaker DER. © 2008 Springer-Verlag Berlin Heidelberg.
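As a rough illustration of what a DER figure that accounts for overlapping speech measures, the sketch below follows the commonly used time-weighted definition: error time is the duration of each scored region multiplied by max(reference speakers, hypothesis speakers) minus the correctly attributed speakers, divided by total reference speaker time. It is not taken from the paper. The function name `diarization_error_rate`, the `Segment` layout, and the example data are hypothetical, and the sketch assumes that hypothesis labels have already been mapped to reference speakers and that no forgiveness collar is applied.

```python
from typing import Iterable, Set, Tuple

# Hypothetical layout: (duration in seconds, reference speakers, hypothesis speakers).
# Assumption: hypothesis labels are already mapped onto reference speaker names.
Segment = Tuple[float, Set[str], Set[str]]


def diarization_error_rate(segments: Iterable[Segment]) -> float:
    """Time-weighted DER over regions with a fixed set of active speakers."""
    error_time = 0.0
    scored_time = 0.0
    for duration, ref, hyp in segments:
        n_correct = len(ref & hyp)
        # Covers missed speech, false alarms, and speaker confusion in one term.
        error_time += duration * (max(len(ref), len(hyp)) - n_correct)
        # Overlapping speech counts once per active reference speaker.
        scored_time += duration * len(ref)
    if scored_time == 0:
        raise ValueError("no reference speech to score")
    return error_time / scored_time


example = [
    (6.0, {"A"}, {"A"}),       # correct
    (2.0, {"A", "B"}, {"A"}),  # overlap: speaker B is missed
    (1.0, set(), {"B"}),       # false alarm in non-speech
    (1.0, {"B"}, {"A"}),       # speaker confusion
]
der = diarization_error_rate(example)
```

In this made-up example the error time is 4 s against 11 s of reference speaker time. The false-alarm region also shows why the output of speech activity detection feeds directly into the DER: speech hypothesized where the reference has none adds error time without adding scored time.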

Topics

Error analysis; Transcription; Cluster merging; Error Rate (ER); Heidelberg (CO); International (CO); Multi-modal; Speaker diarization; Speech Activity Detection (SAD); Tunable parameters; Speech

TNO Identifier

240920

Repository link

https://resolver.tno.nl/uuid:490c7a46-754e-4b70-9cc3-6bb3b05d99df

ISSN

0302-9743

Source title

2nd Annual Classification of Events Activities and Relationships, CLEAR 2007 and Rich Transcription, RT 2007, 8 May 2007 through 11 May 2007, Baltimore, MD, Conference code: 72688

Pages

475-483

Files

To receive the publication files, please send an e-mail request to TNO Repository.
