The (TNO) Speaker Diarization System for NIST Rich Transcription Evaluation 2005 for meeting data

Abstract. The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have also developed a speech activity detector (SAD). It decodes the speech signal using two Gaussian Mixture Models, one trained on silence and one on speech. The SAD was trained only on AMI development test data, and performed quite well in the evaluation on all five meeting locations, with a SAD error rate of 5.0%. For the speaker clustering algorithm we optimized the BIC penalty parameter λ to 14, which is quite high with respect to the theoretical value of 1. The final speaker diarization error rate was
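To illustrate the role of the BIC penalty parameter λ, the following is a minimal sketch of the commonly used ΔBIC criterion for deciding whether two segments belong to the same speaker. It assumes each segment is modelled by a single full-covariance Gaussian; the function name `delta_bic`, the argument names, and the synthetic data are hypothetical and are not taken from the TNO system, whose exact variant may differ.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Delta-BIC between two feature segments (rows = frames, columns = features).

    Positive values favour modelling the segments separately (different speakers);
    negative values favour merging them. `lam` scales the model-complexity penalty.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.vstack([x, y])
    n1, n2, n = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet_cov(a):
        # Maximum-likelihood covariance estimate (bias=True divides by N).
        cov = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Extra free parameters when using two Gaussians instead of one:
    # d for the mean plus d(d+1)/2 for the covariance.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    data_term = 0.5 * (n * logdet_cov(z) - n1 * logdet_cov(x) - n2 * logdet_cov(y))
    return data_term - lam * penalty


# Synthetic example (assumed data, for illustration only).
rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(200, 2))
same_b = rng.normal(0.0, 1.0, size=(200, 2))
other = rng.normal(5.0, 1.0, size=(200, 2))

print(delta_bic(same_a, same_b, lam=1.0))   # segments from one distribution
print(delta_bic(same_a, other, lam=1.0))    # segments from two distributions
print(delta_bic(same_a, other, lam=14.0))   # same pair with a larger penalty
```

In this formulation a larger λ subtracts a larger penalty, so clusters are merged more readily and fewer speakers remain at the end of clustering.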
TNO Identifier
15992
Source title
Proceedings Rich Transcription 2005 Spring Meeting Recognition Evaluation Edinburgh
Pages
84 - 92
Files
To receive the publication files, please send an e-mail request to TNO Repository.