NIST and NFI-TNO evaluations of automatic speaker recognition
van Leeuwen, D.A.
TNO Defensie en Veiligheid
Campbell, J.P. (editor)
Mason, J. (editor)
Ortega-Garcia, J. (editor)
In recent years, several text-independent speaker recognition evaluation campaigns have taken place. This paper reports on results of the NIST evaluation of 2004 and the NFI-TNO forensic speaker recognition evaluation held in 2003, and reflects on the history of the evaluation campaigns. The effects of speech duration, training handsets, transmission type, and gender mix show expected behaviour on the DET curves. New results on the influence of language show an interesting dependence of the DET curves on the accent of speakers. We also report on a number of statistical analysis techniques that have recently been introduced in the speaker recognition community, as well as a new application of the analysis of deviance. These techniques are used to determine that the two evaluations held in 2003, by NIST and NFI-TNO, are statistically different in difficulty for the speaker recognition systems.
Acoustics and Audiology
Pattern recognition systems
Automatic speaker recognition
Computer Speech and Language, 20 (2-3 SPEC. ISS.), 128-158
Odyssey 2004: The Speaker and Language Recognition Workshop Odyssey-04, 31 May 2004 through 3 June 2004, Conference