In search of optimal data selection for training of automatic speech recognition systems
conference paper
This paper presents an extended study on the optimal selection of speech data from a database for efficient training of ASR systems. We reconsider the optimal selection method introduced in our previous work and introduce variosearch, an alternative selection method developed to find a representative sample of speech data while simultaneously controlling the acoustical and statistical parameters of the selected data. Next, we present experiments comparing the performance of a standard ASR system trained on data sets selected from a Dutch digits database by different selection methods. The results show that the length of the training utterances has a dominant impact on recognition performance. Therefore, utterance length must be taken into account when interpreting phoneme recognition scores. © 2003 IEEE.
TNO Identifier
953691
ISBN
0780379802
Publisher
Institute of Electrical and Electronics Engineers Inc.
Article nr.
1318405
Source title
2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003, 30 November 2003 through 4 December 2003
Pages
67-72