Title
Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores
Author
Gallardo, L.F.
Möller, S.
Beerends, J.
Contributor
Lacerda, F. (editor)
Strombergsson, S. (editor)
Wlodarczak, M. (editor)
Heldner, M. (editor)
Gustafson, J. (editor)
House, D. (editor)
Publication year
2017
Abstract
The performance of automatic speech recognition on coded-decoded speech depends heavily on the quality of the transmitted signals, which is determined by channel impairments. This paper examines the relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. Unlike previous studies, the effects of super-wideband transmission are analyzed and compared to those of wideband and narrowband channels. Furthermore, intelligibility scores, gathered in a listening test based on logatomes, are also considered for the prediction of automatic speech recognition results. The modern instrumental measurement techniques POLQA and POLQA-based intelligibility have been applied to estimate the quality and the intelligibility of transmitted speech, respectively. Based on our results, polynomial models are proposed that permit the prediction of speech recognition accuracy from the subjective and instrumental measures, covering a number of channel distortions in the three bandwidths. This approach can save the costs of performing automatic speech recognition experiments and can be seen as a first step towards a useful tool for communication channel designers. Copyright © 2017 ISCA.
Subject
2016 ICT
CSR - Cyber Security & Robustness
TS - Technical Sciences
Automatic speech recognition
Communication channels
Instrumental speech quality
Speech intelligibility
Communication channels (information theory)
Forecasting
Speech
Speech communication
Speech transmission
Channel distortions
Instrumental measurements
Intelligibility scores
Recognition accuracy
Speech quality
Speech recognition performance
Transmission channels
Speech recognition
To reference this document use:
http://resolver.tudelft.nl/uuid:349d7cf6-470a-4f8c-ab73-1c5d1192b724
TNO identifier
782878
Publisher
International Speech Communication Association
ISSN
2308-457X
Source
18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20-24 August 2017, pp. 2939-2943
Document type
conference paper