Degradation decomposition of the perceived quality of speech signals on the basis of a perceptual modeling approach

article
The authors discuss how the quality of a speech signal is perceived and how different degradations contribute to the overall perceived speech (listening) quality. More specifically, ITU-T Recommendation P.862 (perceptual evaluation of speech quality, PESQ), which provides a perceptual modeling approach for predicting subjectively perceived speech quality, is used as the starting point for a degradation decomposition algorithm. This algorithm decomposes the perceived degradation into three contributions by finding specific degradation indicators that quantify the impact of each type of degradation separately. The first degradation indicator quantifies the impact of additive noise as found in many speech-processing situations, such as when unwanted background noise is sent over a voice connection. The second degradation indicator quantifies the impact of linear time-invariant frequency response distortions, as introduced, for example, by a band-limited telephone system. The last degradation indicator quantifies the impact of the time-varying behavior of the system under test. This time response degradation indicator quantifies the impact of temporal signal loss, as found with packet loss in modern digital speech connections, and the impact of pulses (clicks) as found in many speech-processing systems.
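To make the three-way split concrete, the following is a minimal toy sketch in Python. It is not PESQ and not the authors' algorithm: it contains no perceptual model, and the function name toy_indicators, the frame length, the silence threshold, and the 10 dB level-change threshold are all illustrative assumptions. It also assumes that the reference and degraded signals are time-aligned and of equal length. It only shows how an additive-noise measure, a long-term frequency response measure, and a time-varying (loss or click) measure can be computed separately from the same pair of signals.

```python
import cmath
import math


def frames(x, n):
    """Split x into consecutive non-overlapping frames of n samples."""
    return [x[i:i + n] for i in range(0, len(x) - n + 1, n)]


def energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..n/2); adequate for short toy frames."""
    n = len(frame)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def toy_indicators(ref, deg, n=32, silence=1e-6, eps=1e-12):
    """Return (noise, freq, time_resp) for aligned, equal-length signals.

    All three measures are crude, hypothetical stand-ins for the
    degradation indicators described in the abstract.
    """
    rf, df = frames(ref, n), frames(deg, n)
    ref_e = [energy(f) for f in rf]
    deg_e = [energy(f) for f in df]
    active = [i for i, e in enumerate(ref_e) if e > silence]
    silent = [i for i, e in enumerate(ref_e) if e <= silence]

    # 1) Additive noise: degraded-signal energy where the reference is silent.
    noise = sum(deg_e[i] for i in silent) / len(silent) if silent else 0.0

    # 2) Frequency response: spread (max - min, in dB) of the long-term
    #    per-bin gain, taken over bins where the reference has real content.
    freq = 0.0
    if active:
        bins = range(n // 2 + 1)
        ref_spec = [power_spectrum(rf[i]) for i in active]
        deg_spec = [power_spectrum(df[i]) for i in active]
        r_sum = [sum(p[k] for p in ref_spec) for k in bins]
        d_sum = [sum(p[k] for p in deg_spec) for k in bins]
        floor = 1e-3 * max(r_sum)
        gains = [10 * math.log10((d_sum[k] + eps) / (r_sum[k] + eps))
                 for k in bins if r_sum[k] > floor]
        freq = max(gains) - min(gains)

    # 3) Time response: fraction of active frames whose level differs from
    #    the reference by more than 10 dB (signal loss or a strong pulse).
    jumps = [i for i in active
             if abs(10 * math.log10((deg_e[i] + eps) / (ref_e[i] + eps))) > 10.0]
    time_resp = len(jumps) / len(active) if active else 0.0

    return noise, freq, time_resp
```

In this sketch the three measures are only loosely separated. For instance, a dropped frame also lowers the long-term gain used in the second measure, which hints at why a careful decomposition requires more than simple energy comparisons.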
TNO Identifier
240417
ISSN
15494950
Source
AES: Journal of the Audio Engineering Society, 55(12), pp. 1059-1074.
Pages
1059-1074
Files
To receive the publication files, please send an e-mail request to TNO Repository.