Searched for: subject:"automatic speech recognition"
(1 - 20 of 26)

document
Gallardo, L.F. (author), Möller, S. (author), Beerends, J. (author)
The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. Different to...
conference paper 2017
document
Truong, K.P. (author), van Leeuwen, D.A. (author), Neerincx, M.A. (author), de Jong, F.M.G. (author), TNO Defensie en Veiligheid (author)
In this paper, we describe emotion recognition experiments carried out for spontaneous affective speech with the aim to compare the added value of annotation of felt emotion versus annotation of perceived emotion. Using speech material available in the TNO-GAMING corpus (a corpus containing audiovisual recordings of people playing videogames),...
conference paper 2009
document
Huijbregts, M. (author), van Leeuwen, D.A. (author), de Jong, F.M.G. (author), TNO Defensie en Veiligheid (author)
In this paper we present a method for combining multiple diarization systems into one single system by applying a majority voting scheme. The voting scheme selects the best segmentation purely on the basis of the output of each system. On our development set of NIST Rich Transcription evaluation meetings the voting method improves our system on all...
conference paper 2009
document
Zekveld, A.A. (author), Kramer, S.E. (author), Kessens, J.M. (author), Vlaming, M.S.M.G. (author), Houtgast, T. (author), TNO Defensie en Veiligheid (author)
This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone either with or without captions that were generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word...
article 2009
document
van Leeuwen, D.A. (author), Kessens, J. (author), Sanders, E. (author), van den Heuvel, H. (author), TNO Defensie en Veiligheid (author)
In this paper we report the results of a Dutch speech recognition system evaluation held in 2008. The evaluation contained material in two domains: Broadcast News (BN) and Conversational Telephone Speech (CTS) and in two main accent regions (Flemish and Dutch). In total 7 sites submitted recognition results to the evaluation, totalling 58...
conference paper 2009
document
Orr, R. (author), van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
In this study, we explore a human benchmark in language recognition, for the purpose of comparing human performance to machine performance in the context of the NIST LRE 2007. Humans are categorised in terms of language proficiency, and performance is presented per proficiency. The main challenge in this work is the design of a test and...
conference paper 2009
document
Huijbregts, M. (author), van Leeuwen, D.A. (author), de Jong, F.M.G. (author), TNO Defensie en Veiligheid (author)
In this paper we present the two-pass speaker diarization system that we developed for the NIST RT09s evaluation. In the first pass of our system a model for speech overlap detection is generated automatically. This model is used in two ways to reduce the diarization errors due to overlapping speech. First, it is used in a second diarization...
conference paper 2009
document
van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
In this paper we propose a framework for measuring the overall performance of an automatic speaker recognition system using a set of trials of a heterogeneous evaluation such as NIST SRE-2008, which combines several acoustic conditions in one evaluation. We do this by weighting trials of different conditions according to their relative...
conference paper 2009
document
Zekveld, A.A. (author), Kramer, S.E. (author), Kessens, J.M. (author), Vlaming, M.S.M.G. (author), Houtgast, T. (author), TNO Kwaliteit van Leven (author)
OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT is defined as the speech-to-noise ratio at which...
article 2008
document
Truong, K.P. (author), Neerincx, M.A. (author), van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
We investigated inter-observer agreement and the reliability of self-reported emotion ratings (i.e., self-raters judging their own emotions) in spontaneous multimodal emotion data. During a multiplayer video game, vocal and facial expressions were recorded (including the game content itself) and were annotated by the players themselves on...
conference paper 2008
document
Truong, K.P. (author), van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
In this paper, we present a detection approach and an ‘open-set’ detection evaluation methodology for automatic emotion recognition in speech. The traditional classification approach does not seem to be suitable and flexible enough for typical emotion recognition tasks. For example, classification does not have an appropriate way to cope with ...
conference paper 2007
document
Truong, K. (author), van Leeuwen, D. (author), TNO Defensie en Veiligheid (author)
Emotions can be recognized by audible paralinguistic cues in speech. By detecting these paralinguistic cues that can consist of laughter, a trembling voice, coughs, changes in the intonation contour etc., information about the speaker’s state and emotion can be revealed. This paper describes the development of a gender-independent laugh detector...
article 2007
document
Merkx, P.A.B. (author), Truong, K.P. (author), Neerincx, M.A. (author), TNO Defensie en Veiligheid (author)
To develop an annotated database of spontaneous, multimodal, emotional expressions, recordings were made of facial and vocal expressions of emotions while participants were playing a multiplayer first-person shooter (fps) computer game. During a replay of the session, participants scored their own emotions by assigning values to them on an...
conference paper 2007
document
Truong, K.P. (author), van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
In this study, we investigated automatic laughter segmentation in meetings. We first performed laughter-speech discrimination experiments with traditional spectral features and subsequently used acoustic-phonetic features. In segmentation, we used Gaussian Mixture Models that were trained with spectral features. For the evaluation of the laughter...
conference paper 2007
document
van Leeuwen, D.A. (author), Huijbregts, Marijn (author), TNO Defensie en Veiligheid (author)
We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are...
conference paper 2006
document
van Leeuwen, D.A. (author), Brümmer, Niko (author), TNO Defensie en Veiligheid (author)
This paper describes two new approaches to spoken language recognition. These were both successfully applied in the NIST 2005 Language Recognition Evaluation. The first approach extends the Gaussian Mixture Model technique with channel dependency, which results in actual detection costs (CDET) of 0.095 in NIST LRE-2005, and which should be...
conference paper 2006
document
Kessens, J.M. (author), TNO Defensie en Veiligheid (author)
This study shows that speech recognition of foreign-accented speech improves when pronunciation variants are added and the acoustic models are adapted with speech recordings of these foreign accents.
conference paper 2006
document
Brümmer, Niko (author), van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
Recent publications have examined the topic of calibration of confidence scores in the field of (binary-hypothesis) speaker detection. We extend this topic to the case of multiple-hypothesis language recognition. We analyze the structure of multiple-hypothesis recognition problems to show that any such problem subsumes a multitude of derived sub...
conference paper 2006
document
van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have developed a speech activity detector (SAD) as well. This is based on decoding the speech signal using...
conference paper 2006
document
van Leeuwen, D.A. (author), TNO Defensie en Veiligheid (author)
The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have developed a speech activity detector (SAD) as well. This is based on decoding the speech...
conference paper 2005