Phoneme based spoken document retrieval

Kraaij, W.; Gent, J. van; Ekkelenkamp, R.; Leeuwen, D.A. van

conference paper

1998

As speech recognition technology has matured, retrieval of spoken documents has become a feasible task. We report on two cases that aim at scalable and effective retrieval of broadcast recordings. The approach is based on a hybrid architecture that combines the speed of off-line phoneme indexing with the precision of wordspotting. The architecture remains scalable and allows for frequent updates of a database in which out-of-vocabulary (OOV) words are abundant. A pilot experiment was carried out on a small database of recordings of a Dutch talk show. A more extensive evaluation took place in the framework of the Spoken Document Retrieval track of TREC7 on English broadcast news.

Two techniques were compared for Spoken Document Retrieval: phone recognition with indexing, and wordspotting. A hybrid technique turns out to give the best results.
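The two-stage idea in the abstract (a fast off-line phoneme index followed by a more precise matching step) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the phoneme trigram index, the function names, and in particular the use of approximate substring matching on phoneme transcripts as a stand-in for wordspotting, which in a real system would operate on the audio itself.

```python
from collections import defaultdict


def phoneme_ngrams(phonemes, n=3):
    """Return the overlapping phoneme n-grams of a phoneme sequence."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_index(transcripts, n=3):
    """Off-line step: map each phoneme n-gram to the documents containing it."""
    index = defaultdict(set)
    for doc_id, phonemes in transcripts.items():
        for gram in phoneme_ngrams(phonemes, n):
            index[gram].add(doc_id)
    return index


def candidates(index, query_phonemes, n=3, min_hits=1):
    """Fast step: count query n-grams found per document, keep docs with enough hits."""
    hits = defaultdict(int)
    for gram in set(phoneme_ngrams(query_phonemes, n)):
        for doc_id in index.get(gram, ()):
            hits[doc_id] += 1
    return {doc_id: count for doc_id, count in hits.items() if count >= min_hits}


def best_match_cost(query_phonemes, phonemes):
    """Hypothetical stand-in for wordspotting: the minimum edit distance
    between the query and any substring of the phoneme transcript."""
    # Row 0 is all zeros so that a match may start at any transcript position.
    prev = [0] * (len(phonemes) + 1)
    for i, q in enumerate(query_phonemes, 1):
        cur = [i]
        for j, p in enumerate(phonemes, 1):
            cur.append(min(prev[j] + 1,              # query phoneme missing in transcript
                           cur[j - 1] + 1,           # extra phoneme in transcript
                           prev[j - 1] + (q != p)))  # match or substitution
        prev = cur
    return min(prev)


def search(index, transcripts, query_phonemes, n=3):
    """Hybrid retrieval: index lookup for candidates, then rescoring by match cost."""
    cands = candidates(index, query_phonemes, n)
    scored = [(best_match_cost(query_phonemes, transcripts[d]), d) for d in cands]
    return [doc_id for _, doc_id in sorted(scored)]
```

Because the query is handled as a phoneme sequence instead of a dictionary word, a term that is missing from a recogniser's vocabulary can still be looked up, and adding a recording only requires adding its n-grams to the index. How the query is converted to phonemes is left open here.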

Topics

Spoken Document Retrieval; Speech Recognition; Radio broadcast databases; Retrieval system

TNO Identifier

12359

Repository link

https://resolver.tno.nl/uuid:d789814a-960b-4dcc-b952-7d52e57d6cf0

Source title

Proceedings of Twente workshop on Language Technology (TWLT14): Language Technology in Multimedia Information Retrieval, December 1998

Pages

1-12

Files

To receive the publication files, please send an e-mail request to TNO Repository.
