3 research outputs found

    A frame and segment based approach for topic spotting

    No full text
    In this paper we present a new approach for topic spotting based on subword units (phonemes and feature vectors) instead of words. Classification of topics is done by running topic dependent polygram language models over these symbol sequences and deciding for the one with the best score. We trained and tested the two methods on three different corpora. The first is a part of a media corpus which contains data from TV shows for three different topics (IDS), the second is part of the Switchboard corpus, the third is a collection of human machine dialogs about train timetable information (EVAR corpus). The results on Switchboard are compared with phoneme based approaches which were made at CRIM (Montreal) and DRA (Malvern) and are presented as ROC curves; the results on IDS and EVAR are compared with a word based approach and presented as confusion tables. We show that a surprisingly little amount of recognition accuracy is lost when going from word to subword based topic spotting. (orig.)Appeared in proceedings EUROSPEECH '97, Rhodes (ZA), vol. 1, p. 275-278Available from TIB Hannover: RR 5221(217)+a / FIZ - Fachinformationszzentrum Karlsruhe / TIB - Technische InformationsbibliothekSIGLEBundesministerium fuer Bildung, Wissenschaft, Forschung und Technologie, Bonn (Germany)DEGerman
    corecore