2 research outputs found

    Discriminative feature weighting using MCE training for topic identification of spoken audio recordings

    No full text
    In this paper we investigate a discriminative approach to feature weighting for topic identification using minimum classification error (MCE) training. Our approach learns feature weights by optimizing an objective loss function directly related to the classification error rate of the topic identification system. Topic identification experiments are performed on spoken conversations from the Fisher corpus. Features drawn from both word and phone lattices generated via automatic speech recognition are investigated. Across a variety of conditions, our new feature weighting scheme reduces the classification error rate by between 9% and 23% relative to a baseline naive Bayes system using feature selection. Index Terms: Audio document processing, topic identification, topic spotting.
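    As a rough illustration of the MCE idea summarized above, the sketch below learns per-feature weights for a naive-Bayes-style linear discriminant by gradient descent on a sigmoid-smoothed classification-error loss. The toy data, vocabulary size, and all hyperparameter values are invented for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bag-of-words counts for two topics over a 4-word vocabulary
# (a hypothetical stand-in for Fisher-corpus lattice features).
X0 = rng.poisson(lam=[3.0, 1.0, 0.5, 0.5], size=(50, 4))  # topic 0
X1 = rng.poisson(lam=[0.5, 0.5, 1.0, 3.0], size=(50, 4))  # topic 1
X = np.vstack([X0, X1]).astype(float)
y = np.array([0] * 50 + [1] * 50)

# Add-one-smoothed per-class unigram log-probabilities (naive Bayes).
logp0 = np.log((X0.sum(0) + 1) / (X0.sum() + X0.shape[1]))
logp1 = np.log((X1.sum(0) + 1) / (X1.sum() + X1.shape[1]))

def misclassification(X, y, w):
    """MCE misclassification measure: d > 0 means the wrong topic won.

    The discriminant is g_c(x) = sum_j w_j * x_j * log p_c(j),
    i.e. naive Bayes with a learnable weight per feature.
    """
    g0 = X @ (w * logp0)
    g1 = X @ (w * logp1)
    return np.where(y == 0, g1 - g0, g0 - g1)

w = np.ones(X.shape[1])  # feature weights, initialized uniformly
alpha, lr = 1.0, 0.05    # sigmoid slope and learning rate (invented)

for _ in range(200):
    d = misclassification(X, y, w)
    s = 1.0 / (1.0 + np.exp(-alpha * d))   # smoothed 0/1 error per utterance
    sign = np.where(y == 0, 1.0, -1.0)     # d = sign * (g1 - g0)
    # Gradient of mean sigmoid loss w.r.t. the feature weights.
    grad = (alpha * s * (1 - s) * sign)[:, None] * (X * (logp1 - logp0))
    w -= lr * grad.mean(0)

err = (misclassification(X, y, w) > 0).mean()  # training error after MCE
```

    The key point is that the weights are driven directly by a smoothed version of the classification error rather than by likelihood, which is what distinguishes MCE training from generative parameter estimation.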

    Topic-enhanced Models for Speech Recognition and Retrieval

    This thesis examines ways in which topical information can be used to improve recognition and retrieval of spoken documents. We consider the interrelated concepts of locality, repetition, and 'subject of discourse' in the context of speech processing applications: speech recognition, speech retrieval, and topic identification of speech. This work demonstrates how supervised and unsupervised models of topics, applicable to any language, can improve accuracy in accessing spoken content. We look at the complementary aspects of topic information in lexical content in terms of local context (locality or repetition of word usage) and broad context (the typical 'subject matter' definition of a topic). By augmenting speech processing language models with topic information we demonstrate consistent improvements on a number of metrics. We add locality to bag-of-words topic identification models, quantify the relationship between topic information and keyword retrieval, and consider word repetition both in keyword-based retrieval and in language modeling. Lastly, we combine these concepts and develop joint models of local and broad context via latent topic models. We present a latent topic model framework that treats documents as arising from an underlying topic sequence combined with a cache-based repetition model. We analyze the proposed model both for its ability to capture word repetition via the cache and for its suitability as a language model for speech recognition and retrieval. We show that this model, augmented with the cache, captures intuitive repetition behavior across languages and exhibits lower perplexity than standard LDA on held-out data in multiple languages. Finally, we show that the joint model improves speech retrieval performance beyond N-grams or latent topics alone when applied to a term detection task in all languages considered.
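    To make the cache-based repetition idea concrete, the toy sketch below interpolates a static unigram model with a document-level cache of words seen so far, and compares perplexity on a repetitive text with and without the cache. The uniform vocabulary, interpolation weight, and example document are invented for illustration and are not the thesis's actual model.

```python
import math
from collections import Counter

# Static unigram model: uniform over a 10-word toy vocabulary.
VOCAB = list("abcdefghij")
P_UNI = {w: 1.0 / len(VOCAB) for w in VOCAB}

def perplexity(text, lam=0.9):
    """Perplexity under lam * unigram + (1 - lam) * document cache.

    The cache is the empirical distribution of words seen so far in
    the document, so repeated words become progressively cheaper.
    """
    cache = Counter()
    logprob = 0.0
    for w in text:
        p_cache = cache[w] / sum(cache.values()) if cache else 0.0
        p = lam * P_UNI[w] + (1.0 - lam) * p_cache
        logprob += math.log(p)
        cache[w] += 1
    return math.exp(-logprob / len(text))

doc = list("aaabaacaaadaaaeaaaa")   # repetitive document: "a" dominates
ppl_cache = perplexity(doc, lam=0.9)
ppl_static = perplexity(doc, lam=1.0)  # lam=1 disables the cache
```

    On repetitive text the cache-interpolated model assigns higher probability to repeated words than the static model can, which is the intuition behind combining topic models with a repetition cache.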