
    Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation

    We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources. Comment: 27 pages, 8 figures.

    Towards a Maximum Entropy Method for Estimating HMM Parameters

    Training a Hidden Markov Model (HMM) to maximise the probability of a given sequence can result in over-fitting. That is, the model represents the training sequence well, but fails to generalise. In this paper, we present a possible solution to this problem, which is to maximise a linear combination of the likelihood of the training data and the entropy of the model. We derive the necessary equations for gradient-based maximisation of this combined term. The performance of the system is then evaluated in comparison with three other algorithms, on a classification task using synthetic data. The results indicate that the method is potentially useful. The main problem with the method is the computational intractability of the entropy calculation.
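
    As a rough sketch of the combined term described in this abstract, the objective and a generic gradient-ascent update might be written as follows. The symbols used here (parameters \theta, training sequence O, likelihood term \mathcal{L}, entropy term H, weight \lambda, step size \eta) are assumptions introduced for illustration and are not notation taken from the paper.

```latex
% Assumed notation: \mathcal{L}(\theta) is the (log-)likelihood of the
% training sequence O under the HMM with parameters \theta, H(\theta) is
% the entropy of the model, and \lambda \ge 0 is a hypothetical weight.
J(\theta) = \mathcal{L}(\theta) + \lambda \, H(\theta),
\qquad
\theta \leftarrow \theta + \eta \, \nabla_{\theta} J(\theta)
```

    Under this reading, \lambda = 0 reduces to ordinary likelihood maximisation, while larger values favour higher-entropy models. The abstract does not state how the entropy is defined or how the two terms are weighted, so this form is only one plausible instance of a linear combination.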

    Hidden Markov Models for Spatio-Temporal Pattern Recognition and Image Segmentation

    Time and again hidden Markov models have been demonstrated to be highly effective in one-dimensional pattern recognition and classification problems such as speech recognition. A great deal of attention is now focussed on 2-D and possibly 3-D applications arising from problems encountered in computer vision in domains such as gesture, face, and handwriting recognition. Despite their widespread usage and numerous successful applications, there are few analytical results which can explain their remarkably good performance and guide researchers in selecting topologies and parameters to improve classification performance.

    Progress in Speech Recognition for Romanian Language
