
    Improving Posterior Based Confidence Measures in Hybrid HMM/ANN Speech Recognition Systems

    In this paper we define and investigate a set of confidence measures based on hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) acoustic models. All these measures use the neural network to estimate local phone posterior probabilities, which are then combined and normalized in different ways. Experimental results show that an appropriate duration normalization is very important for obtaining good estimates of phone and word confidences. The different measures are evaluated at the phone and word levels on both an isolated word task (PHONEBOOK) and a continuous speech recognition task (BREF). We show that one of these confidence measures is well suited to utterance verification and that, as one might expect, confidence measures at the word level perform better than those at the phone level. Finally, using the resulting approach on PHONEBOOK to rescore the N-best list yields a 34% decrease in word error rate.
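    The duration normalization described above can be sketched as a geometric mean of the per-frame phone posteriors, so that longer segments are not systematically penalized. This is a minimal illustrative sketch, not the paper's exact formulation; the function names and the aggregation of word confidence from phone confidences are assumptions.

    ```python
    import numpy as np

    def phone_confidence(frame_posteriors):
        """Duration-normalized phone confidence: the geometric mean of the
        per-frame posterior probabilities of the hypothesized phone, i.e.
        the exponentiated average log posterior over the segment. Without
        this normalization, longer phones would accumulate more factors
        below one and receive systematically lower scores."""
        log_posts = np.log(np.asarray(frame_posteriors, dtype=float))
        return float(np.exp(log_posts.mean()))

    def word_confidence(phone_confidences):
        """Word-level confidence as the geometric mean of the confidences
        of the phones that make up the word (again an assumption about
        how phone scores are combined)."""
        log_confs = np.log(np.asarray(phone_confidences, dtype=float))
        return float(np.exp(log_confs.mean()))
    ```

    With this normalization a phone whose frames all have posterior 0.9 scores 0.9 regardless of whether it spans 3 frames or 30.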

    Confidence measures from local posterior probability estimates

    In this paper we introduce a set of related confidence measures for large vocabulary continuous speech recognition (LVCSR) based on local phone posterior probability estimates output by an acceptor HMM acoustic model. In addition to their computational efficiency, these confidence measures are attractive because they may be applied at the state, phone, word or utterance level, potentially enabling discrimination between different causes of low-confidence recognizer output, such as unclear acoustics or mismatched pronunciation models. We have evaluated these confidence measures for utterance verification using a number of different metrics. Experiments reveal several trends in 'profitability of rejection', as measured by the unconditional error rate of a hypothesis test. These trends suggest that crude pronunciation models can mask the relatively subtle reductions in confidence caused by out-of-vocabulary (OOV) words and disfluencies, but not the gross model mismatches elicited by non-speech sounds. A purely acoustic confidence measure outperforms a measure based on both acoustic and language model information for data drawn from the Broadcast News corpus, but not for data drawn from the North American Business News corpus; this suggests that a trigram language model fits Broadcast News data less well. We also argue that acoustic confidence measures may be used to inform the search for improved pronunciation models.
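    The hypothesis-test framing above can be illustrated with a short sketch of the unconditional error rate: accept a hypothesis when its confidence clears a threshold, and count both false acceptances and false rejections over all hypotheses. The function name and interface are assumptions for illustration, not the paper's code.

    ```python
    import numpy as np

    def unconditional_error_rate(confidences, labels, threshold):
        """Utterance verification as a hypothesis test: a hypothesis is
        accepted when its confidence is at least `threshold`. The
        unconditional error rate is the fraction of hypotheses that are
        either falsely accepted (incorrect but accepted) or falsely
        rejected (correct but rejected)."""
        confidences = np.asarray(confidences, dtype=float)
        labels = np.asarray(labels, dtype=bool)  # True = correct hypothesis
        accepted = confidences >= threshold
        false_accept = np.sum(accepted & ~labels)
        false_reject = np.sum(~accepted & labels)
        return float(false_accept + false_reject) / len(labels)
    ```

    Sweeping `threshold` and plotting this rate is one way to read off the 'profitability of rejection' for a given confidence measure.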

    The THISL broadcast news retrieval system.


    Contextual confidence measures for continuous speech recognition

    This paper explores the effect of contextual information on confidence measures for continuous speech recognition results. Our approach comprises three steps: extracting confidence predictors from recognition results; combining those predictors into confidence measures by means of a fuzzy inference system whose parameters are estimated directly from examples with an evolutionary strategy; and, finally, upgrading the confidence measures by including contextual information. Experiments on two different continuous speech application tasks show that the context re-scoring procedure improves the ability of confidence measures to discriminate between correct and incorrect recognition results at every threshold level, even when a rather simple method of adding contextual information is used.
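    A rather simple contextual re-scoring of the kind alluded to above can be sketched by blending each word's confidence with the mean confidence of its immediate neighbours, on the assumption that recognition errors tend to cluster. Both the blending scheme and the `weight` parameter are illustrative stand-ins, not the paper's fuzzy-inference system.

    ```python
    def rescore_with_context(confidences, weight=0.3):
        """Blend each word confidence with the mean confidence of its
        adjacent words. `weight` controls how strongly the neighbourhood
        pulls the score; boundary words use whatever neighbours exist."""
        rescored = []
        for i, c in enumerate(confidences):
            # Up to one neighbour on each side.
            neighbours = confidences[max(0, i - 1):i] + confidences[i + 1:i + 2]
            ctx = sum(neighbours) / len(neighbours) if neighbours else c
            rescored.append((1 - weight) * c + weight * ctx)
        return rescored
    ```

    An isolated low-confidence word surrounded by confident neighbours is pulled upward, while a confident word inside an unreliable stretch is pulled downward.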

    Fuzzy reasoning in confidence evaluation of speech recognition

    Confidence measures are a systematic way to express the reliability of speech recognition results. A common approach to confidence measurement is to exploit the information offered by several recognition-related features and to combine them, through a given compilation mechanism, into a more effective way to distinguish between correct and incorrect recognition results. We propose a fuzzy reasoning scheme to perform this information compilation step. Our approach differs from previously proposed ones in that it treats the uncertainty of recognition hypotheses in terms of …

    Measuring Mimicry in Task-Oriented Conversations: The More the Task is Difficult, The More we Mimick our Interlocutors

    The tendency to unconsciously imitate others in conversations is referred to as mimicry, accommodation, interpersonal adaptation, etc. In recent years, the computing community has made significant efforts towards automatic detection of the phenomenon, but a widely accepted approach is still missing. Given that mimicry is the unconscious tendency to imitate others, this article proposes the adoption of speaker verification methodologies originally conceived to spot people trying to forge the voice of others. Preliminary experiments suggest that mimicry can be detected by measuring how much speakers converge or diverge with respect to one another in terms of acoustic evidence. As a validation of the approach, the experiments show that convergence (the speakers becoming more similar in terms of acoustic properties) tends to appear more frequently when a task is difficult and, therefore, requires more time to be addressed.
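    One crude way to quantify the convergence described above is to split each speaker's frame-level acoustic features into consecutive time windows, measure the distance between the two speakers' mean feature vectors per window, and look at the trend. This sketch and its function name are assumptions; the paper uses speaker verification machinery rather than this simple distance.

    ```python
    import numpy as np

    def convergence_trend(feats_a, feats_b, n_windows=4):
        """Split each speaker's feature frames (rows) into `n_windows`
        consecutive windows, compute the Euclidean distance between the
        speakers' mean feature vectors in each window, and fit a line to
        those distances over time. A negative slope suggests the speakers
        are becoming acoustically more similar (convergence); a positive
        slope suggests divergence."""
        feats_a = np.asarray(feats_a, dtype=float)
        feats_b = np.asarray(feats_b, dtype=float)
        dists = []
        for wa, wb in zip(np.array_split(feats_a, n_windows),
                          np.array_split(feats_b, n_windows)):
            dists.append(np.linalg.norm(wa.mean(axis=0) - wb.mean(axis=0)))
        slope = np.polyfit(np.arange(n_windows), dists, 1)[0]
        return float(slope)
    ```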