Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can be used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition.
Comment: 24 pages, uses psfig.st
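The composition operation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes epsilon-free transducers, the tropical semiring (weights are costs that add along a path), and a hypothetical dict-based representation with the keys `start`, `finals`, and `arcs`. The names `compose`, `toy_a`, and `toy_b` are invented for this example.

```python
from collections import defaultdict, deque


def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Assumed (hypothetical) representation of a transducer:
      'start':  start state
      'finals': {state: final_weight}
      'arcs':   {state: [(in_label, out_label, weight, next_state), ...]}
    """
    start = (t1['start'], t2['start'])
    arcs = defaultdict(list)
    finals = {}
    queue = deque([start])
    seen = {start}
    while queue:
        q1, q2 = queue.popleft()
        # A pair state is final only if both components are final;
        # tropical "times" is ordinary addition of costs.
        if q1 in t1['finals'] and q2 in t2['finals']:
            finals[(q1, q2)] = t1['finals'][q1] + t2['finals'][q2]
        for a, b, w1, n1 in t1['arcs'].get(q1, []):
            for b2, c, w2, n2 in t2['arcs'].get(q2, []):
                # Match the output label of t1 with the input label of t2.
                if b == b2:
                    nxt = (n1, n2)
                    arcs[(q1, q2)].append((a, c, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {'start': start, 'finals': finals, 'arcs': dict(arcs)}


# Toy inputs: toy_a maps 'a' to 'x' at cost 1.0; toy_b maps 'x' to 'y' at cost 2.0.
toy_a = {'start': 0, 'finals': {1: 0.0}, 'arcs': {0: [('a', 'x', 1.0, 1)]}}
toy_b = {'start': 0, 'finals': {1: 0.5}, 'arcs': {0: [('x', 'y', 2.0, 1)]}}
result = compose(toy_a, toy_b)
```

Only state pairs reachable from the start pair are built, which is also what makes it plausible to apply the same routine lazily during recognition. Handling epsilon labels would need an extra filter that this sketch omits.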
Visual units and confusion modelling for automatic lip-reading
Automatic lip-reading (ALR) is a challenging task because the visual speech signal is known to be missing some important information, such as voicing. We propose an approach to ALR that acknowledges that this information is missing but assumes that it is substituted or deleted in a systematic way that can be modelled. We describe a system that learns such a model and then incorporates it into decoding, which is realised as a cascade of weighted finite-state transducers. Our results show a small but statistically significant improvement in recognition accuracy. We also investigate the issue of suitable visual units for ALR, and show that visemes are sub-optimal, not because they introduce lexical ambiguity, but because the reduction in modelling units entailed by their use reduces accuracy.
Weighted finite-state transducers in speech recognition : a compaction algorithm for non-determinizable transducers
Thesis digitized by the Direction des bibliothèques of the Université de Montréal