Spectral learning of general weighted automata via constrained matrix completion
Student Paper Awards NIPS 2012
Many tasks in text and speech processing and computational biology require estimating
functions mapping strings to real numbers. A broad class of such functions
can be defined by weighted automata. Spectral methods based on the singular
value decomposition of a Hankel matrix have been recently proposed for
learning a probability distribution represented by a weighted automaton from a
training sample drawn according to this same target distribution. In this paper, we
show how spectral methods can be extended to the problem of learning a general
weighted automaton from a sample generated by an arbitrary distribution. The
main obstruction to this approach is that, in general, some entries of the Hankel
matrix may be missing. We present a solution to this problem based on solving a
constrained matrix completion problem. Combining these two ingredients, matrix
completion and spectral methods, yields a whole new family of algorithms for learning
general weighted automata. We present generalization bounds for a
particular algorithm in this family. The proofs rely on a joint stability analysis of
matrix completion and spectral learning.
Peer Reviewed. Award-winning. Postprint (published version
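The obstruction named in the abstract above, missing Hankel entries, can be made concrete with a small sketch. It is not the paper's algorithm: the sample values, the prefix/suffix basis, and the helper names `strings_up_to` and `hankel_from_sample` are all assumptions chosen for illustration. The sketch only builds a finite Hankel block from a labeled sample and reports which entries remain unobserved.

```python
from itertools import product


def strings_up_to(alphabet, max_len):
    """Return all strings over `alphabet` of length 0..max_len, shortest first."""
    out = [""]
    for n in range(1, max_len + 1):
        out.extend("".join(t) for t in product(alphabet, repeat=n))
    return out


def hankel_from_sample(sample, prefixes, suffixes):
    """Build H with H[i][j] = sample[p + s], or None if p + s was never observed."""
    return [[sample.get(p + s) for s in suffixes] for p in prefixes]


# Hypothetical labeled sample (string -> real value); not data from the paper.
sample = {"": 1.0, "a": 0.5, "ab": 0.2, "ba": 0.3, "bb": 0.1}

basis = strings_up_to("ab", 1)  # prefixes and suffixes: "", "a", "b"
H = hankel_from_sample(sample, basis, basis)

# Entries that a completion step would have to fill in.
missing = [
    (p, s)
    for i, p in enumerate(basis)
    for j, s in enumerate(basis)
    if H[i][j] is None
]

for p, row in zip(basis, H):
    print(repr(p), row)
print("missing (prefix, suffix) pairs:", missing)
```

Every entry depends only on the concatenation of its prefix and suffix, so the same string can index several cells (here `H[0][1]` and `H[1][0]` both hold the value of "a"). Any filled-in matrix has to respect that structure, which is one way to read why the completion problem is a constrained one.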
Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can be used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition.
Comment: 24 pages, uses psfig.st
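As a rough illustration of the composition operation this abstract refers to, the sketch below composes two weighted transducers under simplifying assumptions that are not taken from the paper: no epsilon transitions, weights combined by addition (as with costs or negative log probabilities), and a hypothetical dictionary-based representation. The function name `compose` and the toy machines are made up for the example.

```python
from collections import deque


def compose(t1, t2):
    """Compose two epsilon-free weighted transducers.

    Each transducer is a dict with keys:
      "start":  start state
      "finals": {state: final weight}
      "arcs":   list of (src, in_label, out_label, weight, dst)
    Weights are added along matched arcs. Only state pairs reachable
    from the paired start state are constructed.
    """
    start = (t1["start"], t2["start"])
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        q1, q2 = state
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[state] = t1["finals"][q1] + t2["finals"][q2]
        for (s1, a, b, w1, r1) in t1["arcs"]:
            if s1 != q1:
                continue
            for (s2, b2, c, w2, r2) in t2["arcs"]:
                # Match t1's output label against t2's input label.
                if s2 == q2 and b2 == b:
                    dst = (r1, r2)
                    arcs.append((state, a, c, w1 + w2, dst))
                    if dst not in seen:
                        seen.add(dst)
                        queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy machines (hypothetical): t1 rewrites symbols x, y to labels A, B;
# t2 accepts label sequences and scores them.
t1 = {
    "start": 0,
    "finals": {0: 0.0},
    "arcs": [(0, "x", "A", 1.0, 0), (0, "x", "B", 2.0, 0), (0, "y", "B", 0.5, 0)],
}
t2 = {
    "start": 0,
    "finals": {1: 0.0},
    "arcs": [(0, "A", "A", 0.25, 1), (0, "B", "B", 0.75, 1), (1, "B", "B", 0.5, 1)],
}

result = compose(t1, t2)
for arc in result["arcs"]:
    print(arc)
```

Building only reachable state pairs keeps the result small. The same arc-matching loop could in principle be run lazily, expanding a state pair only when a search visits it, which is one way to picture combining sources during recognition rather than in advance.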
Use of Weighted Finite State Transducers in Part of Speech Tagging
This paper addresses issues in part of speech disambiguation using
finite-state transducers and presents two main contributions to the field. One
of them is the use of finite-state machines for part of speech tagging.
Linguistic and statistical information is represented in terms of weights on
transitions in weighted finite-state transducers. Another contribution is the
successful combination of techniques -- linguistic and statistical -- for word
disambiguation, compounded with the notion of word classes.
Comment: uses psfig, ipamac