Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can be used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition.
Comment: 24 pages, uses psfig.st
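The composition step described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes epsilon-free transducers and the tropical semiring (weights are added along a path), and the dictionary layout and the names `compose`, `A`, `B` and `C` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def compose(a, b):
    """Compose two epsilon-free weighted transducers (tropical semiring assumed).

    Each transducer is a dict with:
      'start'  : the start state
      'finals' : {state: final weight}
      'arcs'   : {state: [(in_label, out_label, weight, next_state), ...]}
    """
    start = (a["start"], b["start"])
    arcs = defaultdict(list)
    finals = {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        # A pair state is final only if both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for in_label, mid, w1, na in a["arcs"].get(qa, []):
            for mid2, out_label, w2, nb in b["arcs"].get(qb, []):
                # Arcs pair up when a's output label equals b's input label.
                if mid == mid2:
                    nxt = (na, nb)
                    arcs[state].append((in_label, out_label, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}


# Toy example: A rewrites "a" to "x" at cost 1.0, B rewrites "x" to "y" at cost 2.0.
A = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("a", "x", 1.0, 1)]}}
B = {"start": 0, "finals": {1: 0.5}, "arcs": {0: [("x", "y", 2.0, 1)]}}
C = compose(A, B)
```

Only pair states reachable from the start pair are built, which is why the same routine can in principle be applied lazily during recognition as well as ahead of time.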
Use of Weighted Finite State Transducers in Part of Speech Tagging
This paper addresses issues in part of speech disambiguation using
finite-state transducers and presents two main contributions to the field. One
of them is the use of finite-state machines for part of speech tagging.
Linguistic and statistical information is represented in terms of weights on
transitions in weighted finite-state transducers. Another contribution is the
successful combination of techniques -- linguistic and statistical -- for word
disambiguation, compounded with the notion of word classes.
Comment: uses psfig, ipamac
Lipschitz Robustness of Finite-state Transducers
We investigate the problem of checking if a finite-state transducer is robust
to uncertainty in its input. Our notion of robustness is based on the analytic
notion of Lipschitz continuity: a transducer is K-(Lipschitz) robust if the
perturbation in its output is at most K times the perturbation in its input. We
quantify input and output perturbation using similarity functions. We show that
K-robustness is undecidable even for deterministic transducers. We identify a
class of functional transducers, which admits a polynomial time
automata-theoretic decision procedure for K-robustness. This class includes
Mealy machines and functional letter-to-letter transducers. We also study
K-robustness of nondeterministic transducers. Since a nondeterministic
transducer generates a set of output words for each input word, we quantify
output perturbation using set-similarity functions. We show that K-robustness
of nondeterministic transducers is undecidable, even for letter-to-letter
transducers. We identify a class of set-similarity functions which admit
decidable K-robustness of letter-to-letter transducers.
Comment: In FSTTCS 201
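To make the K-robustness condition concrete, the sketch below searches for a violating pair of inputs on a small Mealy machine. It is a bounded brute-force check, not the paper's automata-theoretic decision procedure. Hamming distance is assumed as the similarity function on both inputs and outputs, and the names `run_mealy`, `find_violation`, `identity` and `latch` are hypothetical.

```python
from itertools import product


def run_mealy(delta, start, word):
    """Run a Mealy machine; delta maps (state, symbol) -> (next_state, output)."""
    state, out = start, []
    for sym in word:
        state, o = delta[(state, sym)]
        out.append(o)
    return "".join(out)


def hamming(u, v):
    """Hamming distance between two words of equal length."""
    return sum(x != y for x, y in zip(u, v))


def find_violation(delta, start, alphabet, k, max_len):
    """Return a pair (u, v) with d(out(u), out(v)) > k * d(u, v), or None.

    Only equal-length words up to max_len are examined, so None means
    "no violation found within the bound", not a proof of robustness.
    """
    for n in range(1, max_len + 1):
        words = ["".join(w) for w in product(alphabet, repeat=n)]
        for u in words:
            for v in words:
                d_in = hamming(u, v)
                d_out = hamming(run_mealy(delta, start, u),
                                run_mealy(delta, start, v))
                if d_out > k * d_in:
                    return (u, v)
    return None


# Identity machine: echoes its input, so output distance equals input distance.
identity = {(0, "a"): (0, "a"), (0, "b"): (0, "b")}

# Latch machine: once a "b" is seen, it outputs "b" forever.
latch = {(0, "a"): (0, "a"), (0, "b"): (1, "b"),
         (1, "a"): (1, "b"), (1, "b"): (1, "b")}

identity_result = find_violation(identity, 0, "ab", 1, 3)
latch_result = find_violation(latch, 0, "ab", 1, 3)
```

For the latch machine a single changed input letter can change every later output letter, so no fixed K bounds the output perturbation as inputs grow; the identity machine shows no violation for K = 1 within the bound.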
Regular Combinators for String Transformations
We focus on (partial) functions that map input strings to a monoid such as
the set of integers with addition and the set of output strings with
concatenation. The notion of regularity for such functions has been defined
using two-way finite-state transducers, (one-way) cost register automata, and
MSO-definable graph transformations. In this paper, we give an algebraic and
machine-independent characterization of this class analogous to the definition
of regular languages by regular expressions. When the monoid is commutative, we
prove that every regular function can be constructed from constant functions
using the combinators of choice, split sum, and iterated sum, which are analogs
of union, concatenation, and Kleene-*, respectively, but enforce unique (or
unambiguous) parsing. Our main result is for the general case of
non-commutative monoids, which is of particular interest for capturing regular
string-to-string transformations for document processing. We prove that the
following additional combinators suffice for constructing all regular
functions: (1) the left-additive versions of split sum and iterated sum, which
allow transformations such as string reversal; (2) sum of functions, which
allows transformations such as copying of strings; and (3) function
composition, or alternatively, a new concept of chained sum, which allows
output values from adjacent blocks to mix.
Comment: This is the full version, with omitted proofs and constructions, of
the conference paper currently in submissio
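The three combinators named for the commutative case can be sketched over the integers with addition. This is an illustrative reading of the abstract, not the paper's formal definitions: partial functions are modelled as Python callables that return None where undefined, base functions are constants on a regular expression, choice is assumed to take the first defined branch, and iterated sum is assumed to use non-empty blocks. All function names are hypothetical.

```python
import re


def const(pattern, value):
    """Partial function: `value` on strings fully matching `pattern`, else None."""
    rx = re.compile(pattern)
    return lambda w: value if rx.fullmatch(w) else None


def choice(f, g):
    """Analog of union: use f where it is defined, otherwise fall back to g."""
    def h(w):
        r = f(w)
        return r if r is not None else g(w)
    return h


def split_sum(f, g):
    """Analog of concatenation: defined only if w splits as u + v in exactly one way."""
    def h(w):
        results = []
        for i in range(len(w) + 1):
            left, right = f(w[:i]), g(w[i:])
            if left is not None and right is not None:
                results.append(left + right)
        return results[0] if len(results) == 1 else None
    return h


def iterated_sum(f):
    """Analog of Kleene star: defined only if w has a unique split into f-blocks."""
    def h(w):
        n = len(w)
        count = [0] * (n + 1)  # number of decompositions of w[:i]
        total = [0] * (n + 1)  # summed value of a decomposition of w[:i]
        count[0] = 1
        for i in range(1, n + 1):
            for j in range(i):
                if count[j]:
                    v = f(w[j:i])
                    if v is not None:
                        count[i] += count[j]
                        total[i] = total[j] + v
        return total[n] if count[n] == 1 else None
    return h


# Count the occurrences of "a" in a string over {a, b}.
count_a = iterated_sum(choice(const("a", 1), const("b", 0)))

# An ambiguous split: "aa" can be divided between the two a* parts in three ways.
ambiguous = split_sum(const("a*", 1), const("a*", 1))
```

Here `count_a` parses its input letter by letter, so every string over {a, b} has exactly one parse, while `ambiguous("aa")` is undefined because the split is not unique.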