7,388 research outputs found

    Speech Recognition by Composition of Weighted Finite Automata

    Full text link
    We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.Comment: 24 pages, uses psfig.st

    Use of Weighted Finite State Transducers in Part of Speech Tagging

    Full text link
    This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques -- linguistic and statistical -- for word disambiguation, compounded with the notion of word classes.Comment: uses psfig, ipamac

    Lipschitz Robustness of Finite-state Transducers

    Get PDF
    We investigate the problem of checking if a finite-state transducer is robust to uncertainty in its input. Our notion of robustness is based on the analytic notion of Lipschitz continuity --- a transducer is K-(Lipschitz) robust if the perturbation in its output is at most K times the perturbation in its input. We quantify input and output perturbation using similarity functions. We show that K-robustness is undecidable even for deterministic transducers. We identify a class of functional transducers, which admits a polynomial time automata-theoretic decision procedure for K-robustness. This class includes Mealy machines and functional letter-to-letter transducers. We also study K-robustness of nondeterministic transducers. Since a nondeterministic transducer generates a set of output words for each input word, we quantify output perturbation using set-similarity functions. We show that K-robustness of nondeterministic transducers is undecidable, even for letter-to-letter transducers. We identify a class of set-similarity functions which admit decidable K-robustness of letter-to-letter transducers.Comment: In FSTTCS 201

    Regular Combinators for String Transformations

    Full text link
    We focus on (partial) functions that map input strings to a monoid such as the set of integers with addition and the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent characterization of this class analogous to the definition of regular languages by regular expressions. When the monoid is commutative, we prove that every regular function can be constructed from constant functions using the combinators of choice, split sum, and iterated sum, that are analogs of union, concatenation, and Kleene-*, respectively, but enforce unique (or unambiguous) parsing. Our main result is for the general case of non-commutative monoids, which is of particular interest for capturing regular string-to-string transformations for document processing. We prove that the following additional combinators suffice for constructing all regular functions: (1) the left-additive versions of split sum and iterated sum, which allow transformations such as string reversal; (2) sum of functions, which allows transformations such as copying of strings; and (3) function composition, or alternatively, a new concept of chained sum, which allows output values from adjacent blocks to mix.Comment: This is the full version, with omitted proofs and constructions, of the conference paper currently in submissio
    • …
    corecore