13,668 research outputs found
TypEx : a type based approach to XML stream querying
We consider the topic of query evaluation over semistructured information streams, and XML data streams in particular. Streaming evaluation methods are necessarily eventdriven, which is in tension with high-level query models; in general, the more expressive the query language, the harder it is to translate queries into an event-based implementation with finite resource bounds
Speech Recognition by Composition of Weighted Finite Automata
We present a general framework based on weighted finite automata and weighted
finite-state transducers for describing and implementing speech recognizers.
The framework allows us to represent uniformly the information sources and data
structures used in recognition, including context-dependent units,
pronunciation dictionaries, language models and lattices. Furthermore, general
but efficient algorithms can used for combining information sources in actual
recognizers and for optimizing their application. In particular, a single
composition algorithm is used both to combine in advance information sources
such as language models and dictionaries, and to combine acoustic observations
and information sources dynamically during recognition.Comment: 24 pages, uses psfig.st
On the Disambiguation of Weighted Automata
We present a disambiguation algorithm for weighted automata. The algorithm
admits two main stages: a pre-disambiguation stage followed by a transition
removal stage. We give a detailed description of the algorithm and the proof of
its correctness. The algorithm is not applicable to all weighted automata but
we prove sufficient conditions for its applicability in the case of the
tropical semiring by introducing the *weak twins property*. In particular, the
algorithm can be used with all acyclic weighted automata, relevant to
applications. While disambiguation can sometimes be achieved using
determinization, our disambiguation algorithm in some cases can return a result
that is exponentially smaller than any equivalent deterministic automaton. We
also present some empirical evidence of the space benefits of disambiguation
over determinization in speech recognition and machine translation
applications
Efficient Tabular LR Parsing
We give a new treatment of tabular LR parsing, which is an alternative to
Tomita's generalized LR algorithm. The advantage is twofold. Firstly, our
treatment is conceptually more attractive because it uses simpler concepts,
such as grammar transformations and standard tabulation techniques also know as
chart parsing. Secondly, the static and dynamic complexity of parsing, both in
space and time, is significantly reduced.Comment: 8 pages, uses aclap.st
- …