7,413 research outputs found
Probabilistic mathematical formula recognition using a 2D context-free graph grammar
We present a probabilistic framework for the mathematical expression recognition problem. The developed system is flexible in that its grammar can be extended easily thanks to its graph grammar which eliminates the need for specifying rule precedence. It is also optimal in the sense that all possible interpretations of the expressions are expanded without making early commitments or hard decisions. In this paper, we give an overview of the whole system and describe in detail the graph grammar and the parsing process used in the system, along with some preliminary results on character, structure and expression recognition performances
The Unsupervised Acquisition of a Lexicon from Continuous Speech
We present an unsupervised learning algorithm that acquires a
natural-language lexicon from raw speech. The algorithm is based on the optimal
encoding of symbol sequences in an MDL framework, and uses a hierarchical
representation of language that overcomes many of the problems that have
stymied previous grammar-induction procedures. The forward mapping from symbol
sequences to the speech stream is modeled using features based on articulatory
gestures. We present results on the acquisition of lexicons and language models
from raw speech, text, and phonetic transcripts, and demonstrate that our
algorithm compares very favorably to other reported results with respect to
segmentation performance and statistical efficiency.Comment: 27 page technical repor
An Alternative Conception of Tree-Adjoining Derivation
The precise formulation of derivation for tree-adjoining grammars has
important ramifications for a wide variety of uses of the formalism, from
syntactic analysis to semantic interpretation and statistical language
modeling. We argue that the definition of tree-adjoining derivation must be
reformulated in order to manifest the proper linguistic dependencies in
derivations. The particular proposal is both precisely characterizable through
a definition of TAG derivations as equivalence classes of ordered derivation
trees, and computationally operational, by virtue of a compilation to linear
indexed grammars together with an efficient algorithm for recognition and
parsing according to the compiled grammar.Comment: 33 page
- …