Memoisation: Purely, Left-recursively, and with (Continuation Passing) Style
Memoisation, or tabling, is a well-known technique that yields large
improvements in the performance of some recursive computations. Tabled
resolution in Prologs such as XSB and B-Prolog can transform so-called
left-recursive predicates from non-terminating computations into finite and
well-behaved ones. In the functional programming literature, memoisation has
usually been implemented in a way that does not handle left-recursion,
requiring supplementary mechanisms to prevent non-termination. A notable
exception is Johnson's (1995) continuation passing approach in Scheme. This,
however, relies on mutation of a memo table data structure and coding in
explicit continuation passing style. We show how Johnson's approach can be
implemented purely functionally in a modern, strongly typed functional language
(OCaml), presented via a monadic interface that hides the implementation
details yet provides a way to return a compact representation of the memo
tables at the end of the computation.
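
To make the technique concrete, here is a minimal OCaml sketch of memoised CPS
recognition in the style of Johnson (1995). It uses a mutable memo table (the
formulation the paper refines), not the paper's pure monadic interface, and the
recogniser type, the combinators (term, seq, alt, memo) and the toy
left-recursive grammar are illustrative assumptions rather than code from the
paper.

    (* A recogniser takes a start position and a continuation that is invoked
       once for every end position the recogniser can reach. *)
    type recogniser = int -> (int -> unit) -> unit

    let input = [| "a"; "a"; "a" |]   (* toy token sequence *)

    (* Terminal: succeed at position i when the token there matches. *)
    let term tok : recogniser = fun i k ->
      if i < Array.length input && input.(i) = tok then k (i + 1)

    (* Sequencing and alternation of recognisers. *)
    let seq p q : recogniser = fun i k -> p i (fun j -> q j k)
    let alt p q : recogniser = fun i k -> p i k; q i k

    (* Memoisation: per start position, remember the end positions found so
       far and the continuations waiting for them. *)
    let memo (p : recogniser) : recogniser =
      let table : (int, int list ref * (int -> unit) list ref) Hashtbl.t =
        Hashtbl.create 16 in
      fun i k ->
        match Hashtbl.find_opt table i with
        | Some (results, conts) ->
            conts := k :: !conts;
            List.iter k !results                   (* replay results seen so far *)
        | None ->
            let results = ref [] and conts = ref [ k ] in
            Hashtbl.add table i (results, conts);
            p i (fun j ->
              if not (List.mem j !results) then begin
                results := j :: !results;
                List.iter (fun k' -> k' j) !conts  (* notify all waiters *)
              end)

    (* Left-recursive grammar  S -> S "a" | "a" , tied together with a ref. *)
    let s : recogniser =
      let self = ref (fun _ _ -> ()) in
      let s i k = !self i k in
      self := memo (alt (seq s (term "a")) (term "a"));
      s

    let () =
      s 0 (fun j ->
        if j = Array.length input then print_endline "recognised the whole input")

The point of the memo table is termination under left recursion: the
left-recursive call to S at position 0 finds an existing table entry, registers
its continuation and returns, and the result produced by the base case "a" is
later fed back to that continuation instead of the call looping forever.
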
Robust Probabilistic Predictive Syntactic Processing
This thesis presents a broad-coverage probabilistic top-down parser, and its
application to the problem of language modeling for speech recognition. The
parser builds fully connected derivations incrementally, in a single pass from
left to right across the string. We argue that the parsing approach that we
have adopted is well-motivated from a psycholinguistic perspective, as a model
that captures probabilistic dependencies between lexical items, as part of the
process of building connected syntactic structures. The basic parser and
conditional probability models are presented, and empirical results are
provided for its parsing accuracy on both newspaper text and spontaneous
telephone conversations. Modifications to the probability model are presented
that lead to improved performance. A new language model which uses the output
of the parser is then defined. Reductions in perplexity and word error rate are
demonstrated over trigram models, even when the trigram is trained on
significantly more data. Interpolation on a word-by-word basis with a trigram
model yields additional improvements.
Comment: Ph.D. Thesis, Brown University, Advisor: Mark Johnson. 140 pages, 40 figures, 27 tables.
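
As a rough illustration of the word-by-word interpolation mentioned above, the
sketch below (in OCaml, for consistency with the previous example) mixes a
per-word parser probability with a trigram probability using a fixed weight and
reports the resulting perplexity. The weight, the probabilities and the
function names are invented for illustration and are not figures or code from
the thesis.

    (* Word-by-word linear interpolation of two language models:
       P(w | h) = lambda * P_parser(w | h) + (1 - lambda) * P_trigram(w | h) *)
    let interpolate ~lambda p_parser p_trigram =
      (lambda *. p_parser) +. ((1.0 -. lambda) *. p_trigram)

    (* Perplexity of a word sequence from its per-word probabilities. *)
    let perplexity probs =
      let n = float_of_int (List.length probs) in
      let log_sum = List.fold_left (fun acc p -> acc +. log p) 0.0 probs in
      exp (-. (log_sum /. n))

    let () =
      let parser_ps  = [ 0.20; 0.05; 0.30 ] in  (* hypothetical parser LM probabilities *)
      let trigram_ps = [ 0.10; 0.08; 0.25 ] in  (* hypothetical trigram probabilities *)
      let mixed = List.map2 (interpolate ~lambda:0.4) parser_ps trigram_ps in
      Printf.printf "interpolated perplexity: %.2f\n" (perplexity mixed)
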