319 research outputs found
An Efficient Implementation of the Head-Corner Parser
This paper describes an efficient and robust implementation of a
bi-directional, head-driven parser for constraint-based grammars. This parser
is developed for the OVIS system: a Dutch spoken dialogue system in which
information about public transport can be obtained by telephone.
After a review of the motivation for head-driven parsing strategies, and
head-corner parsing in particular, a non-deterministic version of the
head-corner parser is presented. A memoization technique is applied to obtain a
fast parser. A goal-weakening technique is introduced which greatly improves
average case efficiency, both in terms of speed and space requirements.
I argue in favor of such a memoization strategy with goal-weakening in
comparison with ordinary chart-parsers because such a strategy can be applied
selectively and therefore enormously reduces the space requirements of the
parser, while no practical loss in time-efficiency is observed. On the
contrary, experiments are described in which head-corner and left-corner
parsers implemented with selective memoization and goal weakening outperform
`standard' chart parsers. The experiments include the grammar of the OVIS
system and the Alvey NL Tools grammar.
Head-corner parsing is a mix of bottom-up and top-down processing. Certain
approaches towards robust parsing require purely bottom-up processing.
Therefore, it seems that head-corner parsing is unsuitable for such robust
parsing techniques. However, it is shown how underspecification (which arises
very naturally in a logic programming environment) can be used in the
head-corner parser to allow such robust parsing techniques. A particular robust
parsing model is described which is implemented in OVIS.Comment: 31 pages, uses cl.st
Parsing Expression Grammars Made Practical
Parsing Expression Grammars (PEGs) define languages by specifying
recursive-descent parser that recognises them. The PEG formalism exhibits
desirable properties, such as closure under composition, built-in
disambiguation, unification of syntactic and lexical concerns, and closely
matching programmer intuition. Unfortunately, state of the art PEG parsers
struggle with left-recursive grammar rules, which are not supported by the
original definition of the formalism and can lead to infinite recursion under
naive implementations. Likewise, support for associativity and explicit
precedence is spotty. To remedy these issues, we introduce Autumn, a general
purpose PEG library that supports left-recursion, left and right associativity
and precedence rules, and does so efficiently. Furthermore, we identify infix
and postfix operators as a major source of inefficiency in left-recursive PEG
parsers and show how to tackle this problem. We also explore the extensibility
of the PEG paradigm by showing how one can easily introduce new parsing
operators and how our parser accommodates custom memoization and error handling
strategies. We compare our parser to both state of the art and battle-tested
PEG and CFG parsers, such as Rats!, Parboiled and ANTLR.Comment: "Proceedings of the International Conference on Software Language
Engineering (SLE 2015)" - 167-172 (ISBN : 978-1-4503-3686-4
TDL--- A Type Description Language for Constraint-Based Grammars
This paper presents \tdl, a typed feature-based representation language and
inference system. Type definitions in \tdl\ consist of type and feature
constraints over the boolean connectives. \tdl\ supports open- and closed-world
reasoning over types and allows for partitions and incompatible types. Working
with partially as well as with fully expanded types is possible. Efficient
reasoning in \tdl\ is accomplished through specialized modules.Comment: Will Appear in Proc. COLING-9
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.Comment: master thesi
- …