PSLR(1): Pseudo-Scannerless Minimal LR(1) for the Deterministic Parsing of Composite Languages
Composite languages are composed of multiple sub-languages. Examples include the parser specification languages read by parser generators like Yacc, modern extensible languages with complex layers of domain-specific sub-languages, and even traditional programming languages like C and C++. In this dissertation, we describe PSLR(1), a new scanner-based LR(1) parser generation system that automatically eliminates scanner conflicts typically caused by language composition. The fundamental premise of PSLR(1) is the pseudo-scanner, a scanner that only recognizes tokens accepted by the current parser state. However, use of the pseudo-scanner raises several unique challenges, for which we describe a novel set of solutions. One major challenge is that practical LR(1) parser table generation algorithms merge parser states, sometimes inducing incorrect pseudo-scanner behavior including new conflicts. Our solution is a new extension of IELR(1), an algorithm we have previously described for generating minimal LR(1) parser tables. Other contributions of our work include a robust system for handling the remaining scanner conflicts, a correction for syntax error handling mechanisms that are also corrupted by parser state merging, and a mechanism to enable scoping of syntactic declarations in order to further improve the modularity of sub-language specifications. While the premise of the pseudo-scanner has been described by other researchers independently, we expect our improvements to distinguish PSLR(1) as a significantly more robust scanner-based parser generation system for traditional and modern composite languages.
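The pseudo-scanner premise in the abstract above can be illustrated with a minimal Python sketch. It is not the PSLR(1) implementation: the token names, the regular expressions, the `pseudo_scan` function, and the alphabetical tie-break are all assumptions made for illustration. The only idea taken from the abstract is that the scanner considers just the tokens the current parser state accepts.

```python
import re

# Hypothetical token set for a toy composite language (not from PSLR(1)).
# "if" matches both KEYWORD_IF and ID, which is a classic scanner conflict.
TOKEN_PATTERNS = {
    "KEYWORD_IF": re.compile(r"if"),
    "ID": re.compile(r"[A-Za-z_]\w*"),
    "NUMBER": re.compile(r"\d+"),
}


def pseudo_scan(text, pos, acceptable):
    """Return (token_name, lexeme) for the longest match at `pos`, trying
    only the tokens in `acceptable` (those the current parser state can
    accept). Return None if no acceptable token matches.

    Ties in length are broken by alphabetical token name; this is an
    assumption of the sketch, not the conflict handling of PSLR(1).
    """
    best = None
    for name in sorted(acceptable):
        match = TOKEN_PATTERNS[name].match(text, pos)
        if match and match.end() > pos:
            lexeme = match.group()
            if best is None or len(lexeme) > len(best[1]):
                best = (name, lexeme)
    return best


# The same input is tokenized differently depending on the parser state.
in_expression_state = pseudo_scan("if x", 0, {"ID", "NUMBER"})
in_statement_state = pseudo_scan("if x", 0, {"KEYWORD_IF"})
```

Because the set of acceptable tokens comes from the parser state, the conflict between `KEYWORD_IF` and `ID` never arises in a state that accepts only one of them. A state that accepts both would still need an explicit resolution rule.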
Syntactic analysis of LR(k) languages
PhD Thesis
A method of syntactic analysis, termed LA(m)LR(k), is discussed
theoretically. Knuth's LR(k) algorithm is included as the special
case m = k. A simpler variant, SLA(m)LR(k), is also described, which
in the case SLA(k)LR(0) is equivalent to the SLR(k) algorithm as
defined by DeRemer. Both variants have the LR(k) property of
immediate detection of syntactic errors.
The case m = 1, k = 0 is examined in detail, where the methods
provide a practical parsing technique of greater generality than
precedence methods in current use. A formal comparison is made with
the weak precedence algorithm.
The implementation of an SLA(1)LR(0) parser (SLR) is described,
involving numerous space and time optimisations. Of importance is a
technique for bypassing unnecessary steps in a syntactic derivation.
Direct comparisons are made, primarily with the simple precedence
parser of the highly efficient Stanford AlgolW compiler, and confirm
the practical feasibility of the SLR parser.
The Science Research Council
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.
Comment: 45 pages. Slightly shortened version to appear in Computational Linguistics 2
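For readers unfamiliar with quantities (a) and (b) in the abstract above, the following states the standard definitions of string and prefix probability for a stochastic context-free grammar. The notation is an assumption of this note, not taken from the abstract: $S$ is the start symbol, $\Sigma$ the terminal alphabet, $x$ the input string or prefix, and $P(r)$ the probability attached to rule $r$.

```latex
% String probability: sum over all parse trees t of x, where each tree's
% probability is the product of the probabilities of the rules it applies.
\[
  P(S \Rightarrow^{*} x) \;=\; \sum_{t \,\in\, \mathrm{trees}(x)} \; \prod_{r \,\in\, t} P(r)
\]
% Prefix probability: total probability of all strings that begin with x.
\[
  P(S \Rightarrow^{*}_{L} x) \;=\; \sum_{y \,\in\, \Sigma^{*}} P(S \Rightarrow^{*} x\,y)
\]
```

The second sum ranges over infinitely many continuations $y$, so it cannot be evaluated by enumeration. This is why a dedicated algorithm for prefix probabilities is needed.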