Practical LR Parser Generation

Author: Zimmerman Joe
Publication date: 17/09/2022

Parsing is a fundamental building block in modern compilers, and for industrial programming languages, it is a surprisingly involved task. There are known approaches to generate parsers automatically, but the prevailing consensus is that automatic parser generation is not practical for real programming languages: LR/LALR parsers are considered to be far too restrictive in the grammars they support, and LR parsers are often considered too inefficient in practice. As a result, virtually all modern languages use recursive-descent parsers written by hand, a lengthy and error-prone process that dramatically increases the barrier to new programming language development. In this work we demonstrate that, contrary to the prevailing consensus, we can have the best of both worlds: for a very general, practical class of grammars -- a strict superset of Knuth's canonical LR -- we can generate parsers automatically, and the resulting parser code, as well as the generation procedure itself, is highly efficient. This advance relies on several new ideas, including novel automata optimization procedures; a new grammar transformation ("CPS"); per-symbol attributes; recursive-descent actions; and an extension of canonical LR parsing, which we refer to as XLR, which endows shift/reduce parsers with the power of bounded nondeterministic choice. With these ingredients, we can automatically generate efficient parsers for virtually all programming languages that are intuitively easy to parse -- a claim we support experimentally, by implementing the new algorithms in a new software tool called langcc, and running them on syntax specifications for Golang 1.17.8 and Python 3.9.12. The tool handles both languages automatically, and the generated code, when run on standard codebases, is 1.2x faster than the corresponding hand-written parser for Golang, and 4.3x faster than the CPython parser.

arXiv.org e-Print Archive

Object-oriented LR(1) parser generation

Author: Luckett Christopher
Publication venue: RIT Scholar Works
Publication date: 01/01/2006

The LR parser has been around for a long time, and its workings, especially with respect to table compaction and use of the lookahead sets, have puzzled students who are new to the area of study. The aim of this project is therefore to provide an object-oriented approach and discoverable algorithm to ease the difficulty of mastering these concepts. This will be accomplished by distributing the table interpreter across objects whose inter-relationships create an analogue to the state table. Hopefully this will provide a greater degree of readability and ease of trace throughout the parser generation process.

RIT Scholar Works

Reachability and error diagnosis in LR(1) automata

Authors: Grosch Josef; Grune Dick; Gupta Kartik; Horning James J.; Parr Terence
Publication venue: HAL CCSD
Publication date: 01/01/2016

Given an LR(1) automaton, what are the states in which an error can be detected? For each such "error state", what is a minimal input sentence that causes an error in this state? We propose an algorithm that answers these questions. Such an algorithm allows building a collection of pairs of an erroneous input sentence and a diagnostic message, ensuring that this collection covers every error state, and maintaining this property as the grammar evolves. We report on an application of this technique to the CompCert ISO C99 parser, and discuss its strengths and limitations.

INRIA a CCSD electronic archive server