Practical LR Parser Generation
Parsing is a fundamental building block in modern compilers, and for
industrial programming languages, it is a surprisingly involved task. There are
known approaches to generate parsers automatically, but the prevailing
consensus is that automatic parser generation is not practical for real
programming languages: LR/LALR parsers are considered to be far too restrictive
in the grammars they support, and LR parsers are often considered too
inefficient in practice. As a result, virtually all modern languages use
recursive-descent parsers written by hand, a lengthy and error-prone process
that dramatically increases the barrier to new programming language
development.
In this work we demonstrate that, contrary to the prevailing consensus, we
can have the best of both worlds: for a very general, practical class of
grammars -- a strict superset of Knuth's canonical LR -- we can generate
parsers automatically, and the resulting parser code, as well as the generation
procedure itself, is highly efficient. This advance relies on several new
ideas, including novel automata optimization procedures; a new grammar
transformation ("CPS"); per-symbol attributes; recursive-descent actions; and
an extension of canonical LR parsing that we call XLR, which endows
shift/reduce parsers with the power of bounded nondeterministic choice.
With these ingredients, we can automatically generate efficient parsers for
virtually all programming languages that are intuitively easy to parse -- a
claim we support experimentally, by implementing the new algorithms in a new
software tool called langcc, and running them on syntax specifications for
Golang 1.17.8 and Python 3.9.12. The tool handles both languages automatically,
and the generated code, when run on standard codebases, is 1.2x faster than the
corresponding hand-written parser for Golang and 4.3x faster than the CPython
parser.
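For readers unfamiliar with the shift/reduce model that the abstract builds on, the following is a minimal sketch of a deterministic table-driven shift/reduce parser. It is not langcc's algorithm and does not implement XLR or any of the optimizations named above. The toy grammar (E -> E "+" "n" | "n"), the hand-written ACTION and GOTO tables, and the function name parse are all assumptions made for illustration.

```python
# Toy grammar (hypothetical, for illustration only):
#   rule 1: E -> E "+" "n"
#   rule 2: E -> "n"
RULES = {1: ("E", 3), 2: ("E", 1)}  # rule id -> (left-hand side, right-hand side length)

# Hand-written parse tables for the toy grammar; "$" marks end of input.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def parse(tokens):
    """Return the list of rule ids applied, or None if the input is rejected."""
    stack = [0]                      # stack of automaton states
    stream = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        key = (stack[-1], stream[pos])
        if key not in ACTION:
            return None              # no table entry: syntax error
        kind, arg = ACTION[key]
        if kind == "shift":
            stack.append(arg)        # consume one token and enter the new state
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]      # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:                        # accept
            return reductions
```

In this sketch every table cell holds exactly one action. A grammar that needs two actions in the same cell is rejected by a plain LR generator, which is the restriction the abstract describes XLR as relaxing through bounded nondeterministic choice.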
User support for software development technologies
The adoption of software development technologies is very closely related to the topic
of user support. This is especially true in early phases, when the users are not familiar
with the modification or the build processes of the software that has to be developed nor
with the technology used for software development. This work introduces an approach
to improve the usability of software development technologies represented by the Combinatory
Logic Synthesizer (CL)S Framework. (CL)S is based on a type inhabitation
algorithm for the combinatory logic with intersection types and aims to automatically
create software components from a domain-specified repository. The framework yields
a complete enumeration of all inhabitants. The inhabitation results are computed in
the form of tree grammars. Unfortunately, the underlying type system allows only limited
application of domain-specific knowledge. To compensate for this limitation, this work provides
a framework with two main aspects: debugging intersection type specifications and filtering
inhabitation results using domain-specific constraints. The aim of the debugger is
to make potentially incomplete or erroneous input specifications and decisions of the
inhabitation algorithm understandable for those who are not experts in the field of type
theory. The combination of tree grammars and graph theory forms the foundation of a
clear representation of the computed results that informs users about the search process
of the algorithm. The graphical representations are based on hypergraphs that illustrate
the inhabitation in a step-wise fashion. Within the scope of this work, three filtering algorithms
were implemented and investigated. The filtering algorithm integrated into the
framework for user support and used for the restriction of inhabitation results is practically
feasible and represents a clear improvement compared to existing approaches. It is
based on modifying the tree grammars resulting from the (CL)S Framework. Additionally,
the usability of the (CL)S Framework is supported by eight perspectives included in a
web-based integrated development environment (IDE) that provides detailed graphical
and textual information about the synthesis.
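To make the notions of a tree grammar and of filtering its results concrete, the following is a minimal sketch. It is not the (CL)S Framework's API. The grammar encoding, the names GRAMMAR, enumerate_trees, count and filter_trees, and the example constraint are all assumptions made for illustration. The sketch also filters the enumerated terms after the fact, which is a simpler stand-in for the approach described above, where the tree grammar itself is modified.

```python
from itertools import product

# Hypothetical tree grammar: each nonterminal maps to a list of
# (combinator name, argument nonterminals) alternatives.
GRAMMAR = {
    "Int": [("zero", []), ("succ", ["Int"])],
}


def enumerate_trees(grammar, symbol, depth):
    """Yield every tree derivable from `symbol` with height at most `depth`."""
    if depth <= 0:
        return
    for combinator, args in grammar[symbol]:
        # All candidate subtrees for each argument position.
        options = [list(enumerate_trees(grammar, a, depth - 1)) for a in args]
        for children in product(*options):
            yield (combinator, children)


def count(tree, name):
    """Count how often the combinator `name` occurs in `tree`."""
    combinator, children = tree
    return (combinator == name) + sum(count(c, name) for c in children)


def filter_trees(trees, predicate):
    """Keep only the trees that satisfy a domain-specific constraint."""
    return [t for t in trees if predicate(t)]


trees = list(enumerate_trees(GRAMMAR, "Int", 3))
# Example constraint (an assumption): use "succ" at most once.
kept = filter_trees(trees, lambda t: count(t, "succ") <= 1)
```

The depth bound is needed because a recursive grammar such as this one describes infinitely many trees. Post-hoc filtering of this kind has to enumerate every candidate first, which is one reason a grammar-level restriction can be preferable in practice.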