Recovering Grammar Relationships for the Java Language Specification
Grammar convergence is a method that helps discover relationships between
different grammars of the same language or different language versions. The key
element of the method is the operational, transformation-based representation
of those relationships. Given input grammars for convergence, they are
transformed until they are structurally equal. The transformations are composed
from primitive operators; properties of these operators and the composed chains
provide quantitative and qualitative insight into the relationships between the
grammars at hand. We describe a refined method for grammar convergence, and we
use it in a major study, where we recover the relationships between all the
grammars that occur in the different versions of the Java Language
Specification (JLS). The relationships are represented as grammar
transformation chains that capture all accidental or intended differences
between the JLS grammars. This method is mechanized and driven by nominal and
structural differences between pairs of grammars that are subject to
asymmetric, binary convergence steps. We present the underlying operator suite
for grammar transformation in detail, and we illustrate the suite with many
examples of transformations on the JLS grammars. We also describe the
extraction effort, which was needed to make the JLS grammars amenable to
automated processing. We include substantial metadata about the convergence
process for the JLS so that the effort becomes reproducible and transparent.
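The idea of transforming grammars until they are structurally equal can be illustrated with a minimal sketch. The grammar encoding (a dict from nonterminal to a set of alternatives) and the two operators `rename` and `inline` are assumptions made for this illustration; they are not the operator suite presented in the paper.

```python
# Toy illustration only: the encoding and operator names are hypothetical.

def rename(grammar, old, new):
    """Rename nonterminal `old` to `new` in all definitions and uses."""
    def fix(symbol):
        return new if symbol == old else symbol
    return {
        fix(lhs): {tuple(fix(s) for s in alt) for alt in alts}
        for lhs, alts in grammar.items()
    }

def inline(grammar, name):
    """Replace each use of `name` by its single alternative, then drop `name`."""
    (body,) = grammar[name]  # only valid for nonterminals with one alternative
    result = {}
    for lhs, alts in grammar.items():
        if lhs == name:
            continue
        new_alts = set()
        for alt in alts:
            expanded = []
            for s in alt:
                expanded.extend(body if s == name else [s])
            new_alts.add(tuple(expanded))
        result[lhs] = new_alts
    return result

def converge(grammar, chain):
    """Apply a chain of (operator, *arguments) steps in order."""
    for op, *args in chain:
        grammar = op(grammar, *args)
    return grammar

# Two small grammars that differ nominally (Stmt vs. Statement)
# and structurally (Cond is factored out in g1 only).
g1 = {"Stmt": {("if", "Cond", "Block")}, "Cond": {("(", "Expr", ")")}}
g2 = {"Statement": {("if", "(", "Expr", ")", "Block")}}

chain = [(inline, "Cond"), (rename, "Stmt", "Statement")]
assert converge(g1, chain) == g2
```

In this sketch the chain itself records the relationship between the two grammars: its length and the kinds of operators it uses give a rough measure of how far apart they are.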
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDEs (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.
Comment: master thesis
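As a rough illustration of pattern matching on general data structures, the sketch below performs constant folding on a tuple-encoded expression tree. It is plain Python and does not use Amethyst's notation; the tree encoding and the `fold` function are assumptions made for this example.

```python
import operator

# Hypothetical encoding: ("add", left, right) for operator nodes,
# int for literals, str for variable names.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def fold(expr):
    """Fold constant subexpressions bottom-up."""
    if not isinstance(expr, tuple):
        return expr  # a literal or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    # The pattern: an operator node whose operands are both integer literals.
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

folded = fold(("add", ("mul", 2, 3), "x"))
```

Here the subtree `("mul", 2, 3)` matches the pattern and is replaced by `6`, while the outer node is kept because one operand is a variable.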
Contributions to the Construction of Extensible Semantic Editors
This dissertation addresses the need for easier construction and extension of language tools. Specifically, the construction and extension of so-called semantic editors is considered, that is, editors providing semantic services for code comprehension and manipulation. Editors like these are typically found in state-of-the-art development environments, where they have been developed by hand. The list of programming languages available today is extensive and, with the lively creation of new programming languages and the evolution of old languages, it keeps growing. Many of these languages would benefit from proper tool support. Unfortunately, the development of a semantic editor can be a time-consuming and error-prone endeavor, and too large an effort for most language communities. Given the complex nature of programming, and the huge benefits of good tool support, this lack of tools is problematic. In this dissertation, an attempt is made at narrowing the gap between generative solutions and how state-of-the-art editors are constructed today. A generative alternative for construction of textual semantic editors is explored with focus on how to specify extensible semantic editor services. Specifically, this dissertation shows how semantic services can be specified using a semantic formalism called reference attribute grammars (RAGs), and how these services can be made responsive enough for editing, and be provided also when the text in an editor is erroneous. Results presented in this dissertation have been found useful, both in industry and in academia, suggesting that the explored approach may help to reduce the effort of editor construction.
Syntax Error Handling in Scannerless Generalized LR Parsers
This thesis reports on a master's project carried out as part of the one-year
master's programme 'Software-engineering'. The project investigates methods for
improving the quality of reporting and handling of syntax errors that are
produced by a scannerless generalized left-to-right rightmost (SGLR) parser,
and was done at Centrum voor Wiskunde en Informatica (CWI) in Amsterdam.
SGLR is a parsing algorithm developed as part of the Generic Language
Technology Project at SEN1, one of the themes at CWI. SGLR is based on the GLR
algorithm developed by Tomita.
SGLR parsers are able to recognize arbitrary context-free grammars, which
enables grammar modularization. Because SGLR does not use a separate scanner,
layout and comments are also incorporated into the parse tree. This makes
SGLR a powerful tool for code analysis and code transformations. A drawback
is the way SGLR handles syntax errors.
When a syntax error is detected, the current implementation of SGLR halts the
parsing process and reports back to the user the point of error detection only.
The text at the point of error detection is not necessarily the text that has to
be changed to repair the error.
This thesis describes three kinds of information that could be reported to the
user, and how they could be derived from the parse process when an error is
detected. These are:
- The structure of the already parsed part of the input in the form of a partial
parse tree.
- A listing of expected symbols; those tokens or token sequences that are
acceptable instead of the erroneous text.
- The current parser state which could be translated into language dependent
informative messages.
Two ways of recovering from an error condition are also described. These are
non-correcting recovery methods that enable SGLR to always return a parse
tree that can be unparsed into the original input sentence.
- A method that halts parsing but incorporates the remainder of the input into
the parse tree.
- A method that resumes parsing by means of substring parsing.
During the course of the project the described approaches have been
implemented and incorporated in the implementation of SGLR as used by the
Meta-Environment, some fully, some more or less prototyped.
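The first recovery method, halting at the error but keeping the remainder of the input in the tree, can be sketched with a toy parser. This is not SGLR: the language (`digits ('+' digits)*`), the flat node list, and the names `parse_sum` and `unparse` are assumptions chosen to keep the example small. The error node also carries the expected symbol at the point of detection.

```python
def parse_sum(text):
    """Parse `digits ('+' digits)*`; on error, keep the unparsed rest as a leaf."""
    nodes, pos, expect_number = [], 0, True
    while pos < len(text):
        ch = text[pos]
        if expect_number and ch.isdigit():
            end = pos
            while end < len(text) and text[end].isdigit():
                end += 1
            nodes.append(("num", text[pos:end]))
            pos, expect_number = end, False
        elif not expect_number and ch == "+":
            nodes.append(("op", ch))
            pos, expect_number = pos + 1, True
        else:
            # Halt parsing, but store the remaining input and what was expected.
            expected = "digit" if expect_number else "'+'"
            nodes.append(("error", text[pos:], expected))
            break
    # Input ended where a number was still required (and no error was recorded).
    if expect_number and (not nodes or nodes[-1][0] != "error"):
        nodes.append(("error", "", "digit"))
    return nodes

def unparse(nodes):
    """Concatenate the text of all leaves, including any error leaf."""
    return "".join(node[1] for node in nodes)

tree = parse_sum("1+*2")
assert unparse(tree) == "1+*2"
```

Because the error leaf holds the untouched remainder, unparsing always reproduces the original input, which is the property the abstract asks of a non-correcting recovery.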
- …