The ModelCC Model-Driven Parser Generator
Syntax-directed translation tools require the specification of a language by
means of a formal grammar. This grammar must conform to the specific
requirements of the parser generator to be used. This grammar is then annotated
with semantic actions for the resulting system to perform its desired function.
In this paper, we introduce ModelCC, a model-based parser generator that
decouples language specification from language processing, avoiding some of the
problems caused by grammar-driven parser generators. ModelCC receives a
conceptual model as input, along with constraints that annotate it. It is then
able to create a parser for the desired textual syntax and the generated parser
fully automates the instantiation of the language conceptual model. ModelCC
also includes a reference resolution mechanism so that ModelCC is able to
instantiate abstract syntax graphs, rather than mere abstract syntax trees.
Comment: In Proceedings PROLE 2014, arXiv:1501.0169
Generalizing input-driven languages: theoretical and practical benefits
Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks
to their simplicity they enjoy various nice algebraic and logic properties that
have been successfully exploited in many application fields. Practically all of
their related problems are decidable, so that they support automatic
verification algorithms. Also, they can be recognized in real-time.
Context-free languages (CFL) are another major family well-suited to
formalize programming, natural, and many other classes of languages; their
increased generative power w.r.t. RL, however, causes the loss of several
closure properties and of the decidability of important problems; furthermore
they need complex parsing algorithms. Thus, various subclasses thereof have
been defined with different goals, spanning from efficient, deterministic
parsing to closure properties, logic characterization and automatic
verification techniques.
Among CFL subclasses, so-called structured ones, i.e., those where the
typical tree-structure is visible in the sentences, exhibit many of the
algebraic and logic properties of RL, whereas deterministic CFL have been
thoroughly exploited in compiler construction and other application fields.
After surveying and comparing the main properties of those various language
families, we go back to operator precedence languages (OPL), an old family
through which R. Floyd pioneered deterministic parsing, and we show that they
offer unexpected properties in two fields so far investigated in totally
independent ways: they enable parsing parallelization in a more effective way
than traditional sequential parsers, and exhibit the same algebraic and logic
properties so far obtained only for less expressive language families.
SLR inference: An inference system for fixed-mode logic programs, based on SLR parsing
Definite-clause grammars (DCGs) generalize context-free grammars in such a way that Prolog can be used as a parser in the presence of context-sensitive information. Prolog's proof procedure, however, is based on backtracking, which may be a source of inefficiency. Parsers for context-free grammars that use backtracking, for instance, were soon replaced by more efficient methods, such as LR parsers. This suggests incorporating the principles underlying LR parsing into a parser for grammars with context-sensitive information. We present a technique that applies a transformation to the program/grammar by adding leaves to the proof/parse trees and placing the contextual information in such leaves. An inference system is then easily obtained from an LR parser, since only the parts dealing with terminals (which appear at the leaves) must be modified. Although our method is restricted to programs with fixed modes, it may be preferable to DCGs under Prolog for some programs.
Compiler Design: Theory, Tools, and Examples
Compiler design is a subject which many believe to be fundamental and vital to computer science. It is a subject which has been studied intensively since the early 1950’s and continues to be an important research field today. Compiler design is an important part of the undergraduate curriculum for many reasons: (1) It provides students with a better understanding of and appreciation for programming languages. (2) The techniques used in compilers can be used in other applications with command languages. (3) It provides motivation for the study of theoretic topics. (4) It is a good vehicle for an extended programming project.
There are several compiler design textbooks available today, but most have been written for graduate students. Here at Rowan University, our students have had difficulty reading these books. However, I felt it was not the subject matter that was the problem, but the way it was presented. I was sure that if concepts were presented at a slower pace, with sample problems and diagrams to illustrate the concepts, our students would be able to master them. This is what I have attempted to do in writing this book.
PSLR(1): Pseudo-Scannerless Minimal LR(1) for the Deterministic Parsing of Composite Languages
Composite languages are composed of multiple sub-languages. Examples include the parser specification languages read by parser generators like Yacc, modern extensible languages with complex layers of domain-specific sub-languages, and even traditional programming languages like C and C++. In this dissertation, we describe PSLR(1), a new scanner-based LR(1) parser generation system that automatically eliminates scanner conflicts typically caused by language composition. The fundamental premise of PSLR(1) is the pseudo-scanner, a scanner that only recognizes tokens accepted by the current parser state. However, use of the pseudo-scanner raises several unique challenges, for which we describe a novel set of solutions. One major challenge is that practical LR(1) parser table generation algorithms merge parser states, sometimes inducing incorrect pseudo-scanner behavior including new conflicts. Our solution is a new extension of IELR(1), an algorithm we have previously described for generating minimal LR(1) parser tables. Other contributions of our work include a robust system for handling the remaining scanner conflicts, a correction for syntax error handling mechanisms that are also corrupted by parser state merging, and a mechanism to enable scoping of syntactic declarations in order to further improve the modularity of sub-language specifications. While the premise of the pseudo-scanner has been described by other researchers independently, we expect our improvements to distinguish PSLR(1) as a significantly more robust scanner-based parser generation system for traditional and modern composite languages.
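The pseudo-scanner premise in the abstract above (recognize only the tokens the current parser state accepts) can be illustrated with a small sketch. This is not the PSLR(1) implementation: the token names, the regular expressions, and the `pseudo_scan` function are hypothetical, and the set of acceptable tokens is passed in by hand where a real system would derive it from the LR(1) parser tables. The example uses the familiar C++ case where `>>` is either a shift operator or two closing angle brackets.

```python
import re

# Hypothetical token definitions for illustration only.
TOKEN_PATTERNS = {
    "SHIFT_RIGHT": re.compile(r">>"),
    "GT": re.compile(r">"),
    "IDENT": re.compile(r"[A-Za-z_]\w*"),
}


def pseudo_scan(text, pos, acceptable):
    """Return (token_name, lexeme) for the longest match at `pos`,
    considering only the token names in `acceptable`.

    `acceptable` stands in for the set of tokens the current parser
    state can accept. Returns None if no acceptable token matches.
    """
    best = None
    # sorted() makes the result deterministic; equal-length matches
    # between different tokens are not resolved by this sketch beyond
    # keeping the first one found.
    for name in sorted(acceptable):
        match = TOKEN_PATTERNS[name].match(text, pos)
        if match and (best is None or len(match.group()) > len(best[1])):
            best = (name, match.group())
    return best


# In an expression context both tokens are acceptable, so the longest
# match wins and ">>" is read as a shift operator.
in_expression = pseudo_scan("a>>b", 1, {"SHIFT_RIGHT", "GT"})

# In a context that only accepts a closing angle bracket, the same
# input yields a single ">" and no scanner conflict arises.
in_template = pseudo_scan("a>>b", 1, {"GT"})
```

Restricting the candidate set per call is what removes the conflict here: the two sub-languages never compete for the same input because only one of them is acceptable at that point.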