136 research outputs found
LR(k) sparse-parsers and their optimisation
PhD Thesis
A method of syntactic analysis is developed which
is believed to surpass all known competitors in all major
respects.
The method is based upon that associated with the
LR(k) grammars but is faster because it bypasses all
reduction steps concerned with 'chain' productions. These
are freely selected productions which are considered
semantically irrelevant and whose right parts consist of
just a single symbol. The parses produced by the method
are 'sparse' in that they contain no references to chain
productions - they are termed 'chain-free' parses.
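The relationship between an ordinary parse and a chain-free parse can be sketched in a few lines. This is an illustration only: the expression grammar, the production numbering, and the helper names `chain_candidates` and `sparse_parse` are assumptions, not taken from the thesis. The sketch filters a finished parse, whereas the method described above avoids performing the chain reductions in the first place.

```python
from typing import List, NamedTuple, Set, Tuple


class Production(NamedTuple):
    left: str
    right: Tuple[str, ...]


# Hypothetical expression grammar, chosen only for illustration.
GRAMMAR = [
    Production("E", ("E", "+", "T")),  # 0
    Production("E", ("T",)),           # 1
    Production("T", ("T", "*", "F")),  # 2
    Production("T", ("F",)),           # 3
    Production("F", ("(", "E", ")")),  # 4
    Production("F", ("id",)),          # 5
]


def chain_candidates(grammar: List[Production]) -> Set[int]:
    """Indices of productions whose right part is a single symbol."""
    return {i for i, p in enumerate(grammar) if len(p.right) == 1}


def sparse_parse(parse: List[int], chain: Set[int]) -> List[int]:
    """Drop every reference to a selected chain production from a parse."""
    bad = chain - chain_candidates(GRAMMAR)
    if bad:
        raise ValueError(f"not single-symbol productions: {sorted(bad)}")
    return [i for i in parse if i not in chain]


# Reductions an LR parser performs for "id + id * id", in order.
FULL_PARSE = [5, 3, 1, 5, 3, 5, 2, 0]

# E -> T and T -> F are selected as semantically irrelevant;
# F -> id is kept even though its right part is also a single symbol.
CHAIN = {1, 3}

print(sparse_parse(FULL_PARSE, CHAIN))  # [5, 5, 5, 2, 0]
```

In this example three of the eight reductions disappear, which gives a rough sense of where the speed advantage comes from.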
The CFLR(k) grammars are introduced as the largest
class which can be Chain-Free parsed from Left to Right while looking k symbols ahead of the current point of the
parse. The properties of these grammars are examined in
detail and their relationship to the conventional LR(k)
grammars is explored. Techniques are presented for testing
grammars for the CFLR(k) property and for constructing
chain-free parsers for those grammars possessing the
property. Methods are also presented for converting
ordinary LR(k) parsers into chain-free parsers.
CFLR(k) parsers are more widely applicable than
their LR(k) counterparts, are faster and provide the same
excellent detection of syntactic errors. Unfortunately they
also tend to be rather larger. A simple optimization is
presented which completely overcomes this single disadvantage
without sacrificing any of the advantages of the
method.
These theoretical techniques are adapted to provide
truly practical chain-free parsers based on the conventional
SLR and LALR parsing methods. Detailed consideration
is given to use of 'default reductions' and related
techniques for achieving compact representations of these
parsers. The resulting chain-free parsers are not only
faster than their ordinary counterparts, but probably
smaller too. We believe their advantages are such that they
should substantially replace other parsing methods currently
used in programming language compilers.
Parallel parsing made practical
The property of local parsability allows inputs to be parsed by inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm lends itself to automatic generation by a parser generator tool, which was implemented and is also presented here. Furthermore, to complete the framework for parallel input analysis, a parallel scanner can also be combined with the parser. To prove the practicality of a parallel lexing and parsing approach, we report the results of adapting JSON and Lua to a form fit for parallel parsing (i.e. an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high-performance and embedded multicore platforms, obtained by analyzing real-world inputs as a test bench. The results show that our approach matches or exceeds the performance of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multi-core machines. The work concludes with a broad and critical survey of past work on parallel parsing and of future directions on integration with semantic analysis and incremental parsing.
A survey of compiler development aids
A theoretical background was established for the compilation process by dividing it into five phases and explaining the concepts and algorithms that underpin each. The five selected phases were lexical analysis, syntax analysis, semantic analysis, optimization, and code generation. Graph-theoretical optimization techniques were presented, and approaches to code generation were described for both one-pass and multipass compilation environments. Following the initial tutorial sections, more than 20 tools that were developed to aid in the process of writing compilers were surveyed. Eight of the more recent compiler development aids were selected for special attention: SIMCMP/STAGE2, LANG-PAK, COGENT, XPL, AED, CWIC, LIS, and JOCIT. The impact of compiler development aids was assessed, and some of their shortcomings and some of the areas of research currently in progress were inspected.
A Drop-in Replacement for LR(1) Table-Driven Parsing
This paper presents a construction method for a deterministic one-symbol look-ahead LR parser which allows non-terminals in the parser look-ahead. This effectively relaxes the requirement of parsing the reverse of the right-most derivation of a string/sentence. This is achieved by replacing the deterministic pushdown automaton of LR parsing with a two-stack automaton. The class of grammars accepted by the two-stack parser properly contains the LR(k) grammars. Since the modification to the table-driven LR parsing process is relatively minor and mostly impacts the creation of the goto and action tables, a parser modified to adopt the two-stack process should be comparable in size and performance to LR parsers.
File compression using probabilistic grammars and LR parsing
Data compression, the reduction in size of the physical representation
of data being stored or transmitted, has long been of interest both as a research topic and as a practical technique. Different methods are used
for encoding different classes of data files. The purpose of this research
is to compress a class of highly redundant data files whose contents are
partially described by a context-free grammar (i.e. text files containing
computer programs).
An encoding technique is developed for the removal of structural
dependency due to the context-free structure of such files. The technique
depends on a type of LR parsing method called LALR(k) (Lookahead LR(k)).
The encoder also pays particular attention to the encoding of editing
characters, comments, names and constants.
The encoded data maintains the exact information content of the
original data. Hence, a decoding technique (depending on the same
parsing method) is developed to recover the original information from
its compressed representation.
The technique is demonstrated by compressing Pascal programs. An
optimal coding scheme (based on Huffman codes) is used to encode the
parsing alternatives in each parsing state. The decoder uses these codes
during the decoding phase. Also Huffman codes, based on the probability
of the symbols concerned, are used when coding editing characters,
comments, names and constants. The sizes of the parsing tables (and
subsequently the encoding tables) were considerably reduced by splitting
them into a number of sub-tables.
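The per-state coding idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's encoder: the state numbers, the alternative names, and the frequency counts in `STATE_COUNTS` are invented, and `huffman_code` is a textbook Huffman construction. Each parsing state gets its own code table, so a state with only one possible alternative costs zero bits, because the decoder runs the same parser and already knows what must come next.

```python
import heapq
from itertools import count
from typing import Dict


def huffman_code(freqs: Dict[str, int]) -> Dict[str, str]:
    """Build a Huffman code (symbol -> bit string) from frequencies."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single alternative is fully predictable: no bits needed.
        return {next(iter(freqs)): ""}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical counts of how often each alternative was taken in each
# parsing state, e.g. gathered from a corpus of sample programs.
STATE_COUNTS = {
    0: {"assign": 5, "if": 2, "while": 1, "call": 1},
    1: {"then": 7},
}

# One code table per parsing state.
STATE_CODES = {state: huffman_code(alts) for state, alts in STATE_COUNTS.items()}

for state, table in STATE_CODES.items():
    print(state, table)
```

With these counts the most frequent alternative in state 0 receives a one-bit code and the two rarest receive three bits each, while state 1 contributes nothing to the encoded output.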
The minimum and the average code length of the average program are
derived from two different matrices. These matrices are constructed
from a probabilistic grammar, and the language generated by this grammar.
Finally, various comparisons are made with a related encoding method by
using a simple context-free language.
Contributions to the Construction of Extensible Semantic Editors
This dissertation addresses the need for easier construction and extension of language tools. Specifically, the construction and extension of so-called semantic editors is considered, that is, editors providing semantic services for code comprehension and manipulation. Editors like these are typically found in state-of-the-art development environments, where they have been developed by hand. The list of programming languages available today is extensive and, with the lively creation of new programming languages and the evolution of old languages, it keeps growing. Many of these languages would benefit from proper tool support. Unfortunately, the development of a semantic editor can be a time-consuming and error-prone endeavor, and too large an effort for most language communities. Given the complex nature of programming, and the huge benefits of good tool support, this lack of tools is problematic. In this dissertation, an attempt is made at narrowing the gap between generative solutions and how state-of-the-art editors are constructed today. A generative alternative for construction of textual semantic editors is explored with focus on how to specify extensible semantic editor services. Specifically, this dissertation shows how semantic services can be specified using a semantic formalism called reference attribute grammars (RAGs), and how these services can be made responsive enough for editing, and be provided also when the text in an editor is erroneous. Results presented in this dissertation have been found useful, both in industry and in academia, suggesting that the explored approach may help to reduce the effort of editor construction.
Automata theory and formal languages
These lecture notes present some basic notions and results on Automata Theory,
Formal Languages Theory, Computability Theory, and Parsing Theory. I prepared
these notes for a course on Automata, Languages, and Translators which I am
teaching at the University of Roma Tor Vergata. More material on these topics and
on parsing techniques for context-free languages can be found in standard textbooks
such as [1, 8, 9]. The reader is encouraged to look at those books.
A theorem denoted by the triple k.m.n is in Chapter k and Section m, and within
that section it is identified by the number n. An analogous numbering system is used
for algorithms, corollaries, definitions, examples, exercises, figures, and remarks. We
use ‘iff’ to mean ‘if and only if’.
Many thanks to my colleagues of the Department of Informatics, Systems, and
Production of the University of Roma Tor Vergata. I am also grateful to my
students and co-workers and, in particular, to Lorenzo Clemente, Corrado Di Pietro,
Fulvio Forni, Fabio Lecca, Maurizio Proietti, and Valerio Senni for their help and
encouragement.
Finally, I am grateful to Francesca Di Benedetto, Alessandro Colombo, Donato
Corvaglia, Gioacchino Onorati, and Leonardo Rinaldi of the Aracne Publishing
Company for their kind cooperation.
Compiler Front-end for the IEC 61131-3 v3 Languages
Lexical and syntactic analyses are complete. The Abstract Syntax Tree is nearly complete. The work still needs to be validated. The document still needs to be finished.
- …