Comparison of Context-free Grammars Based on Parsing Generated Test Data
A number of software engineering scenarios involve equivalence or correspondence assertions about some of the context-free grammars in those scenarios. For instance, when applying grammar transformations during parser development (be it for the sake of disambiguation or grammar-class compliance), one would like to preserve the generated language. Even though equivalence is generally undecidable for context-free grammars, we have developed an automated approach that is practically useful in revealing evidence of nonequivalence of grammars and discovering correspondence mappings for grammar nonterminals. The approach is based on systematic test data generation and parsing. We discuss two studies that show how the approach is used in comparing grammars of open source Java parsers as well as grammars from the course work for a compiler construction class.
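To make the idea of comparing grammars through generated test data concrete, here is a minimal sketch, not the authors' tool. It assumes grammars are given as plain dictionaries with no epsilon productions, and it substitutes bounded enumeration of both languages for a real parser. The names `sentences`, `compare`, `G1`, `G2`, and `G3` are hypothetical and chosen only for illustration.

```python
from collections import deque

def sentences(grammar, start, max_len):
    """Enumerate every sentence of at most max_len tokens.

    Assumption: the grammar has no epsilon productions, so a sentential
    form never shrinks and forms longer than max_len can be pruned.
    """
    seen = {(start,)}
    queue = deque([(start,)])
    result = set()
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal (any symbol that is a grammar key).
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            result.add(form)  # only terminals left: a sentence
            continue
        for alt in grammar[form[idx]]:
            new = form[:idx] + tuple(alt) + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return result

def compare(g1, g2, start, max_len):
    """Return (sentences only in g1, sentences only in g2) up to max_len."""
    s1 = sentences(g1, start, max_len)
    s2 = sentences(g2, start, max_len)
    return s1 - s2, s2 - s1

# Left-recursive grammar and a right-recursive rewrite of it.
G1 = {"E": [("E", "+", "n"), ("n",)]}
G2 = {"E": [("n", "R"), ("n",)], "R": [("+", "n", "R"), ("+", "n")]}
# A faulty rewrite that no longer accepts the single-token sentence "n".
G3 = {"E": [("n", "+", "E"), ("n", "+", "n")]}

if __name__ == "__main__":
    print(compare(G1, G2, "E", 5))
    print(compare(G1, G3, "E", 5))
```

An empty difference is only evidence of agreement up to the chosen length bound, not a proof of equivalence. A non-empty difference, such as the sentence `n` that `G3` fails to generate, is a concrete witness of nonequivalence.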
One Parser to Rule Them All
Despite the long history of research in parsing, constructing parsers for real programming languages remains a difficult and painful task. In recent decades, various parser generators have emerged that allow parsers to be constructed from a BNF-like specification. Even today, however, many parsers are handwritten, or only partly generated, and include various hacks to deal with peculiarities of programming languages. The main problem is that current declarative syntax definition techniques are based on pure context-free grammars, while many constructs found in programming languages require context information.
In this paper we propose a parsing framework that embraces context information at its core. Our framework is based on data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding, and constraints. We present an implementation of our framework on top of the Generalized LL (GLL) parsing algorithm, and show how common idioms in the syntax of programming languages, such as (1) lexical disambiguation filters, (2) operator precedence, (3) indentation-sensitive rules, and (4) conditional preprocessor directives, can be mapped to data-dependent grammars. We report our initial experience with the framework, gained by parsing more than 20,000 Java, C#, Haskell, and OCaml source files.
Descriptional Complexity of Three-Nonterminal Scattered Context Grammars: An Improvement
Recently, it has been shown that every recursively enumerable language can be generated by a scattered context grammar with no more than three nonterminals. However, in that construction, the maximal number of nonterminals simultaneously rewritten during a derivation step depends on many factors, such as the cardinality of the alphabet of the generated language and the structure of the generated language itself. This paper improves the result by showing that the maximal number of nonterminals simultaneously rewritten during any derivation step can be limited by a small constant regardless of other factors.
Tightening the Complexity of Equivalence Problems for Commutative Grammars
We show that the language equivalence problem for regular and context-free commutative grammars is coNEXP-complete. In addition, our lower bound immediately yields further coNEXP-completeness results for equivalence problems for communication-free Petri nets and reversal-bounded counter automata. Moreover, we improve both lower and upper bounds for language equivalence for exponent-sensitive commutative grammars.
Comment: 21 pages