314 research outputs found

Grammars for Indentation-Sensitive Parsing

Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)
Publication date: 01/01/2017

Adams' extension of parsing expression grammars enables specifying indentation sensitivity using two non-standard grammar constructs: indentation by a binary relation, and alignment. This paper is a theoretical study of Adams' grammars. It proposes a step-by-step transformation of well-formed Adams' grammars for elimination of the alignment construct from the grammar. The idea that alignment could be avoided was suggested by Adams, but no process for achieving this aim has been described before. This paper also establishes general conditions that binary relations used in indentation constructs must satisfy in order to enable efficient parsing.

Dagstuhl Research Online Publication Server

Top-Down Parsing with Parsing Contexts

Author: Kurs Jan
Lungu Mircea
Nierstrasz Oscar
Publication date: 01/01/2014

Proceedings - University of Groningen

Parsing for agile modeling

Author: Kurš Jan
Publication venue: Universität Bern
Publication date: 01/01/2016

Agile modeling refers to a set of methods that allow for a quick initial development of an importer and its further refinement. These requirements are not met simultaneously by the current parsing technology. Problems with parsing became a bottleneck in our research of agile modeling. In this thesis we introduce a novel approach to specify and build parsers. Our approach allows for expressive, tolerant and composable parsers without sacrificing performance. The approach is based on a context-sensitive extension of parsing expression grammars that allows a grammar engineer to specify complex language restrictions. To ensure high parsing performance, we automatically analyze a grammar definition and choose different parsing strategies for different parts of the grammar. We show that context-sensitive parsing expression grammars allow for highly composable, tolerant and variable-grained parsers that can be easily refined. Choosing different parsing strategies ensures high parser performance without sacrificing the expressiveness of the underlying grammars.

BORIS Theses

One Parser to Rule Them All

Author: Afroozeh A.
Clarke K.
DeRemer F. L.
Erdweg S.
Johnson M.
Johnstone A.
M. G.
Tomita M.
Watt D. A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

Despite the long history of research in parsing, constructing parsers for real programming languages remains a difficult and painful task. In the last decades, different parser generators emerged to allow the construction of parsers from a BNF-like specification. However, even today, many parsers are handwritten, or are only partly generated, and include various hacks to deal with different peculiarities in programming languages. The main problem is that current declarative syntax definition techniques are based on pure context-free grammars, while many constructs found in programming languages require context information. In this paper we propose a parsing framework that embraces context information in its core. Our framework is based on data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding and constraints. We present an implementation of our framework on top of the Generalized LL (GLL) parsing algorithm, and show how common idioms in the syntax of programming languages such as (1) lexical disambiguation filters, (2) operator precedence, (3) indentation-sensitive rules, and (4) conditional preprocessor directives can be mapped to data-dependent grammars. We demonstrate our initial experience with the framework by parsing more than 20000 Java, C#, Haskell, and OCaml source files.
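
As a rough illustration of what "variable binding and constraints" can mean in a data-dependent rule, the sketch below parses a length-prefixed field of the form `<digits>:<payload>`, where the parsed number binds a variable `n` and the payload is constrained to exactly `n` characters. This is a hand-written toy, not the paper's GLL-based framework; the input format and the function name `parse_length_prefixed` are assumptions made for this example.

```python
def parse_length_prefixed(text: str, pos: int = 0) -> tuple[str, int]:
    """Parse '<digits>:<payload>' starting at pos.

    Returns (payload, next_position). The digits bind n, and the payload
    must be exactly n characters long: a constraint that depends on
    previously parsed data, which a pure context-free rule cannot state.
    """
    end = pos
    # Consume the decimal length prefix.
    while end < len(text) and text[end] in "0123456789":
        end += 1
    if end == pos:
        raise ValueError(f"expected a length prefix at position {pos}")
    n = int(text[pos:end])  # variable binding: n comes from the input

    # Expect the separator between the length and the payload.
    if end >= len(text) or text[end] != ":":
        raise ValueError(f"expected ':' at position {end}")

    start = end + 1
    # Constraint: there must be at least n characters of payload left.
    if start + n > len(text):
        raise ValueError(f"payload shorter than declared length {n}")
    return text[start:start + n], start + n
```

For instance, `parse_length_prefixed("5:hello world")` returns the payload `"hello"` together with position 7, leaving the rest of the input for whatever rule follows.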

Crossref

CWI's Institutional Repository

INRIA a CCSD electronic archive server

Bounded seas

Author: Chomsky
Dean
Frost
Grune
Koppler
Kurš
Landin
Nilsson-Nyman
Scott
Tomita
van den Brand
Publication venue: Elsevier BV
Publication date: 01/01/2015

Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a language engineer has to create water tailored to each individual island. Such an approach is fragile, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by an engineer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing: bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. Our work focuses on applications of island parsing to data extraction from source code. We have integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
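
The naive definition of water mentioned in the abstract ("the negation of islands") can be sketched in a few lines. The example below is only that naive baseline, not the bounded-seas technique itself: the island is a hypothetical pattern for parameterless method headers such as `int size()`, and any character at which no island starts is skipped as water. The names `ISLAND` and `extract_islands` are assumptions made for this illustration.

```python
import re

# Hypothetical island: a parameterless method header such as "int size()".
ISLAND = re.compile(r"\b(?P<type>\w+)\s+(?P<name>\w+)\s*\(\s*\)")


def extract_islands(source: str) -> list[str]:
    """Return the method names found in source.

    Water is defined naively as "whatever is not an island": when no
    island starts at the current position, one character is skipped.
    """
    names = []
    pos = 0
    while pos < len(source):
        match = ISLAND.match(source, pos)
        if match:
            names.append(match.group("name"))
            pos = match.end()  # continue after the island
        else:
            pos += 1  # water: skip one character
    return names
```

Because the water here is tied to this one island pattern, adding a second kind of island means revisiting what counts as water, which is the composition problem the abstract describes.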

Proceedings - University of Groningen

Crossref

University of Groningen

ARTS repository - University of Groningen

Bern Open Repository and Information System (BORIS)

Dissertations of the University of Groningen

Morpheus: Automated Safety Verification of Data-Dependent Parser Combinator Programs

Author: Jagannathan Suresh
Mishra Ashish
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th European Conference on Object-Oriented Programming (ECOOP 2023)
Publication date: 01/01/2023

Parser combinators are a well-known mechanism used for the compositional construction of parsers, and have been shown to be particularly useful in writing parsers for rich grammars with data-dependencies and global state. Verifying applications written using them, however, has proven to be challenging, in large part because of the inherently effectful nature of the parsers being composed and the difficulty in reasoning about the arbitrarily rich data-dependent semantic actions that can be associated with parsing actions. In this paper, we address these challenges by defining a parser combinator framework called Morpheus equipped with abstractions for defining composable effects tailored for parsing and semantic actions, and a rich specification language used to define safety properties over the constituent parsers comprising a program. Even though its abstractions yield many of the same expressivity benefits as other parser combinator systems, Morpheus is carefully engineered to yield a substantially more tractable automated verification pathway. We demonstrate its utility in verifying a number of realistic, challenging parsing applications, including several cases that involve non-trivial data-dependent relations.

Dagstuhl Research Online Publication Server

Practical general top-down parsers

Author: Afroozeh A.
Izmaylova A.
Publication date: 01/01/2019

International Migration, Integration and Social Cohesion online publications