29 research outputs found

Context in Parsing: Techniques and Applications

Author: Van Wyk Eric
Publication venue: OASIcs - OpenAccess Series in Informatics. Eelco Visser Commemorative Symposium (EVCS 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server

One Parser to Rule Them All

Author: Afroozeh A.
Clarke K.
DeRemer F. L.
Erdweg S.
Johnson M.
Johnstone A.
M. G.
Tomita M.
Watt D. A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Despite the long history of research in parsing, constructing parsers for real programming languages remains a difficult and painful task. In the last decades, different parser generators emerged to allow the construction of parsers from a BNF-like specification. However, still today, many parsers are handwritten, or are only partly generated, and include various hacks to deal with different peculiarities in programming languages. The main problem is that current declarative syntax definition techniques are based on pure context-free grammars, while many constructs found in programming languages require context information. In this paper we propose a parsing framework that embraces context information in its core. Our framework is based on data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding and constraints. We present an implementation of our framework on top of the Generalized LL (GLL) parsing algorithm, and show how common idioms in syntax of programming languages such as (1) lexical disambiguation filters, (2) operator precedence, (3) indentation-sensitive rules, and (4) conditional preprocessor directives can be mapped to data-dependent grammars. We demonstrate the initial experience with our framework, by parsing more than 20000 Java, C#, Haskell, and OCaml source files.

Crossref

CWI's Institutional Repository

INRIA a CCSD electronic archive server

Practical general top-down parsers

Author: Afroozeh A.
Izmaylova A.
Publication date: 01/01/2019

International Migration, Integration and Social Cohesion online publications

Operator precedence for data-dependent grammars

Author: Afroozeh A.
Clarke K.
Danielsson N.
DeRemer F. L.
Jim T.
Klint P.
Leroy X.
McPeak S.
Tomita M.
Visser E.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/01/2016

Constructing parsers based on declarative specification of operator precedence is a very old research topic, and there are various existing approaches. However, these approaches are either tied to a particular parsing technique, or cannot deal with all corner cases found in programming languages. In this paper we present an implementation of declarative specification of operator precedence for general parsing that (1) is independent of the underlying parsing algorithm, (2) does not require any grammar transformation that increases the size of the grammar, (3) preserves the shape of parse trees of the original, natural grammar, and (4) can deal with intricate cases of operator precedence found in functional programming languages such as OCaml. Our new approach to operator precedence is formulated using data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding and constraints. We implemented our approach using Iguana, a data-dependent parsing framework, and evaluated it by parsing Java and OCaml source files. The results show that our approach is practical for parsing programming languages with complicated operator precedence rules.

Crossref

CWI's Institutional Repository

Providing Mainstream Parser Generators with Modular Language Definition Support

Author: Karol Sven
Zschaler Steffen
Publication venue: Technische Universität Dresden
Publication date: 17/01/2012

The composition and reuse of existing textual languages is a frequently re-occurring problem. One possibility of composing textual languages lies on the level of parser specifications which are mainly based on context-free grammars and regular expressions. Unfortunately most mainstream parser generators provide proprietary specification languages and usually do not provide strong abstractions for reuse. New forms of parser generators do support modular language development, but they can often not be easily integrated with existing legacy applications. To support modular language development based on mainstream parser generators, in this paper we apply the Invasive Software Composition (ISC) paradigm to parser specification languages by using our Reuseware framework. Our approach is grounded on a platform independent metamodel and thus does not rely on a specific parser generator.

Technische Universität Dresden: Qucosa

Ambiguity Detection: Scaling to Scannerless

Author: Basten H.J.S. (Bas)
Klint P. (Paul)
Vinju J.J. (Jurgen)
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2011

Static ambiguity detection would be an important aspect of language workbenches for textual software languages. However, the challenge is that automatic ambiguity detection in context-free grammars is undecidable in general. Sophisticated approximations and optimizations do exist, but these do not scale to grammars for so-called "scannerless parsers", as of yet. We extend previous work on ambiguity detection for context-free grammars to cover disambiguation techniques that are typical for scannerless parsing, such as longest match and reserved keywords. This paper contributes a new algorithm for ambiguity detection in character-level grammars, a prototype implementation of this algorithm and validation on several real grammars. The total run-time of ambiguity detection for character-level grammars for languages such as C and Java is significantly reduced, without loss of precision. The result is that efficient ambiguity detection in realistic grammars is possible and may therefore become a tool in language workbenches.

CWI's Institutional Repository

INRIA a CCSD electronic archive server

A unifying perspective on protocol mediation: interoperability in the Future Internet

Author: A Bennaceur
AC Schwerdfeger
B Spitznagel
D Garlan
D Lorenzoli
DM Yellin
G Wiederhold
H Basten
I Krka
L Cavallaro
N D’Ippolito
P Inverardi
R Mateescu
R Vaculín
SA McIlraith
V Issarny
Y D. Bromberg
YD Bromberg
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Given the highly dynamic and extremely heterogeneous software systems composing the Future Internet, automatically achieving interoperability between software components, without modifying them, is more than simply desirable; it is quickly becoming a necessity. Although much work has been carried out on interoperability, existing solutions have not fully succeeded in keeping pace with the increasing complexity and heterogeneity of modern software, and meeting the demands of runtime support. On the one hand, solutions at the application layer target higher automation and loose coupling through the synthesis of intermediary entities, mediators, to compensate for the differences between the interfaces of components and coordinate their behaviours, while assuming the use of the same middleware solution. On the other hand, solutions to interoperability across heterogeneous middleware technologies do not reconcile the differences between components at the application layer. In this paper we propose a unified approach for achieving interoperability between heterogeneous software components with compatible functionalities across the application and middleware layers. First, we provide a solution to automatically generate cross-layer parsers and composers that abstract network messages into a uniform representation independent of the middleware used. Second, these generated parsers and composers are integrated within a mediation framework to support the deployment of the mediators synthesised at the application layer. More specifically, the generated parser analyses the network messages received from one component and transforms them into a representation that can be understood by the application-level mediator. Then, the application-level mediator performs the necessary data conversion and behavioural coordination. Finally, the composer transforms the representation produced by the application-level mediator into network messages that can be sent to the other component. The resulting unified mediation framework reconciles the differences between software components from the application down to the middleware layers. We validate our approach through a case study in the area of conference management.

Crossref

INRIA a CCSD electronic archive server

Open Research Online (The Open University)