    Purely functional GLL parsing

    Generalised parsing has become increasingly important in the context of software language design, and several compiler generators and language workbenches have adopted generalised parsing algorithms such as GLR and GLL. The original GLL parsing algorithms are described in low-level pseudo-code as the output of a parser generator. This paper explains GLL parsing differently, defining the FUN-GLL algorithm as a collection of pure, mathematical functions and focussing on the logic of the algorithm by omitting implementation details. In particular, the data structures are modelled by abstract sets and relations rather than specialised implementations. The description is further simplified by omitting lookahead and adopting the binary subtree representation of derivations to avoid the clerical overhead of graph construction. Conventional parser combinators inherit the drawbacks of the recursive descent algorithms they implement. Based on FUN-GLL, this paper defines generalised parser combinators that overcome these problems.
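    To make the set-based formulation concrete, here is a minimal Haskell sketch, not the paper's code, of the style the abstract describes: parsing work is a set of "descriptors", and the algorithm is a fixed point of a step function over that set. The names and fields (Descriptor, fixpoint, step) are illustrative assumptions.

```haskell
import qualified Data.Set as Set

-- A descriptor records a unit of parsing work: which nonterminal and
-- alternative to continue, and at which input extents. The exact
-- fields here are an assumption for illustration.
data Descriptor = Descriptor
  { nonterminal :: String
  , alternative :: Int
  , leftExtent  :: Int
  , pivot       :: Int
  } deriving (Eq, Ord, Show)

-- Parsing as a fixed point: repeatedly apply a grammar-specific step
-- function until no new descriptors are discovered.
fixpoint :: (Set.Set Descriptor -> Set.Set Descriptor)
         -> Set.Set Descriptor -> Set.Set Descriptor
fixpoint step done
  | Set.null new = done
  | otherwise    = fixpoint step (Set.union done new)
  where new = Set.difference (step done) done
```

    Modelling the worklist as an abstract set, rather than a specialised stack or graph, is what lets the algorithm be stated as pure functions in the first place.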

    GLL parsing with flexible combinators

    At SLE in 2014, Ridge presented the P3 combinator library with which parsers can be developed for left-recursive, non-deterministic and ambiguous grammars. A combinator expression in P3 yields a binarised grammar reflecting the expression's structure. The grammar is given to an underlying, generalised parsing procedure computing all derivations. In this paper we present a combinator library with a similar architecture to P3, adjusting it to avoid grammar binarisation. Avoiding binarisation has a significant positive effect on the running times of the underlying parsing procedure, which we demonstrate using real-world grammars. Binarisation is avoided by restricting the applicability of combinators, resulting in combinator expressions closely resembling BNF fragments. Usability is recovered by defining coercions that automatically convert expressions where necessary. As the underlying parsing procedure, we use a purely functional variant of generalised top-down (GLL) parsing.
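    A small Haskell sketch of the key idea, under assumed names (Symbol, Alt, Rule) rather than the library's actual combinators: each BNF alternative is kept as a flat sequence of symbols, so the grammar handed to the parsing procedure mirrors the rule as written instead of a binarised tree of pairs.

```haskell
-- One nonterminal with its alternatives, kept flat: the shape of the
-- value mirrors the BNF rule, so no binarisation is introduced.
data Symbol = Term Char | Nonterm String deriving Show
newtype Alt = Alt [Symbol]          -- one alternative: a flat sequence
data Rule   = Rule String [Alt]     -- nonterminal ::= alt1 | alt2 | ...

-- X ::= 'a' X | epsilon, written as one flat rule rather than as
-- nested binary sequencing combinators.
exampleRule :: Rule
exampleRule = Rule "X" [ Alt [Term 'a', Nonterm "X"], Alt [] ]
```

    The coercions the abstract mentions would, in a real library, build such flat shapes automatically from combinator expressions; here the rule is constructed by hand.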

    Happy-GLL: modular, reusable and complete top-down parsers for parameterized nonterminals

    Parser generators and parser combinator libraries are the most popular tools for producing parsers. Parser combinators use the host language to provide reusable components in the form of higher-order functions with parsers as parameters. Very few parser generators support this kind of reuse through abstraction, and even fewer generate parsers that are as modular and reusable as the parts of the grammar for which they are produced. This paper presents a strategy for generating modular, reusable and complete top-down parsers from syntax descriptions with parameterized nonterminals, based on the FUN-GLL variant of the GLL algorithm. The strategy is discussed and demonstrated as a novel back-end for the Happy parser generator. Happy grammars can contain 'parameterized nonterminals' in which parameters abstract over grammar symbols, granting an abstraction mechanism to define reusable grammar operators. However, the existing Happy back-ends do not deliver on the full potential of parameterized nonterminals, as these cannot be reused across grammars. Moreover, the parser generation process may fail to terminate or may result in exponentially large parsers generated in an exponential amount of time. The GLL back-end presented in this paper implements parameterized nonterminals successfully by generating higher-order functions that resemble parser combinators, inheriting all the advantages of top-down parsing. The back-end is capable of generating parsers for the full class of context-free grammars, generates parsers in linear time and generates parsers that find all derivations of the input string. To our knowledge, the presented GLL back-end makes Happy the first parser generator that combines all these features. This paper describes the translation procedure of the GLL back-end and compares it to the LALR and GLR back-ends of Happy in several experiments.
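    To illustrate what "higher-order functions that resemble parser combinators" could look like, here is a toy Haskell sketch assuming a naive list-of-successes parser type, not the FUN-GLL machinery the back-end actually generates. A parameterized nonterminal list(p), matching zero or more occurrences of p, becomes a function taking the parser for p as an argument, so it is reusable for any symbol.

```haskell
-- A toy list-of-successes parser type, for exposition only.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

-- Parser for a single expected character.
char :: Char -> Parser Char
char c = Parser $ \s -> case s of
  (x:xs) | x == c -> [(c, xs)]
  _               -> []

-- The "generated" higher-order function for a parameterized
-- nonterminal list(p): zero or more occurrences of p.
list :: Parser a -> Parser [a]
list p = Parser $ \s ->
  [ (x : xs, s'') | (x, s')   <- runParser p s
                  , (xs, s'') <- runParser (list p) s' ]
  ++ [([], s)]
```

    For example, runParser (list (char 'a')) "aab" yields ("aa", "b") among its results, and list can be reused with any parser argument; that kind of reuse across grammars is exactly what the abstract says the existing back-ends lack.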

    Island Grammar-based Parsing using GLL and Tom

    Extending a language by embedding another language within it presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high-level languages. Tom provides syntax and semantics for advanced pattern matching and tree rewriting facilities. Embedded Tom constructs are translated into the host language by a preprocessor, the output of which is a composite program written purely in the host language. Tom implementations exist for Java, C, C#, Python and Caml. The current parser is complex and difficult to maintain. We describe how Tom can be parsed using island grammars implemented with the Generalised LL (GLL) parsing algorithm. The grammar is, as might be expected, ambiguous. Extracting the correct derivation relies on our disambiguation strategy, which is based on pattern matching within the parse forest. We describe different classes of ambiguity and propose patterns for resolving them.
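    As a rough illustration of the island-grammar idea, not the paper's grammar, the Haskell sketch below separates "islands" of interest from the surrounding "water" that is kept unparsed. The %match marker is real Tom syntax, but the flat scan is a simplifying assumption: real Tom constructs nest recursively, which is precisely why the paper uses a generalised parser rather than a scan like this.

```haskell
import Data.List (isPrefixOf)

-- An island grammar splits the input into precisely parsed islands
-- and uninterpreted water (host-language text).
data Chunk = Water String | Island String deriving Show

-- Naive split at the first occurrence of a marker string.
breakOn :: String -> String -> (String, String)
breakOn _ [] = ("", "")
breakOn m s@(c:cs)
  | m `isPrefixOf` s = ("", s)
  | otherwise        = let (a, b) = breakOn m cs in (c : a, b)

-- Scan for islands introduced by "%match" and closed by the next '}'.
-- This deliberately ignores nesting; handling recursive embedding is
-- what the GLL-based approach is for.
chunks :: String -> [Chunk]
chunks [] = []
chunks s = case breakOn "%match" s of
  (w, "")   -> [Water w]
  (w, rest) -> let (body, rest') = breakOn "}" (drop 6 rest)
               in Water w : Island body : chunks (drop 1 rest')
```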

    Resource Allocation Planning Helper (RALPH): Lessons learned

    The current task of the Resource Allocation Process includes the planning and apportionment of JPL's Ground Data System, composed of the Deep Space Network and Mission Control and Computing Center facilities. The addition of the data-driven, rule-based planning system RALPH has expanded the planning horizon from 8 weeks to 10 years and has resulted in large labor savings. Use of the system has also resulted in important improvements in science return through enhanced resource utilization. In addition, RALPH has been instrumental in supporting rapid turnaround for an increased volume of special 'what-if' studies. The status of RALPH is briefly reviewed, with a focus on the important lessons learned from the creation of a highly functional design team, through an evolutionary design and implementation period in which an AI shell was selected, prototyped, and ultimately abandoned, and through the fundamental changes to the very process that spawned the tool kit. Principal topics include proper integration of software tools within the planning environment, transition from prototype to delivered software, changes in the planning methodology as a result of evolving software capabilities, and creation of the ability to develop and process generic requirements to allow planning flexibility.

    Towards Zero-Overhead Disambiguation of Deep Priority Conflicts

    **Context** Context-free grammars are widely used for language prototyping and implementation. They allow formalizing the syntax of domain-specific or general-purpose programming languages concisely and declaratively. However, the natural and concise way of writing a context-free grammar is often ambiguous. Therefore, grammar formalisms support extensions in the form of *declarative disambiguation rules* to specify operator precedence and associativity, solving ambiguities that are caused by the subset of the grammar that corresponds to expressions. **Inquiry** Implementing support for declarative disambiguation within a parser typically comes with one or more of the following limitations in practice: a lack of parsing performance, or a lack of modularity (i.e., disallowing the composition of grammar fragments of potentially different languages). The latter concern is generally addressed by scannerless generalized parsers. We aim to equip scannerless generalized parsers with novel disambiguation methods that are inherently performant, without compromising the concerns of modularity and language composition. **Approach** In this paper, we present a novel low-overhead implementation technique for disambiguating deep associativity and priority conflicts in scannerless generalized parsers with lightweight data-dependency. **Knowledge** Ambiguities with respect to operator precedence and associativity arise from combining the various operators of a language. While *shallow conflicts* can be resolved efficiently by one-level tree patterns, *deep conflicts* require more elaborate techniques, because they can occur arbitrarily nested in a tree. Current state-of-the-art approaches to solving deep priority conflicts come with a severe performance overhead. **Grounding** We evaluated our new approach against state-of-the-art declarative disambiguation mechanisms. By parsing a corpus of popular open-source repositories written in Java and OCaml, we found that our approach yields speedups of up to 1.73x over a grammar rewriting technique when parsing programs with deep priority conflicts, with a modest overhead of 1-2% when parsing programs without deep conflicts. **Importance** A recent empirical study shows that deep priority conflicts are indeed widespread in real-world programs. The study shows that in a corpus of popular OCaml projects on GitHub, up to 17% of the source files contain deep priority conflicts. However, there is no solution in the literature that addresses efficient disambiguation of deep priority conflicts with support for modular and composable syntax definitions.
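    To make the shallow/deep distinction concrete, here is a small Haskell sketch, with an assumed expression type and priority table, of a shallow disambiguation filter: it checks a one-level tree pattern, rejecting a node whose direct child is built from a lower-priority operator. Deep conflicts evade such a filter because the offending subtree can sit arbitrarily far below the node, which is the case the paper's technique targets.

```haskell
-- A toy expression language with '*' binding tighter than '+'.
data Expr = Num Int | Add Expr Expr | Mul Expr Expr deriving Show

-- Illustrative priority table: higher number = binds tighter.
prio :: Expr -> Int
prio (Add _ _) = 1
prio (Mul _ _) = 2
prio (Num _)   = 3

-- Shallow filter: a node may not have a direct child of strictly
-- lower priority, so Mul (Add a b) c, i.e. the wrong grouping of
-- "a + b * c", is rejected by looking only one level deep.
shallowOk :: Expr -> Bool
shallowOk e = case e of
  Add l r -> prio l >= prio e && prio r >= prio e
  Mul l r -> prio l >= prio e && prio r >= prio e
  Num _   -> True
```

    A complete disambiguator would apply this check at every node of each candidate tree; the point of the sketch is that no such node-local pattern can catch a conflict nested arbitrarily deep.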

    Reusable Components of Semantic Specifications

    Semantic specifications of programming languages typically have poor modularity. This hinders reuse of parts of the semantics of one language when specifying a different language – even when the two languages have many constructs in common – and evolution of a language may require major reformulation of its semantics. Such drawbacks have discouraged language developers from using formal semantics to document their designs. In the PLanCompS project, we have developed a component-based approach to semantics. Here, we explain its modularity aspects and present an illustrative case study: a component-based semantics for Caml Light. We have tested the correctness of the semantics by running programs on an interpreter generated from the semantics, comparing the output with that produced by the standard implementation of the language. Our approach provides good modularity, facilitates reuse, and should support co-evolution of languages and their formal semantics. It could be particularly useful in connection with domain-specific languages and language-driven software development.
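    A minimal Haskell sketch of the component-based idea, with invented funcon names rather than the PLanCompS framework's actual ones: language constructs are translated into small, language-independent "funcons", each with a fixed interpretation that can be reused unchanged when specifying another language.

```haskell
-- Hypothetical funcons; each is specified once and reused across
-- language specifications.
data Funcon
  = IntLit Int
  | BoolLit Bool
  | Plus Funcon Funcon                -- reusable arithmetic component
  | IfTrueElse Funcon Funcon Funcon   -- reusable conditional component

data Val = IntV Int | BoolV Bool deriving Show

-- The interpretation of each funcon is fixed and language-independent.
eval :: Funcon -> Maybe Val
eval (IntLit n)  = Just (IntV n)
eval (BoolLit b) = Just (BoolV b)
eval (Plus a b)  = do { IntV x <- eval a; IntV y <- eval b; Just (IntV (x + y)) }
eval (IfTrueElse c t e) = do
  BoolV b <- eval c
  eval (if b then t else e)
```

    Under this reading, a Caml Light if-expression and, say, a C conditional would both translate to IfTrueElse, so the component is specified and validated once; that is the kind of reuse the abstract describes.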