393 research outputs found

    Context in Parsing: Techniques and Applications

    Get PDF

    Practical Dynamic Grammars for Dynamic Languages

    No full text
    International audienceGrammars for programming languages are traditionally specified statically. They are hard to compose and reuse due to ambiguities that inevitably arise. PetitParser combines ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers to model grammars and parsers as objects that can be reconfigured dynamically. Through examples and benchmarks we demonstrate that dynamic grammars are not only flexible but highly practical

    Security Applications of Formal Language Theory

    Get PDF
    We present an approach to improving the security of complex, composed systems based on formal language theory, and show how this approach leads to advances in input validation, security modeling, attack surface reduction, and ultimately, software design and programming methodology. We cite examples based on real-world security flaws in common protocols representing different classes of protocol complexity. We also introduce a formalization of an exploit development technique, the parse tree differential attack, made possible by our conception of the role of formal grammars in security. These insights make possible future advances in software auditing techniques applicable to static and dynamic binary analysis, fuzzing, and general reverse-engineering and exploit development. Our work provides a foundation for verifying critical implementation components with considerably less burden to developers than is offered by the current state of the art. It additionally offers a rich basis for further exploration in the areas of offensive analysis and, conversely, automated defense tools and techniques. This report is divided into two parts. In Part I we address the formalisms and their applications; in Part II we discuss the general implications and recommendations for protocol and software design that follow from our formal analysis

    Object Grammars: Compositional & Bidirectional Mapping Between Text and Graphs

    Get PDF
    Abstract: Object Grammars define mappings between text and object graphs. Parsing recognizes syntactic features and creates the corresponding object structure. In the reverse direction, formatting recognizes object graph features and generates an appropriate textual presentation. The key to Object Grammars is the expressive power of the mapping, which decouples the syntactic structure from the graph structure. To handle graphs, Object Grammars support declarative annotations for resolving textual names that refer to arbitrary objects in the graph structure. Predicates on the semantic structure provide additional control over the mapping. Furthermore, Object Grammars are compositional so that languages may be defined in a modular fashion. We have implemented our approach to Object Grammars as one of the foundations of the Ensō system and illustrate the utility of our approach by showing how it enables definition and composition of domain-specific languages (DSLs)

    LL(1) Parsing with Derivatives and Zippers

    Full text link
    In this paper, we present an efficient, functional, and formally verified parsing algorithm for LL(1) context-free expressions based on the concept of derivatives of formal languages. Parsing with derivatives is an elegant parsing technique, which, in the general case, suffers from cubic worst-case time complexity and slow performance in practice. We specialise the parsing with derivatives algorithm to LL(1) context-free expressions, where alternatives can be chosen given a single token of lookahead. We formalise the notion of LL(1) expressions and show how to efficiently check the LL(1) property. Next, we present a novel linear-time parsing with derivatives algorithm for LL(1) expressions operating on a zipper-inspired data structure. We prove the algorithm correct in Coq and present an implementation as a parser combinators framework in Scala, with enumeration and pretty printing capabilities.Comment: Appeared at PLDI'20 under the title "Zippy LL(1) Parsing with Derivatives

    Modular grammar specification

    Get PDF
    We establish a semantics for building grammars from a modularised specification in which modules are able to delete productions from imported nonterminals. Modules have import lists of nonterminals; some or all of an imported nonterminal's productions may be suppressed at import time. There are two basic import mechanisms which (a) reference or (b) clone an imported nonterminal's productions. One of our goals is to allow a precise answer to the question: ‘what character level language does this grammar generate’ in the face of difficult issues such as the mutual embedding of languages that have different whitespace and commenting conventions. Our technique is to automatically generate a character level grammar from grammars written at token level in the conventional way; the grammar is constructed from modules each of which may have its own whitespace convention. Keywords: Context free grammar; Modularity; Whitespace processin

    XPL: a language for modular homogeneous language embedding

    Get PDF
    Languages that are used for Software Language Engineering (SLE) offer a range of features that support the construction and deployment of new languages. SLE languages offer features for constructing and processing syntax and defining the semantics of language features. New languages may be embedded within an existing language (internal) or may be stand-alone (external). Modularity is a desirable SLE property for which there is no generally agreed approach. This article analyses the current tools for SLE and identifies the key features that are common. It then proposes a language called XPL that supports these features. XPL is higher-order and allows languages to be constructed and manipulated as first-class elements and therefore can be used to represent a range of approaches to modular language definition. This is validated by using XPL to define the notion of a language module that supports modular language construction and language transformation

    XPL:A language for modular homogeneous language embedding

    Get PDF
    Languages that are used for Software Language Engineering (SLE) offer a range of features that support the construction and deployment of new languages. SLE languages offer features for constructing and processing syntax and defining the semantics of language features. New languages may be embedded within an existing language (internal) or may be stand-alone (external). Modularity is a desirable SLE property for which there is no generally agreed approach. This article analyses the current tools for SLE and identifies the key features that are common. It then proposes a language called XPL that supports these features. XPL is higher-order and allows languages to be constructed and manipulated as first-class elements and therefore can be used to represent a range of approaches to modular language definition. This is validated by using XPL to define the notion of a language module that supports modular language construction and language transformation
    corecore