2,953 research outputs found

    Left Recursion in Parsing Expression Grammars

    Full text link
    Parsing Expression Grammars (PEGs) are a formalism that can describe all deterministic context-free languages through a set of rules that specify a top-down parser for some language. PEGs are easy to use, and there are efficient implementations of PEG libraries in several programming languages. A frequently missed feature of PEGs is left recursion, which is commonly used in Context-Free Grammars (CFGs) to encode left-associative operations. We present a simple conservative extension to the semantics of PEGs that gives useful meaning to direct and indirect left-recursive rules, and show that our extensions make it easy to express left-recursive idioms from CFGs in PEGs, with similar results. We prove the conservativeness of these extensions, and also prove that they work with any left-recursive PEG. PEGs can also be compiled to programs in a low-level parsing machine. We present an extension to the semantics of the operations of this parsing machine that let it interpret left-recursive PEGs, and prove that this extension is correct with regards to our semantics for left-recursive PEGs.Comment: Extended version of the paper "Left Recursion in Parsing Expression Grammars", that was published on 2012 Brazilian Symposium on Programming Language

    Parsing Expression Grammars Made Practical

    Full text link
    Parsing Expression Grammars (PEGs) define languages by specifying recursive-descent parser that recognises them. The PEG formalism exhibits desirable properties, such as closure under composition, built-in disambiguation, unification of syntactic and lexical concerns, and closely matching programmer intuition. Unfortunately, state of the art PEG parsers struggle with left-recursive grammar rules, which are not supported by the original definition of the formalism and can lead to infinite recursion under naive implementations. Likewise, support for associativity and explicit precedence is spotty. To remedy these issues, we introduce Autumn, a general purpose PEG library that supports left-recursion, left and right associativity and precedence rules, and does so efficiently. Furthermore, we identify infix and postfix operators as a major source of inefficiency in left-recursive PEG parsers and show how to tackle this problem. We also explore the extensibility of the PEG paradigm by showing how one can easily introduce new parsing operators and how our parser accommodates custom memoization and error handling strategies. We compare our parser to both state of the art and battle-tested PEG and CFG parsers, such as Rats!, Parboiled and ANTLR.Comment: "Proceedings of the International Conference on Software Language Engineering (SLE 2015)" - 167-172 (ISBN : 978-1-4503-3686-4

    Pattern matching in compilers

    Get PDF
    In this thesis we develop tools for effective and flexible pattern matching. We introduce a new pattern matching system called amethyst. Amethyst is not only a generator of parsers of programming languages, but can also serve as an alternative to tools for matching regular expressions. Our framework also produces dynamic parsers. Its intended use is in the context of IDE (accurate syntax highlighting and error detection on the fly). Amethyst offers pattern matching of general data structures. This makes it a useful tool for implementing compiler optimizations such as constant folding, instruction scheduling, and dataflow analysis in general. The parsers produced are essentially top-down parsers. Linear time complexity is obtained by introducing the novel notion of structured grammars and regularized regular expressions. Amethyst uses techniques known from compiler optimizations to produce effective parsers.Comment: master thesi

    TRX: A Formally Verified Parser Interpreter

    Full text link
    Parsing is an important problem in computer science and yet surprisingly little attention has been devoted to its formal verification. In this paper, we present TRX: a parser interpreter formally developed in the proof assistant Coq, capable of producing formally correct parsers. We are using parsing expression grammars (PEGs), a formalism essentially representing recursive descent parsing, which we consider an attractive alternative to context-free grammars (CFGs). From this formalization we can extract a parser for an arbitrary PEG grammar with the warranty of total correctness, i.e., the resulting parser is terminating and correct with respect to its grammar and the semantics of PEGs; both properties formally proven in Coq.Comment: 26 pages, LMC
    • …
    corecore