    Left Recursion in Parsing Expression Grammars

    Parsing Expression Grammars (PEGs) are a formalism that can describe all deterministic context-free languages through a set of rules that specify a top-down parser for some language. PEGs are easy to use, and there are efficient implementations of PEG libraries in several programming languages. A frequently missed feature of PEGs is left recursion, which is commonly used in Context-Free Grammars (CFGs) to encode left-associative operations. We present a simple conservative extension to the semantics of PEGs that gives useful meaning to direct and indirect left-recursive rules, and show that our extensions make it easy to express left-recursive idioms from CFGs in PEGs, with similar results. We prove the conservativeness of these extensions, and also prove that they work with any left-recursive PEG. PEGs can also be compiled to programs in a low-level parsing machine. We present an extension to the semantics of the operations of this parsing machine that lets it interpret left-recursive PEGs, and prove that this extension is correct with respect to our semantics for left-recursive PEGs.
    Comment: Extended version of the paper "Left Recursion in Parsing Expression Grammars", published at the 2012 Brazilian Symposium on Programming Language
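To make the left-recursive idiom concrete, here is a minimal Python sketch for the single rule `E <- E '-' NUM / NUM`. It uses the well-known "seed growing" memoization trick, which is a swapped-in technique and not necessarily the semantics defined in the paper. The names `parse_expr`, `expr`, `expr_body`, and `num`, the token-list input, and the tuple tree shape are all assumptions made for illustration.

```python
def parse_expr(tokens):
    """Parse E <- E '-' NUM / NUM over a list of string tokens.

    Returns (tree, next_position) or None. Illustrative sketch only:
    it handles this one directly left-recursive rule by growing a seed.
    """
    memo = {}  # position -> best result found so far for E at that position

    def num(pos):
        # NUM: a single all-digit token
        if pos < len(tokens) and tokens[pos].isdigit():
            return int(tokens[pos]), pos + 1
        return None

    def expr_body(pos):
        # First alternative: E '-' NUM (the recursive call reads the memo)
        left = expr(pos)
        if left is not None:
            tree, p = left
            if p < len(tokens) and tokens[p] == "-":
                right = num(p + 1)
                if right is not None:
                    return ("-", tree, right[0]), right[1]
        # Second alternative: NUM
        return num(pos)

    def expr(pos):
        if pos in memo:
            return memo[pos]
        memo[pos] = None  # seed: the left-recursive call fails at first
        while True:
            result = expr_body(pos)
            best = memo[pos]
            # Stop once re-parsing no longer consumes more input
            if result is None or (best is not None and result[1] <= best[1]):
                break
            memo[pos] = result
        return memo[pos]

    return expr(0)


tree, end = parse_expr(["1", "-", "2", "-", "3"])
print(tree)  # nested to the left, i.e. (1 - 2) - 3
```

Each pass through the loop re-parses the rule body with the previous result standing in for the recursive call, so the tree grows to the left, which is the left-associative reading the CFG idiom is meant to encode.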

    On the Relation between Context-Free Grammars and Parsing Expression Grammars

    Context-Free Grammars (CFGs) and Parsing Expression Grammars (PEGs) have several similarities and a few differences in both their syntax and semantics, but they are usually presented through formalisms that hinder a proper comparison. In this paper we present a new formalism for CFGs that highlights the similarities and differences between them. The new formalism borrows from PEGs the use of parsing expressions and the recognition-based semantics. We show how one way of removing non-determinism from this formalism yields a formalism with the semantics of PEGs. We also prove, based on these new formalisms, that LL(1) grammars define the same language whether interpreted as CFGs or as PEGs, and show that strong-LL(k), right-linear, and LL-regular grammars have simple language-preserving translations from CFGs to PEGs.
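The difference between the two interpretations can be sketched with a tiny recognizer over non-recursive parsing expressions. This is an illustration under stated assumptions, not the formalism from the paper: the tuple encoding (`'lit'`, `'seq'`, `'alt'`), the helper names `ends` and `accepts`, and the requirement that the whole input be consumed are all hypothetical choices made here.

```python
def ends(expr, text, pos, ordered):
    """Return the set of positions where expr can stop matching from pos.

    ordered=True  -> PEG-style choice: commit to the first alternative that succeeds.
    ordered=False -> CFG-style choice: keep every alternative (non-deterministic).
    """
    kind = expr[0]
    if kind == "lit":
        s = expr[1]
        return {pos + len(s)} if text.startswith(s, pos) else set()
    if kind == "seq":
        return {
            q
            for p in ends(expr[1], text, pos, ordered)
            for q in ends(expr[2], text, p, ordered)
        }
    if kind == "alt":
        first = ends(expr[1], text, pos, ordered)
        if ordered and first:
            return first  # ordered choice never revisits the second alternative
        return first | ends(expr[2], text, pos, ordered)
    raise ValueError(f"unknown expression kind: {kind!r}")


def accepts(expr, text, ordered):
    # Assumption for this sketch: acceptance means consuming the whole input.
    return len(text) in ends(expr, text, 0, ordered)


# ('a' / 'ab') 'c': the alternatives share a prefix, so the two readings differ.
overlapping = ("seq", ("alt", ("lit", "a"), ("lit", "ab")), ("lit", "c"))
# ('a' / 'b') 'c': one token of lookahead picks the alternative, LL(1)-style.
ll1_like = ("seq", ("alt", ("lit", "a"), ("lit", "b")), ("lit", "c"))

print(accepts(overlapping, "abc", ordered=False))  # CFG-style reading
print(accepts(overlapping, "abc", ordered=True))   # PEG-style reading
print(accepts(ll1_like, "bc", ordered=False), accepts(ll1_like, "bc", ordered=True))
```

On the overlapping grammar the ordered reading commits to `'a'` and then fails on `'c'`, while the unordered reading also tries `'ab'`. On the LL(1)-like grammar both readings agree, which is the kind of agreement the abstract describes for LL(1) grammars.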

    TRX: A Formally Verified Parser Interpreter

    Parsing is an important problem in computer science and yet surprisingly little attention has been devoted to its formal verification. In this paper, we present TRX: a parser interpreter formally developed in the proof assistant Coq, capable of producing formally correct parsers. We use parsing expression grammars (PEGs), a formalism essentially representing recursive descent parsing, which we consider an attractive alternative to context-free grammars (CFGs). From this formalization we can extract a parser for an arbitrary PEG with the guarantee of total correctness, i.e., the resulting parser is terminating and correct with respect to its grammar and the semantics of PEGs; both properties are formally proven in Coq.
    Comment: 26 pages, LMC

    Parsing Expression Grammars Made Practical

    Parsing Expression Grammars (PEGs) define languages by specifying a recursive-descent parser that recognises them. The PEG formalism exhibits desirable properties, such as closure under composition, built-in disambiguation, unification of syntactic and lexical concerns, and closely matching programmer intuition. Unfortunately, state-of-the-art PEG parsers struggle with left-recursive grammar rules, which are not supported by the original definition of the formalism and can lead to infinite recursion under naive implementations. Likewise, support for associativity and explicit precedence is spotty. To remedy these issues, we introduce Autumn, a general-purpose PEG library that supports left recursion, left and right associativity and precedence rules, and does so efficiently. Furthermore, we identify infix and postfix operators as a major source of inefficiency in left-recursive PEG parsers and show how to tackle this problem. We also explore the extensibility of the PEG paradigm by showing how one can easily introduce new parsing operators and how our parser accommodates custom memoization and error handling strategies. We compare our parser to both state-of-the-art and battle-tested PEG and CFG parsers, such as Rats!, Parboiled and ANTLR.
    Comment: "Proceedings of the International Conference on Software Language Engineering (SLE 2015)" - 167-172 (ISBN : 978-1-4503-3686-4

    Capturing CFLs with Tree Adjoining Grammars

    We define a decidable class of TAGs that is strongly equivalent to CFGs and is cubic-time parsable. This class serves to lexicalize CFGs in the same manner as the LCFGs of Schabes and Waters but with considerably less restriction on the form of the grammars. The class provides a normal form for TAGs that generate local sets in much the same way that regular grammars provide a normal form for CFGs that generate regular sets.
    Comment: 8 pages, 3 figures. To appear in proceedings of ACL'9

    Graph Interpolation Grammars as Context-Free Automata

    A derivation step in a Graph Interpolation Grammar (GIG) has the effect of scanning an input token. This feature, which aims at emulating the incrementality of the natural parser, restricts the formal power of GIGs. This contrasts with the fact that the derivation mechanism involves a context-sensitive device similar to tree adjunction in TAGs. The combined effect of input-driven derivation and restricted context-sensitiveness would conceivably be unfortunate if it turned out that Graph Interpolation Languages (GILs) did not subsume Context-Free Languages (CFLs) while being partially context-sensitive. This report examines the relations between CFGs and GIGs, and shows that GILs are a proper superclass of CFLs. It also brings out a strong equivalence between CFGs and GIGs for the class of CFLs. Thus, it lays the basis for meaningfully investigating the amount of context-sensitiveness supported by GIGs, but leaves this investigation for further research.

    Efficient Monitoring of Parametric Context Free Patterns

    Recent developments in runtime verification and monitoring show that parametric regular and temporal logic specifications can be efficiently monitored against large programs. However, these logics reduce to ordinary finite automata, limiting their expressivity. For example, neither can specify structured properties that refer to the call stack of the program. While context-free grammars (CFGs) are expressive and well-understood, existing techniques for monitoring CFGs generate massive runtime overhead in real-life applications. This paper shows for the first time that monitoring parametric CFGs is practical (overhead on the order of 10% or lower for average cases, several times faster than the state of the art). We present a monitor synthesis algorithm for CFGs based on an LR(1) parsing algorithm, modified with stack cloning to account for good prefix matching. In addition, a logic-independent mechanism is introduced to support partial matching, allowing patterns to be checked against fragments of execution traces.