
    On the Relation between Context-Free Grammars and Parsing Expression Grammars

    Context-Free Grammars (CFGs) and Parsing Expression Grammars (PEGs) have several similarities and a few differences in both their syntax and semantics, but they are usually presented through formalisms that hinder a proper comparison. In this paper we present a new formalism for CFGs that highlights the similarities and differences between them. The new formalism borrows from PEGs the use of parsing expressions and the recognition-based semantics. We show how one way of removing non-determinism from this formalism yields a formalism with the semantics of PEGs. We also prove, based on these new formalisms, that LL(1) grammars define the same language whether interpreted as CFGs or as PEGs, and show that strong-LL(k), right-linear, and LL-regular grammars have simple language-preserving translations from CFGs to PEGs.
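The difference between unordered (CFG) and ordered (PEG) choice described in this abstract can be illustrated with a small recognizer. This is a minimal sketch, not the paper's formalism: the combinator names (`lit`, `seq`, `choice_cfg`, `choice_peg`, `accepts`) and both toy grammars are hypothetical, and acceptance is assumed to mean that the whole input is consumed.

```python
# Toy recognizers: each parser maps (text, start) to a set of end positions.
# All names and grammars here are illustrative assumptions, not from the paper.

def lit(c):
    """Match the literal string c at position i."""
    return lambda s, i: {i + len(c)} if s.startswith(c, i) else set()

def seq(*parsers):
    """Match each parser in turn, threading every reachable position."""
    def parse(s, i):
        positions = {i}
        for p in parsers:
            positions = {j for k in positions for j in p(s, k)}
        return positions
    return parse

def choice_cfg(*parsers):
    """Unordered choice: keep the results of every alternative."""
    return lambda s, i: set().union(*(p(s, i) for p in parsers))

def choice_peg(*parsers):
    """Ordered choice: commit to the first alternative that succeeds."""
    def parse(s, i):
        for p in parsers:
            result = p(s, i)
            if result:
                return result
        return set()
    return parse

def accepts(parser, s):
    """Assumption: a string is accepted only if all of it is consumed."""
    return len(s) in parser(s, 0)

def ambiguous_prefix(choice):
    # S -> 'a' | 'a' 'b'   (not LL(1): both alternatives start with 'a')
    return choice(lit("a"), seq(lit("a"), lit("b")))

def ll1(choice):
    # S -> 'a' 'b' | 'c'   (LL(1): alternatives start with distinct symbols)
    return choice(seq(lit("a"), lit("b")), lit("c"))

print(accepts(ambiguous_prefix(choice_cfg), "ab"))  # CFG reading
print(accepts(ambiguous_prefix(choice_peg), "ab"))  # PEG reading
```

Under these assumptions the first grammar accepts "ab" with unordered choice but not with ordered choice, because the ordered reading commits to the first alternative and leaves "b" unconsumed. The LL(1) grammar accepts the same strings under both readings, which matches the claim in the abstract for this one toy case.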

    Colored operads, series on colored operads, and combinatorial generating systems

    We introduce bud generating systems, which are used for combinatorial generation. They specify sets of various kinds of combinatorial objects, called languages. They can emulate context-free grammars, regular tree grammars, and synchronous grammars, allowing us to work with all these generating systems in a unified way. The theory of bud generating systems uses colored operads. Indeed, an object is generated by a bud generating system if it satisfies a certain equation in a colored operad. To compute the generating series of the languages of bud generating systems, we introduce formal power series on colored operads and several operations on these. Series on colored operads are crucial to express the languages specified by bud generating systems and allow us to enumerate combinatorial objects with respect to some statistics. Some examples of bud generating systems are constructed, in particular to specify some sorts of balanced trees and to obtain recursive formulas enumerating these. Comment: 48 pages

    Linear Parsing Expression Grammars

    PEGs were formalized by Ford in 2004, and have several pragmatic operators (such as ordered choice and unlimited lookahead) for better expressing modern programming language syntax. Since these operators are not explicitly defined in classic formal language theory, arguing PEGs' expressiveness in the context of formal language theory is significant and still challenging. Since PEGs are relatively new, there are several unsolved problems. One of the problems is revealing a subclass of PEGs that is equivalent to DFAs. This allows application of some techniques from the theory of regular grammars to PEGs. In this paper, we define Linear PEGs (LPEGs), a subclass of PEGs that is equivalent to DFAs. Surprisingly, LPEGs are formalized by only excluding some patterns of recursive nonterminals in PEGs, and include the full set of ordered choice, unlimited lookahead, and greedy repetition, which are characteristic of PEGs. Although deciding whether a parsing expression can be converted into a DFA is undecidable in general, the formalism of LPEGs allows for a syntactic judgement of parsing expressions. Comment: Parsing expression grammars, Boolean finite automata, Packrat parsing

    Automated DNA Motif Discovery

    Ensembl's human non-coding and protein-coding genes are used to automatically find DNA pattern motifs. The Backus-Naur form (BNF) grammar for regular expressions (RE) is used by genetic programming to ensure the generated strings are legal. The evolved motif suggests that the presence of Thymine followed by one or more Adenines etc. early in transcripts indicates a non-protein-coding gene. Keywords: pseudogene, short and microRNAs, non-coding transcripts, systems biology, machine learning, Bioinformatics, motif, regular expression, strongly typed genetic programming, context-free grammar. Comment: 12 pages, 2 figures
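The fragment of the motif that the abstract spells out, a Thymine followed by one or more Adenines, corresponds to the regular expression `TA+`. The sketch below checks for it near the start of a sequence. It covers only that fragment (the "etc." part of the evolved motif is not given in the abstract), and the function name `has_early_motif` and the 20-base `window` are hypothetical choices, not values from the paper.

```python
import re

# Only the fragment described in the abstract: T followed by one or more A.
MOTIF = re.compile(r"TA+")

def has_early_motif(sequence, window=20):
    """Return True if the motif occurs within the first `window` bases.

    `window` is an assumed cutoff for "early in the transcript".
    """
    return MOTIF.search(sequence.upper()[:window]) is not None

print(has_early_motif("GGTAACCGG"))
print(has_early_motif("GGGCCCGGG"))
```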

    PonyGE2: Grammatical Evolution in Python

    Grammatical Evolution (GE) is a population-based evolutionary algorithm, where a formal grammar is used in the genotype to phenotype mapping process. PonyGE2 is an open source implementation of GE in Python, developed at UCD's Natural Computing Research and Applications group. It is intended as an advertisement and a starting-point for those new to GE, a reference for students and researchers, a rapid-prototyping medium for our own experiments, and a Python workout. As well as providing the characteristic genotype to phenotype mapping of GE, a search algorithm engine is also provided. A number of sample problems and tutorials on how to use and adapt PonyGE2 have been developed. Comment: 8 pages, 4 figures, submitted to the 2017 GECCO Workshop on Evolutionary Computation Software Systems (EvoSoft)
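The genotype to phenotype mapping mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the commonly described GE scheme (each integer codon, taken modulo the number of productions, selects a rule for the leftmost nonterminal, with the genome reused when it runs out). It does not use or reproduce PonyGE2's API: the grammar, the function `map_genome`, and the `max_steps` limit are all hypothetical.

```python
# Hypothetical toy grammar: nonterminal -> list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "+", "<expr>"], ["<var>"]],
    "<var>": [["x"], ["y"]],
}

def map_genome(genome, start="<expr>", max_steps=100):
    """Map a non-empty list of integer codons to a string, or None on failure."""
    symbols = [start]
    used = 0  # number of codons consumed so far
    for _ in range(max_steps):
        # Find the leftmost nonterminal still present in the derivation.
        index = next((i for i, s in enumerate(symbols) if s in GRAMMAR), None)
        if index is None:
            return "".join(symbols)  # only terminals remain
        rules = GRAMMAR[symbols[index]]
        codon = genome[used % len(genome)]  # wrap around when codons run out
        used += 1
        # The codon picks a production by modulo; splice it into the derivation.
        symbols[index:index + 1] = rules[codon % len(rules)]
    return None  # mapping did not finish within max_steps

print(map_genome([0, 1, 2, 3, 1]))  # prints "x+y"
```

A genome such as `[0]` always picks the recursive production for `<expr>`, so the mapping never finishes and the sketch returns `None`. Real implementations handle such invalid individuals in their own ways.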