1,931 research outputs found

    On Compensation Loops in Genomic Duplications

    Electronic version of an article published as International Journal of Foundations of Computer Science 2020 31:01, 133-142, DOI: 10.1142/S0129054120400092 © World Scientific Publishing Company https://www.worldscientific.com/worldscinet/ijfcs
    In this paper, we investigate compensation loops, a DNA rearrangement in chromosomes caused by unequal crossing over. We study the effect of compensation loops on gene duplication, and we formalize it as a restricted case of general gene duplication. We study this biological process from the point of view of formal languages, and we provide some results about the languages defined in this way.
    Sempere Luna, JM. (2020). On Compensation Loops in Genomic Duplications. International Journal of Foundations of Computer Science. 31(1):133-142. https://doi.org/10.1142/S0129054120400092
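
    As a rough illustration of the formal-language view taken in this entry, the sketch below models general duplication on strings: one step rewrites a word x u y as x u u y for some non-empty factor u. The function names and the length bound are assumptions made for this example; the restricted compensation-loop variant studied in the paper is not reproduced here, since the abstract does not define it.

```python
def one_step_duplications(word: str) -> set[str]:
    """Return every word obtained by copying one non-empty factor of `word`
    and inserting the copy right after the original (x u y -> x u u y)."""
    results = set()
    n = len(word)
    for start in range(n):
        for end in range(start + 1, n + 1):
            factor = word[start:end]
            # Keep the prefix up to the end of the factor, repeat the factor,
            # then append the untouched suffix.
            results.add(word[:end] + factor + word[end:])
    return results


def duplication_closure(word: str, max_length: int) -> set[str]:
    """Return all words reachable from `word` by repeated duplication,
    ignoring any word longer than `max_length` (hypothetical bound that
    keeps the otherwise infinite language finite for inspection)."""
    seen = {word}
    frontier = [word]
    while frontier:
        current = frontier.pop()
        for candidate in one_step_duplications(current):
            if len(candidate) <= max_length and candidate not in seen:
                seen.add(candidate)
                frontier.append(candidate)
    return seen
```

    For instance, `one_step_duplications("ab")` contains exactly "aab", "abb" and "abab".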

    Open Problems in the Emergence and Evolution of Linguistic Communication: A Road-Map for Research

    Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: case of grammatical inference

    In this paper, a genetic algorithm with minimum description length (GAWMDL) is proposed for grammatical inference. The primary challenge in identifying a language of infinite cardinality from a finite set of examples is knowing when to generalize and when to specialize from the training data. The minimum description length principle, which is incorporated to address this issue, is discussed in this paper. Previously, the e-GRIDS learning model was proposed, which enjoyed the merits of the minimum description length principle, but it is limited to positive examples only. The proposed GAWMDL incorporates a traditional genetic algorithm and has a powerful global exploration capability that can exploit an optimum offspring. This is an effective approach for a problem with a large search space, such as the grammatical inference problem. The computational capability of the genetic algorithm is not in question, but it still suffers from premature convergence, mainly arising from a lack of population diversity. The proposed GAWMDL incorporates a bit-mask-oriented data structure that performs the reproduction operations: it creates the mask, and then a Boolean-based procedure is applied to create an offspring in a generative manner. The Boolean-based procedure is capable of introducing diversity into the population, hence alleviating premature convergence. The proposed GAWMDL is applied to context-free as well as regular languages of varying complexities. The computational experiments show that the GAWMDL finds an optimal or close-to-optimal grammar. A two-fold performance analysis has been performed. First, the GAWMDL has been evaluated against the elite mating pool genetic algorithm, which was proposed to introduce diversity and to address premature convergence. GAWMDL is also tested against the improved tabular representation algorithm. In addition, the authors evaluate the performance of the GAWMDL against a genetic algorithm not using the minimum description length principle. Statistical tests demonstrate the superiority of the proposed algorithm. Overall, the proposed GAWMDL algorithm greatly improves performance in three main aspects: it maintains regularity of the data, alleviates premature convergence, and is capable of grammatical inference from both positive and negative corpora.
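
    The bit-mask reproduction step mentioned in this abstract can be pictured with a generic mask-based crossover on bit strings. This is only a minimal sketch of the general technique, not the operator defined in the paper: the function name, the integer encoding of chromosomes, and the uniformly random mask are all assumptions made for illustration.

```python
import random


def mask_crossover(parent_a: int, parent_b: int, n_bits: int,
                   rng: random.Random) -> tuple[int, int]:
    """Combine two n_bits-wide chromosomes (stored as ints) with a random mask.

    Where the mask bit is 1, child_1 copies parent_a and child_2 copies
    parent_b; where it is 0, the roles are swapped. (Hypothetical helper,
    not the GAWMDL operator itself.)
    """
    full = (1 << n_bits) - 1          # all-ones word used to bound the complement
    mask = rng.getrandbits(n_bits)    # random bit mask of the chromosome width
    inverse = ~mask & full
    child_1 = (parent_a & mask) | (parent_b & inverse)
    child_2 = (parent_b & mask) | (parent_a & inverse)
    return child_1, child_2
```

    At every bit position the two children carry the two parental bits in some order, so no genetic material is lost or invented by this step; diversity comes from the choice of mask.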

    Iterated learning and grounding: from holistic to compositional languages

    This paper presents a new computational model for studying the origins and evolution of compositional languages grounded through the interaction between agents and their environment. The model is based on previous work on adaptive grounding of lexicons and the iterated learning model. Although the model is still in a developmental phase, the first results show that a compositional language can emerge in which the structure reflects regularities present in the population's environment

    Grammatical evolution to design fractal curves with a given dimension

    Original paper in http://ieeexplore.ieee.org/
    Lindenmayer grammars have frequently been applied to represent fractal curves. In this work, the ideas behind grammar evolution are used to automatically generate and evolve Lindenmayer grammars which represent fractal curves with a fractal dimension that approximates a predefined required value. For many dimensions, this is a nontrivial task to be performed manually. The procedure we propose closely parallels biological evolution because it acts through three different levels: a genotype (a vector of integers), a protein-like intermediate level (the Lindenmayer grammar), and a phenotype (the fractal curve). Variation acts at the genotype level, while selection is performed at the phenotype level (by comparing the dimensions of the fractal curves to the desired value).
    This paper has been sponsored by the Spanish Ministry of Science and Technology (MCYT), project numbers TIC2002-01948 and TIC2001-0685-C02-01.
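
    To make the grammar-to-curve relationship in this entry concrete, the sketch below expands a Lindenmayer grammar and scores a curve by how far its dimension lies from a target value. It uses the well-known Koch curve rule and the similarity dimension log(N)/log(s) as stand-ins; the paper's own grammars, genotype mapping, and dimension estimate are not given in the abstract, so every name here is an assumption.

```python
import math


def expand(axiom: str, rules: dict[str, str], iterations: int) -> str:
    """Apply the rewriting rules to every symbol in parallel, `iterations` times.
    Symbols without a rule (such as the turn commands + and -) are copied as-is."""
    word = axiom
    for _ in range(iterations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


def similarity_dimension(copies: int, scale: float) -> float:
    """Dimension of a self-similar curve built from `copies` pieces,
    each reduced by a factor of `scale`."""
    return math.log(copies) / math.log(scale)


def dimension_error(copies: int, scale: float, target: float) -> float:
    """Hypothetical fitness: distance between the curve's dimension and the target."""
    return abs(similarity_dimension(copies, scale) - target)


# Koch curve: each segment F becomes four segments, each one third as long.
KOCH_RULES = {"F": "F+F--F+F"}
```

    For the Koch rule, `similarity_dimension(4, 3)` evaluates to log 4 / log 3, roughly 1.26, so a selection step aiming at that dimension would give this grammar an error close to zero.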

    A competence-performance based model to develop a syntactic language for artificial agents

    The hypothesis of language use is an attractive theory for explaining how natural languages evolve and develop in social populations. In this paper we present a model partially based on the idea of language games, in which a group of artificial agents is able to produce and share a symbolic language with syntactic structure. Grammatical structure is induced by grammatical evolution of stochastic regular grammars with learning capabilities, while language development is refined by means of language games where the agents apply on-line probabilistic reinforcement learning. Within this framework, the model adapts the concepts of competence and performance in language, as they have been proposed in some linguistic theories. The first experiments in this article are organized around the linguistic description of visual scenes, with the possibility of changing the referential situations. A second and more complicated experimental setting is also analyzed, where linguistic descriptions are forced to obey word order constraints.
    The second author has been supported by the Spanish Ministry of Science under contract ENE2014-56126-C2-2-R (AOPRIN-SOL).

    Duplication grammars
