8 research outputs found

    On Compensation Loops in Genomic Duplications

    Electronic version of an article published as International Journal of Foundations of Computer Science 2020 31:01, 133-142, DOI: 10.1142/S0129054120400092 © World Scientific Publishing Company https://www.worldscientific.com/worldscinet/ijfcs
    In this paper, we investigate compensation loops, a DNA rearrangement in chromosomes due to unequal crossing over. We study the effect of compensation loops on gene duplication, and we formalize it as a restricted case of gene duplication in general. We study this biological process from the point of view of formal languages, and we provide some results about the languages defined in this way.
    Sempere Luna, JM. (2020). On Compensation Loops in Genomic Duplications. International Journal of Foundations of Computer Science. 31(1):133-142. https://doi.org/10.1142/S0129054120400092

    Duplication Closure of Regular Languages

    In the present paper, we prove that the n-bounded duplication closure of a regular language is regular for n = 1, 2.
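    As a rough illustration of the operation behind this result, the sketch below enumerates the words reachable from a single start word by repeatedly duplicating a factor. It assumes that "n-bounded" means the duplicated factor has length at most n, which the abstract does not spell out. The function names and the `max_len` cap are hypothetical, and the code only explores one word up to a length limit; it is not the paper's proof technique for regular languages.

```python
from collections import deque


def bounded_duplication_step(word, n):
    """Return all words obtained from `word` by duplicating one factor of length 1..n.

    Assumption: a duplication rewrites x u y into x u u y with 1 <= |u| <= n.
    """
    results = set()
    for start in range(len(word)):
        for length in range(1, n + 1):
            end = start + length
            if end > len(word):
                break
            factor = word[start:end]
            # Insert a second copy of the factor right after the first one.
            results.add(word[:end] + factor + word[end:])
    return results


def bounded_duplication_closure(word, n, max_len):
    """Enumerate the part of the closure of `word` with length <= max_len.

    The full closure is infinite, so the search is cut off at `max_len`.
    """
    seen = {word}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        for nxt in bounded_duplication_step(current, n):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

    For instance, with n = 1 the words reachable from "ab" are of the form a...ab...b, while n = 2 additionally allows "abab".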

    Well quasi-orders and context-free grammars

    Let G be a context-free grammar and let L be the language of all the words derived from any variable of G. We prove the following generalization of Higman's theorem: any division order on L is a well quasi-order on L. We also give applications of this result to some quasi-orders associated with unitary grammars. (C) 2004 Elsevier B.V. All rights reserved

    Bounded prefix-suffix duplication

    We consider a restricted variant of the prefix-suffix duplication operation, called bounded prefix-suffix duplication. It consists in the iterative duplication of a prefix or suffix, whose length is bounded by a constant, of a given word. We give a sufficient condition for the closure under bounded prefix-suffix duplication of a class of languages. Consequently, the class of regular languages is closed under bounded prefix-suffix duplication; furthermore, we propose an algorithm deciding whether a regular language is a finite k-prefix-suffix duplication language. An efficient algorithm solving the membership problem for the k-prefix-suffix duplication of a language is also presented. Finally, we define the k-prefix-suffix duplication distance between two words, extend it to languages and show how it can be computed for regular languages
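    The operation described above can be sketched as follows. This is a naive membership check for a single start word, assuming a step rewrites x y into x x y (prefix) or y x into y x x (suffix) with 1 <= |x| <= k. It searches backwards from the target by removing one half of a square prefix or suffix. The function names are hypothetical, and this is not the efficient algorithm the abstract refers to.

```python
def undo_steps(word, k):
    """Return the words from which `word` arises by one k-bounded prefix or suffix duplication."""
    predecessors = set()
    for length in range(1, min(k, len(word) // 2) + 1):
        # Undo a prefix duplication: x x y -> x y.
        if word[:length] == word[length:2 * length]:
            predecessors.add(word[length:])
        # Undo a suffix duplication: y x x -> y x.
        if word[-length:] == word[-2 * length:-length]:
            predecessors.add(word[:-length])
    return predecessors


def reachable(source, target, k):
    """Decide whether `target` can be obtained from `source` by k-bounded prefix-suffix duplications."""
    stack = [target]
    seen = {target}
    while stack:
        current = stack.pop()
        if current == source:
            return True
        if len(current) <= len(source):
            continue  # every undo step shortens the word, so this branch is dead
        for prev in undo_steps(current, k):
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return False
```

    Every word visited in the backward search is a factor of the target, so the number of explored states is at most quadratic in the target's length.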

    Prefix-suffix duplication

    We consider a bio-inspired formal operation on words called prefix-suffix duplication which consists in the duplication of a prefix or suffix of a given word. The class of languages defined by the iterated application of the prefix-suffix duplication to a word is considered. We show that such a language is context-free if and only if the initial word contains just one letter. Moreover, every language in this class is semilinear and belongs to NL. We propose an $O(n^2 \log n)$ time and $O(n^2)$ space recognition algorithm. Two algorithms are further proposed for computing the prefix-suffix duplication distance between two words, defined as the minimal number of prefix-suffix duplications applied to one of them in order to get the other one. The first algorithm runs in cubic time and uses quadratic space while the second one is more efficient, having $O(n^2 \log n)$ time complexity, but needs $O(n^2 \log n)$ space.
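    To make the distance defined above concrete, here is a minimal brute-force sketch. It runs a breadth-first search backwards from the longer word, undoing one prefix or suffix duplication per step. The function name is hypothetical, and this is a plain BFS, not either of the two algorithms mentioned in the abstract, so it makes no claim to their complexity bounds.

```python
from collections import deque


def ps_duplication_distance(first, second):
    """Return the minimal number of prefix-suffix duplications turning the
    shorter word into the longer one, or None if that is impossible."""
    # The distance is symmetric by definition: duplicate from the shorter word.
    source, target = sorted((first, second), key=len)
    queue = deque([(target, 0)])
    seen = {target}
    while queue:
        current, steps = queue.popleft()
        if current == source:
            return steps
        if len(current) <= len(source):
            continue  # undoing a duplication only shortens the word further
        for length in range(1, len(current) // 2 + 1):
            candidates = []
            # Undo a prefix duplication: x x y -> x y.
            if current[:length] == current[length:2 * length]:
                candidates.append(current[length:])
            # Undo a suffix duplication: y x x -> y x.
            if current[-length:] == current[-2 * length:-length]:
                candidates.append(current[:-length])
            for prev in candidates:
                if prev not in seen:
                    seen.add(prev)
                    queue.append((prev, steps + 1))
    return None
```

    Because BFS explores words in order of the number of undone steps, the first time the shorter word is reached gives the minimal count.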

    Repetitive subwords

    The central notion of this thesis is repetitions in words. We study problems related to contiguous repetitions. More specifically we will consider repeating scattered subwords of non-primitive words, i.e. words which are complete repetitions of other words. We will present inequalities concerning these occurrences as well as giving a partial solution to an open problem posed by Salomaa et al. We will characterize languages which are closed under the operation of duplication, that is, repeating any factor of a word. We also give new bounds on the number of occurrences of certain types of repetitions of words. We give a solution to an open problem posed by Calbrix and Nivat concerning regular languages consisting of non-primitive words. We also present some results regarding the duplication closure of languages, among which a new proof to a problem of Bovet and Varricchio.

    Languages Generated by Iterated Idempotencies.

    The rewrite relation with parameters m and n and with the possible length limit = k or ≤ k we denote by $\omega_n^m$, ${}_{=k}\omega_n^m$ or ${}_{\le k}\omega_n^m$ respectively. The idempotency languages generated from a starting word w by the respective operations are [...]. Also other special cases of idempotency languages besides duplication have come up in different contexts. The investigations of Ito et al. about insertion and deletion, i.e., operations that are also observed in DNA molecules, have established that [...] both preserve regularity. Our investigations about idempotency relations and languages start out from the case of a uniform length bound. For these relations ${}_{=k}\omega_n^m$ the conditions for confluence are characterized completely. Also the question of regularity is answered for all the languages [...]; [...] are more complicated and belong to the class of context-free languages. For a general length bound, i.e., for the relations ${}_{\le k}\omega_n^m$, confluence does not hold so frequently. This complicatedness of the relations results also in more complicated languages, which are often non-regular, as for example the languages [...]. Without any length bound, idempotency relations have a very complicated structure. Over alphabets of one or two letters we still characterize the conditions for confluence. Over three or more letters, in contrast, only a few cases are solved. We determine the combinations of parameters that result in the regularity of [...]. In a second chapter some more involved questions are solved for the special case of duplication. First we shed some light on the reasons why it is so difficult to determine the context-freeness of duplication languages. We show that they fulfill all pumping properties and that they are very dense. Therefore all the standard tools to prove non-context-freeness do not apply here. The concept of root in Formal Language Theory is frequently used to describe the reduction of a word to another one, which is in some sense elementary. For example, there are primitive roots, periodicity roots, etc. Elementary in connection with duplication are square-free words, i.e., words that do not contain any repetition. Thus we define the duplication root of w to consist of all the square-free words from which w can be reached via the duplication relation. Besides some general observations we prove the decidability of the question whether the duplication root of a language is finite. Then we devise a code which is robust under duplication of its code words. This would keep the result of a computation from being destroyed by duplications in the code words. We determine the exact conditions under which infinite such codes exist: over an alphabet of two letters they exist for a length bound of 2, over three letters already for a length bound of 1. Also we apply duplication to entire languages rather than to single words; then it is interesting to determine whether regular and context-free languages are closed under this operation. We show that the regular languages are closed under uniformly bounded duplication, while they are not closed under duplication with a general length bound. The context-free languages are closed under both operations. The thesis concludes with a list of open problems related to the thesis' topics.
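    The duplication root defined above can be illustrated with a small brute-force sketch. It repeatedly undoes duplications (replacing a factor u u by u) and collects the square-free words that remain. The function names are hypothetical and the search is exhaustive, so it is only meant for short words; it is not taken from the thesis.

```python
def undo_duplications(word):
    """Return all words obtained from `word` by replacing one square u u with u."""
    predecessors = set()
    n = len(word)
    for start in range(n):
        for length in range(1, (n - start) // 2 + 1):
            middle = start + length
            end = start + 2 * length
            if word[start:middle] == word[middle:end]:
                # Keep the first copy of u and drop the second one.
                predecessors.add(word[:middle] + word[end:])
    return predecessors


def duplication_root(word):
    """Return the set of square-free words from which `word` is reachable by duplications."""
    roots = set()
    seen = {word}
    stack = [word]
    while stack:
        current = stack.pop()
        predecessors = undo_duplications(current)
        if not predecessors:
            roots.add(current)  # no square left, so `current` is square-free
        for prev in predecessors:
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return roots
```

    A word can have more than one root: "abcbabcbc" reduces to "abcbabc" by removing one copy of "bc" at the end, and to "abc" by first collapsing the square "abcbabcb".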