81 research outputs found

    On insertion-deletion systems over relational words

    We introduce a new notion of a relational word as a finite totally ordered set of positions endowed with three binary relations that describe which positions are labeled by equal data, which by unequal data, and which have an undefined relation between their labels. We define the operations of insertion and deletion on relational words, generalizing the corresponding operations on strings. We prove that the transitive and reflexive closure of these operations has a decidable membership problem for the case of short insertion-deletion rules (of size two/three and three/two). At the same time, we show that in the general case such systems can produce a coding of any recursively enumerable language, leading to undecidability of reachability questions. Comment: 24 pages, 8 figures

    Circular Languages Generated by Complete Splicing Systems and Pure Unitary Languages

    Circular splicing systems are a formal model of a generative mechanism of circular words, inspired by the recombinant behaviour of circular DNA. Some unanswered questions are related to the computational power of such systems, and finding a characterization of the class of circular languages generated by circular splicing systems is still an open problem. In this paper we solve this problem for complete systems, which are special finite circular splicing systems. We show that a circular language L is generated by a complete system if and only if the set Lin(L) of all words corresponding to L is a pure unitary language generated by a set closed under the conjugacy relation. The class of pure unitary languages was introduced by A. Ehrenfeucht, D. Haussler, and G. Rozenberg in 1983 as a subclass of the class of context-free languages, together with a characterization of regular pure unitary languages by means of a decidable property. As a direct consequence, we characterize (regular) circular languages generated by complete systems. We can also decide whether the language generated by a complete system is regular. Finally, we point out that complete systems have the same computational power as finite simple systems, a simple type of circular splicing system defined in the literature from the very beginning, when only one rule is allowed. From our results on complete systems, it follows that finite simple systems generate a class of context-free languages containing non-regular languages, showing the incorrectness of a longstanding result on simple systems.
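The condition "closed under the conjugacy relation" can be illustrated for a finite set of words with a minimal Python sketch. The function names `conjugates` and `is_closed_under_conjugacy` are illustrative assumptions and do not come from the paper; the sketch only checks finite sets and says nothing about pure unitary languages themselves.

```python
def conjugates(word: str) -> set[str]:
    """Return all rotations (conjugates) of `word`.

    These are the linear words that correspond to one circular word.
    """
    if not word:
        return {word}
    return {word[i:] + word[:i] for i in range(len(word))}


def is_closed_under_conjugacy(words: set[str]) -> bool:
    """True if every conjugate of every word in the finite set is also in the set."""
    return all(conjugates(w) <= words for w in words)
```

For instance, `{"ab", "ba"}` passes the check, while `{"ab"}` alone does not, because the rotation `"ba"` is missing.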

    Maximality and Applications of Subword-Closed Languages

    Characterizing languages D that are maximal with the property that D* ⊆ S⊗ is an important problem in formal language theory with applications to coding theory and DNA codewords. Given a constraint S, a finite set of words of a fixed length, we consider its subword closure S⊗, the set of words whose subwords of that fixed length all belong to S. We investigate these maximal languages and present characterizations for them. These characterizations use strongly connected components of deterministic finite automata and lead to polynomial time algorithms for generating such languages. We prove that the subword closure S⊗ is strictly locally testable. Finally, we discuss applications to coding theory and to encoding arbitrary blocks of information on DNA strands. This leads to applications in the design of DNA codewords that form bond-free languages, which have been experimentally confirmed.
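As a rough illustration of the subword closure defined above, the following Python sketch tests whether a word belongs to S⊗. The name `in_subword_closure` and the toy constraints are assumptions made for this example, and the sketch treats words shorter than the fixed length as vacuously satisfying the constraint, which the abstract does not spell out.

```python
def in_subword_closure(word: str, constraint: set[str], k: int) -> bool:
    """True if every length-k subword (factor) of `word` belongs to `constraint`.

    Assumption: a word shorter than k has no length-k subwords and is accepted.
    """
    return all(word[i:i + k] in constraint for i in range(len(word) - k + 1))


# Toy constraint of length-2 words (hypothetical data, not from the paper).
S = {"AC", "CA"}
print(in_subword_closure("ACAC", S, 2))   # subwords AC, CA, AC are all in S
print(in_subword_closure("ACCA", S, 2))   # subword CC is not in S
```

The check also shows why the condition D* ⊆ S⊗ is restrictive: with the constraint `{"AC"}`, the word `"AC"` is in the closure, but its concatenation with itself, `"ACAC"`, contains the subword `"CA"` and is not.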

    A DNA Codification for Genetic Algorithms Simulation

    In this paper we propose a model for encoding data into DNA strands so that the data can be used in the simulation of a genetic algorithm based on molecular operations. DNA computing is an impressive computational model that needs algorithms to work properly and efficiently. The first problem when trying to apply an algorithm in DNA computing is how to codify the data that the algorithm will use. In a genetic algorithm the first objective is to codify the genes, which are the main data. A concrete encoding of the genes in a single DNA strand is presented, and we discuss what this codification is suitable for. Previous work on DNA coding defined bond-free languages, which have several properties assuring the stability of any DNA word of such a language. We prove that a bond-free language is necessary but not sufficient to codify a gene, and we give the correct codification.

    A Study of Pseudo-Periodic and Pseudo-Bordered Words for Functions Beyond Identity and Involution

    Periodicity, primitivity and borderedness are some of the fundamental notions in combinatorics on words. Motivated by the Watson-Crick complementarity of DNA strands, wherein a word (strand) over the DNA alphabet $\{A, G, C, T\}$ and its Watson-Crick complement are informationally "identical", these notions have been extended to consider pseudo-periodicity and pseudo-borderedness obtained by replacing the "identity" function with "pseudo-identity" functions (antimorphic involution in case of Watson-Crick complementarity). For a given alphabet $\Sigma$, an antimorphic involution $\theta$ is an antimorphism, i.e., $\theta(uv) = \theta(v)\theta(u)$ for all $u, v \in \Sigma^{*}$, and an involution, i.e., $\theta(\theta(u)) = u$ for all $u \in \Sigma^{*}$. In this thesis, we continue the study of pseudo-periodic and pseudo-bordered words for pseudo-identity functions including involutions. To start with, we propose a binary word operation, $\theta$-catenation, that generates $\theta$-powers (pseudo-powers) of a word for any morphic or antimorphic involution $\theta$. We investigate various properties of this operation, including closure properties of various classes of languages under it, and its connection with the previously defined notion of $\theta$-primitive words. A non-empty word $u$ is said to be $\theta$-bordered if there exists a non-empty word $v$ which is a prefix of $u$ while $\theta(v)$ is a suffix of $u$. We investigate the properties of $\theta$-bordered (pseudo-bordered) and $\theta$-unbordered (pseudo-unbordered) words for pseudo-identity functions $\theta$ with the property that $\theta$ is either a morphism or an antimorphism with $\theta^{n} = I$, for a given $n \geq 2$, or $\theta$ is a literal morphism or an antimorphism. Lastly, we initiate a new line of study by exploring the disjunctivity properties of sets of pseudo-bordered and pseudo-unbordered words and some other related languages for various pseudo-identity functions. In particular, we consider such properties for morphic involutions $\theta$ and prove that, for any $i \geq 2$, the set of all words with exactly $i$ $\theta$-borders is disjunctive (under certain conditions).
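The definitions of an antimorphic involution and of a θ-bordered word can be made concrete with a small Python sketch. It fixes θ to the Watson-Crick case (reverse the word, then complement each letter) and follows the wording of the abstract literally; the names `theta` and `is_theta_bordered` are illustrative assumptions, not notation from the thesis.

```python
# Watson-Crick complement of each letter of the DNA alphabet.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def theta(word: str) -> str:
    """Watson-Crick antimorphic involution: reverse the word, complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def is_theta_bordered(u: str) -> bool:
    """True if some non-empty prefix v of u has theta(v) as a suffix of u."""
    return any(u.endswith(theta(u[:k])) for k in range(1, len(u) + 1))
```

With these definitions, `theta("ACG")` equals `theta("G") + theta("AC")`, which is the antimorphism property, and applying `theta` twice returns the original word. The word `"ACT"` is θ-bordered (take v = `"A"`, whose image `"T"` is a suffix), whereas `"AAA"` is not.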

    Encoding methods for DNA languages defined via the subword closure operation

    ix, 124 leaves : ill. ; 29 cm. Includes abstract. Includes bibliographical references (leaves 118-124). In DNA computing, information is encoded onto DNA sequences. DNA codes in the form of single-stranded DNA sequences are not stable. This is because when two single-stranded DNA sequences used to carry data have complementary parts, they naturally tend to stick to each other. This is due to the Watson-Crick complementarity property and causes the problem of undesirable bonds. Some properties and constraints have been proposed to prevent the problem, but most of them are local constraints that concentrate on a segment of a DNA word of a certain length. Therefore, if we concatenate DNA words satisfying some local constraints, the resulting words might violate the same constraints. This makes encoding methods for DNA languages difficult to design. To solve this problem, we investigate some properties of the subword closure operation that is used for constructing DNA languages and propose practical encoding methods for such languages. We also implement our methods using advanced C++ tools for finite automata and design a web interface that allows users to obtain a DNA language in response to given values for certain parameters.

    DNA Hairpin Secondary Structure Design

    In this thesis, we propose a bottom-up method to design single-stranded DNA sequences that form consecutive hairpin structures. This work was inspired by the hairpin-based DNA multi-state machine proposed by Takahashi et al. in 2004. They successfully achieved this DNA multiple-hairpin structure in a laboratory experiment and proposed two possible applications. The first is to construct a random access memory (RAM) by using the DNA machines as the access address for the data. The second is to solve the maximum independent set problem (MISP). It is thus interesting to investigate how to design DNA sequences that form consecutive hairpin structures as mentioned above. We propose a bottom-up approach to construct consecutive hairpin structures, grounded on a so-called bond-free property and several combinatorial constraints. Software is implemented to study the behavior of our bottom-up approach. We also calculate the maximal number of sequences that correctly fold into the desired multiple-hairpin structure. This calculation provides an estimate of the size of the memory that can be constructed using Takahashi et al.'s method. Lastly, by selecting suitable parameters, we successfully construct a set of sequences that can fold into the desired multiple-hairpin structure. For example, our software is able to generate 120 sequences that can fold into a four-hairpin structure where the length of each hairpin stem is 20, the length of each hairpin loop is 7 and the external segment is 20. We validate these sequences using the Vienna RNA secondary structure prediction package.

    Acta Cybernetica : Volume 19. Number 2.
