34 research outputs found

    It Is NL-complete to Decide Whether a Hairpin Completion of Regular Languages Is Regular

    Full text link
    The hairpin completion is an operation on formal languages which is inspired by the hairpin formation in biochemistry. Hairpin formations occur naturally within DNA-computing. It has been known that the hairpin completion of a regular language is linear context-free, but not regular, in general. However, for some time it is was open whether the regularity of the hairpin completion of a regular language is is decidable. In 2009 this decidability problem has been solved positively by providing a polynomial time algorithm. In this paper we improve the complexity bound by showing that the decision problem is actually NL-complete. This complexity bound holds for both, the one-sided and the two-sided hairpin completions

    Two-Sided Derivatives for Regular Expressions and for Hairpin Expressions

    Full text link
    The aim of this paper is to design the polynomial construction of a finite recognizer for hairpin completions of regular languages. This is achieved by considering completions as new expression operators and by applying derivation techniques to the associated extended expressions called hairpin expressions. More precisely, we extend partial derivation of regular expressions to two-sided partial derivation of hairpin expressions and we show how to deduce a recognizer for a hairpin expression from its two-sided derived term automaton, providing an alternative proof of the fact that hairpin completions of regular languages are linear context-free.Comment: 28 page

    Formal models of the extension activity of DNA polymerase enzymes

    Get PDF
    The study of formal language operations inspired by enzymatic actions on DNA is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion-deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping concatenation, conditional concatenation, contextual intra- and intermolecular recombinations, as well as template-guided recombination. First, a formal language operation is proposed and investigated, inspired by the naturally occurring phenomenon of DNA primer extension by a DNA-template-directed DNA polymerase enzyme. Given two DNA strings u and v, where the shorter string v (called the primer) is Watson-Crick complementary and can thus bind to a substring of the longer string u (called the template) the result of the primer extension is a DNA string that is complementary to a suffix of the template which starts at the binding position of the primer. The operation of DNA primer extension can be abstracted as a binary operation on two formal languages: a template language L1 and a primer language L2. This language operation is called L1-directed extension of L2 and the closure properties of various language classes, including the classes in the Chomsky hierarchy, are studied under directed extension. Furthermore, the question of finding necessary and sufficient conditions for a given language of target strings to be generated from a given template language when the primer language is unknown is answered. The canonic inverse of directed extension is used in order to obtain the optimal solution (the minimal primer language) to this question. The second research project investigates properties of the binary string and language operation overlap assembly as defined by Csuhaj-Varju, Petre and Vaszil as a formal model of the linear self-assembly of DNA strands: The overlap assembly of two strings, xy and yz, which share an overlap y, results in the string xyz. In this context, we investigate overlap assembly and its properties: closure properties of various language families under this operation, and related decision problems. A theoretical analysis of the possible use of iterated overlap assembly to generate combinatorial DNA libraries is also given. The third research project continues the exploration of the properties of the overlap assembly operation by investigating closure properties of various language classes under iterated overlap assembly, and the decidability of the completeness of a language. The problem of deciding whether a given string is terminal with respect to a language, and the problem of deciding if a given language can be generated by an overlap assembly operation of two other given languages are also investigated

    Deciding Regularity of Hairpin Completions of Regular Languages in Polynomial Time

    Get PDF
    The hairpin completion is an operation on formal languages that has been inspired by the hairpin formation in DNA biochemistry and by DNA computing. In this paper we investigate the hairpin completion of regular languages. It is well known that hairpin completions of regular languages are linear context-free and not necessarily regular. As regularity of a (linear) context-free language is not decidable, the question arose whether regularity of a hairpin completion of regular languages is decidable. We prove that this problem is decidable and we provide a polynomial time algorithm. Furthermore, we prove that the hairpin completion of regular languages is an unambiguous linear context-free language and, as such, it has an effectively computable growth function. Moreover, we show that the growth of the hairpin completion is exponential if and only if the growth of the underlying languages is exponential and, in case the hairpin completion is regular, then the hairpin completion and the underlying languages have the same growth indicator

    Word Blending and Other Formal Models of Bio-operations

    Get PDF
    As part of ongoing efforts to view biological processes as computations, several formal models of DNA-based processes have been proposed and studied in the formal language literature. In this thesis, we survey some classical formal language word and language operations, as well as several bio-operations, and we propose a new operation inspired by a DNA recombination lab protocol known as Cross-pairing Polymerase Chain Reaction, or XPCR. More precisely, we define and study a word operation called word blending which models a special case of XPCR, where two words x w p and q w y sharing a non-empty overlap part w generate the word x w y. Properties of word blending that we study include closure properties of the Chomsky families of languages under this operation and its iterated version, existence of solution to equations involving this operation, and its state complexity

    Dagstuhl Reports : Volume 1, Issue 2, February 2011

    Get PDF
    Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061) : Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer Self-Repairing Programs (Dagstuhl Seminar 11062) : Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071) : Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081) : Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091) Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn

    DNA Computing: Modelling in Formal Languages and Combinatorics on Words, and Complexity Estimation

    Get PDF
    DNA computing, an essential area of unconventional computing research, encodes problems using DNA molecules and solves them using biological processes. This thesis contributes to the theoretical research in DNA computing by modelling biological processes as computations and by studying formal language and combinatorics on words concepts motivated by DNA processes. It also contributes to the experimental research in DNA computing by a scaling comparison between DNA computing and other models of computation. First, for theoretical DNA computing research, we propose a new word operation inspired by a DNA wet lab protocol called cross-pairing polymerase chain reaction (XPCR). We define and study a word operation called word blending that models and generalizes an unexpected outcome of XPCR. The input words are uwx and ywv that share a non-empty overlap w, and the output is the word uwv. Closure properties of the Chomsky families of languages under this operation and its iterated version, the existence of a solution to equations involving this operation, and its state complexity are studied. To follow the XPCR experimental requirement closely, a new word operation called conjugate word blending is defined, where the subwords x and y are required to be identical. Closure properties of the Chomsky families of languages under this operation and the XPCR experiments that motivate and implement it are presented. Second, we generalize the sequence of Fibonacci words inspired by biological concepts on DNA. The sequence of Fibonacci words is an infinite sequence of words obtained from two initial letters f(1) = a and f(2)= b, by the recursive definition f(n+2) = f(n+1)*f(n), for all positive integers n, where * denotes word concatenation. After we propose a unified terminology for different types of Fibonacci words and corresponding results in the extensive literature on the topic, we define and explore involutive Fibonacci words motivated by ideas stemming from theoretical studies of DNA computing. The relationship between different involutive Fibonacci words and their borderedness and primitivity are studied. Third, we analyze the practicability of DNA computing experiments since DNA computing and other unconventional computing methods that solve computationally challenging problems often have the limitation that the space of potential solutions grows exponentially with their sizes. For such problems, DNA computing algorithms may achieve a linear time complexity with an exponential space complexity as a trade-off. Using the subset sum problem as the benchmark problem, we present a scaling comparison of the DNA computing (DNA-C) approach with the network biocomputing (NB-C) and the electronic computing (E-C) approaches, where the volume, computing time, and energy required, relative to the input size, are compared. Our analysis shows that E-C uses a tiny volume compared to that required by DNA-C and NB-C, at the cost of the E-C computing time being outperformed first by DNA-C and then by NB-C. In addition, NB-C appears to be more energy efficient than DNA-C for some input sets, and E-C is always an order of magnitude less energy efficient than DNA-C

    Algorithmic methods for analysis of biochemical structures

    Get PDF
    Elektroniskā versija nesatur pielikumusDotās disertācijas ietvaros ir aprakstīti jauni ieguldījumi datorzinātnē, kas pavirza bioinformātiku dažus soļus tuvāk vispārejai proteīnu evolūcijas modeļa izveidošanai. VAdoties no pieņēmuma, ka proteīnu struktūras evolucionē pakāpeniski, kad katrā solī ir notikusi neliela struktūras izmaiņa, jauns algoritms ar nosaukumu ESSM tiek izstrādāts, lai salīdzinātu divas proteīnu struktūras un noteiktas evolūcijas izmaiņas, kas varētu starp viņiem notikt. Nākamais ieguldījums ir jauna metode, kas ļauj pētīt evolūcijas sakarības starp proteīnu struktūrām veselā proteīnu datubāzē. Dotā metode sastāv no divām daļām: proteīnu struktūru salīdzināšana, izmantojot ESSM algoritmu un proteīnu struktūru telpas grafu konstruēšana. Pēdējais ieguldījums ir IsoCleft algoritma realizācija kopā ar izstrādātiem piesaistīšanas centru modeļiem un lokālo līdzību novērtēšanas metodēm.This dissertation describe a number of novel contributions that move bioinformatics on a few staps toward creation of the general model of protein evolution. Under assumption that protein structures have evolved by a stepwise process, the ESSM algorithm has been developed for protein structure comperison and detection of evolutionary changes. The next contribution is a new method for the exploration of evolutionary relations between protein structures in the whole dataset of proteins. This method consists of two stages: all-agaist-all comparison of the protein structures by using the ESSM and construction of fold space hraph on the basis of discovered mutations. The last contribution is the implementation of IsoCleft algorithm for the detection of 3D atomic similarities by using created binding site models and similarity measurement scores
    corecore