211 research outputs found

    Reducibility of Gene Patterns in Ciliates using the Breakpoint Graph

    Gene assembly in ciliates is one of the most involved forms of DNA processing found in any organism. This process transforms one nucleus (the micronucleus) into another, functionally different nucleus (the macronucleus). We continue the development of the theoretical models of gene assembly, and in particular we demonstrate the use of the breakpoint graph, a concept known from another branch of DNA transformation research. More specifically: (1) we characterize the intermediate gene patterns that can occur during the transformation of a given micronuclear gene pattern to its macronuclear form; (2) we determine the number of applications of the loop recombination operation (the most basic of the three molecular operations that accomplish gene assembly) needed in this transformation; (3) we generalize previous results (and give elegant alternatives for some proofs) concerning characterizations of the micronuclear gene patterns that can be assembled using a specific subset of the three molecular operations. Comment: 30 pages, 13 figures

    Strategies of Loop Recombination in Ciliates

    Gene assembly in ciliates is an extremely involved DNA transformation process that transforms one nucleus, the micronucleus, into another, functionally different nucleus, the macronucleus. In this paper we characterize which loop recombination operations (one of the three types of molecular operations that accomplish gene assembly) can be applied in the transformation of a given gene from its micronuclear form to its macronuclear form. We also characterize the order in which these loop recombination operations are applicable. This is done in the abstract and more general setting of so-called legal strings. Comment: 22 pages, 14 figures

    Maximal Pivots on Graphs with an Application to Gene Assembly

    We consider the principal pivot transform (pivot) on graphs. We define a natural variant of this operation, called dual pivot, and show that both the kernel and the set of maximally applicable pivots of a graph are invariant under this operation. The result is motivated by and applicable to the theory of gene assembly in ciliates. Comment: modest revision (including different LaTeX style) w.r.t. v2, 16 pages, 5 figures

    Models of natural computation : gene assembly and membrane systems

    This thesis is concerned with two research areas in natural computing: the computational nature of gene assembly and membrane computing. Gene assembly is a process occurring in unicellular organisms called ciliates. During this process genes are transformed through cut-and-paste operations. We study this process from a theoretical point of view. More specifically, we relate the theory of gene assembly to sorting by reversal, which is another well-known theory of DNA transformation. In this way we obtain a novel graph-theoretical representation that provides new insights into the nature of gene assembly. Membrane computing is a computational model inspired by the functioning of membranes in cells. Membrane systems compute in a parallel fashion by moving objects, through membranes, between compartments. We study the computational power of various classes of membrane systems, and also relate them to other well-known models of computation. Netherlands Organisation for Scientific Research (NWO), Institute for Programming research and Algorithmics (IPA)

    The Pathway to Detangle a Scrambled Gene

    Programmed DNA elimination and reorganization frequently occur during cellular differentiation. Development of the somatic macronucleus in some ciliates presents an extreme case, involving excision of internal eliminated sequences (IESs) that interrupt coding DNA segments (macronuclear destined sequences, MDSs), as well as removal of transposon-like elements and extensive genome fragmentation, leading to 98% genome reduction in Stylonychia lemnae. Approximately 20-30% of the genes are estimated to be scrambled in the germline micronucleus, with coding segment order permuted and present in either orientation on micronuclear chromosomes. Massive genome rearrangements are therefore critical for development. To understand the process of DNA deletion and reorganization during macronuclear development, we examined the population of DNA molecules during assembly of different scrambled genes in two related organisms in a developmental time-course by PCR. The data suggest that removal of conventional IESs usually occurs first, accompanied by a surprising level of error at this step. The complex events of inversion and translocation seem to occur after repair and excision of all conventional IESs and via multiple pathways. This study reveals a temporal order of DNA rearrangements during the processing of a scrambled gene, with simpler events usually preceding more complex ones. The surprising observation of a hidden layer of errors, absent from the mature macronucleus but present during development, also underscores the need for repair or screening of incorrectly assembled DNA molecules.

    Two Refinements of the Template-Guided DNA Recombination Model of Ciliate Computing

    To solve the mystery of the intricate gene unscrambling mechanism in ciliates, various theoretical models for this process have been proposed from the point of view of computation. The two main models are the reversible guided recombination system by Kari and Landweber and the template-guided recombination (TGR) system by Prescott, Ehrenfeucht and Rozenberg, based on two categories of DNA recombination: pointer-guided and template-directed recombination, respectively. The latter model has been generalized by Daley and McQuillan. In this thesis, we propose a new approach to generating regular languages using the iterated TGR system with a finite initial language and a finite set of templates, which reduces the size of the template language and the alphabet compared to those of the Daley-McQuillan model. To achieve computational completeness using only finite components, we also propose an extension of the contextual template-guided recombination system (CTGR system) by Daley and McQuillan, adding an extra control, called permitting contexts, on the usage of templates. We then prove that our proposed system, the CTGR system using permitting contexts, can characterize the family of recursively enumerable languages using a finite initial language and a finite set of templates. Lastly, we present a comparison and analysis of the computational power of the reversible guided recombination system and the TGR system. Keywords: ciliates, gene unscrambling, in vivo computing, DNA computing, cellular computing, reversible guided recombination, template-guided recombination

    Formal models of the extension activity of DNA polymerase enzymes

    The study of formal language operations inspired by enzymatic actions on DNA is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion-deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping concatenation, conditional concatenation, contextual intra- and intermolecular recombinations, as well as template-guided recombination. First, a formal language operation is proposed and investigated, inspired by the naturally occurring phenomenon of DNA primer extension by a DNA-template-directed DNA polymerase enzyme. Given two DNA strings u and v, where the shorter string v (called the primer) is Watson-Crick complementary to, and can thus bind to, a substring of the longer string u (called the template), the result of the primer extension is a DNA string that is complementary to a suffix of the template which starts at the binding position of the primer. The operation of DNA primer extension can be abstracted as a binary operation on two formal languages: a template language L1 and a primer language L2. This language operation is called L1-directed extension of L2, and the closure properties of various language classes, including the classes in the Chomsky hierarchy, are studied under directed extension. Furthermore, the question of finding necessary and sufficient conditions for a given language of target strings to be generated from a given template language when the primer language is unknown is answered. The canonic inverse of directed extension is used in order to obtain the optimal solution (the minimal primer language) to this question. The second research project investigates properties of the binary string and language operation overlap assembly, as defined by Csuhaj-Varju, Petre and Vaszil as a formal model of the linear self-assembly of DNA strands: the overlap assembly of two strings, xy and yz, which share an overlap y, results in the string xyz. In this context, we investigate overlap assembly and its properties: closure properties of various language families under this operation, and related decision problems. A theoretical analysis of the possible use of iterated overlap assembly to generate combinatorial DNA libraries is also given. The third research project continues the exploration of the properties of the overlap assembly operation by investigating closure properties of various language classes under iterated overlap assembly, and the decidability of the completeness of a language. The problem of deciding whether a given string is terminal with respect to a language, and the problem of deciding if a given language can be generated by an overlap assembly operation of two other given languages, are also investigated.
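The overlap assembly operation described in the abstract above (strings xy and yz sharing an overlap y yield xyz) can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `overlap_assembly` is hypothetical, and the sketch assumes the overlap y must be non-empty and that every possible overlap contributes one result, so the operation returns a set of strings.

```python
def overlap_assembly(u: str, v: str) -> set[str]:
    """Return all strings xyz such that u = xy and v = yz.

    Assumption (not stated in the abstract): the shared overlap y
    must be non-empty, while x and z may be empty.
    """
    results = set()
    # Try every overlap length k, from 1 up to the shorter string's length.
    for k in range(1, min(len(u), len(v)) + 1):
        # The last k symbols of u must equal the first k symbols of v.
        if u[-k:] == v[:k]:
            # Keep u whole and append the part of v after the overlap.
            results.add(u + v[k:])
    return results


if __name__ == "__main__":
    print(overlap_assembly("ACGT", "GTTA"))
    print(overlap_assembly("AA", "AA"))
    print(overlap_assembly("AC", "GT"))
```

Two strings can overlap in more than one way (for example "AA" with "AA"), which is why the sketch collects results in a set rather than returning a single string.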

    Doctor of Philosophy

    Dissertation. Genotype Phenotype Association (GPA) is a means to identify candidate genes and genetic variants that may contribute to phenotypic variation. Technological advances in DNA sequencing continue to improve the efficiency and accuracy of GPA. Currently, High Throughput Sequencing (HTS) is the preferred method for GPA as it is fast and economical. HTS allows for the population-level characterization of genetic variation that GPA studies require. Despite the potential power of using HTS in GPA studies, there are technical hurdles that must be overcome. For instance, the excessive error rate in HTS data and the sheer size of population-level data can hinder GPA studies. To overcome these challenges, I have written two software toolkits for HTS GPA. The first toolkit, GPAT++, is designed to detect GPA using small genetic variants. Unlike previous software, GPAT++'s association test models the inherent errors in HTS, preventing many spurious GPA. The second toolkit, Whole Genome Alignment Metrics (WHAM), was designed for GPA using large genetic variants (structural variants). By integrating both structural variant identification and association testing, WHAM can identify shared structural variants associated with a phenotype. Both GPAT++ and WHAM have been successfully applied to real-world GPA studies.