2,230 research outputs found

    Splicing systems and the Chomsky hierarchy

    In this paper, we prove decidability properties and new results on the position of the family of languages generated by (circular) splicing systems within the Chomsky hierarchy. The two main results of the paper are the following. First, we show that it is decidable, given a circular splicing language and a regular language, whether they are equal. Second, we prove that the language generated by an alphabetic splicing system is context-free. Alphabetic splicing systems are a generalization of the simple and semi-simple splicing systems already considered in the literature.

    Splicing Systems from Past to Future: Old and New Challenges

    A splicing system is a formal model of the recombinant behaviour of sets of double-stranded DNA molecules when acted on by restriction enzymes and ligase. In this survey we concentrate on a specific behaviour of a type of splicing system introduced by Păun and subsequently developed by many researchers, in both the linear and the circular case of the splicing definition. In particular, we present recent results on this topic and show how they stimulate new, challenging investigations. Comment: Appeared in: Discrete Mathematics and Computer Science. Papers in Memoriam Alexandru Mateescu (1952-2005). The Publishing House of the Romanian Academy, 2014. arXiv admin note: text overlap with arXiv:1112.4897 by other authors

    Formal models of the extension activity of DNA polymerase enzymes

    The study of formal language operations inspired by enzymatic actions on DNA is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion-deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping concatenation, conditional concatenation, contextual intra- and intermolecular recombinations, as well as template-guided recombination. First, a formal language operation is proposed and investigated, inspired by the naturally occurring phenomenon of DNA primer extension by a DNA-template-directed DNA polymerase enzyme. Given two DNA strings u and v, where the shorter string v (called the primer) is Watson-Crick complementary to, and can thus bind to, a substring of the longer string u (called the template), the result of the primer extension is a DNA string that is complementary to a suffix of the template which starts at the binding position of the primer. The operation of DNA primer extension can be abstracted as a binary operation on two formal languages: a template language L1 and a primer language L2. This language operation is called L1-directed extension of L2, and the closure properties of various language classes, including the classes in the Chomsky hierarchy, are studied under directed extension. Furthermore, the question of finding necessary and sufficient conditions for a given language of target strings to be generated from a given template language when the primer language is unknown is answered. The canonic inverse of directed extension is used in order to obtain the optimal solution (the minimal primer language) to this question.
The second research project investigates properties of the binary string and language operation overlap assembly, as defined by Csuhaj-Varjú, Petre and Vaszil as a formal model of the linear self-assembly of DNA strands: the overlap assembly of two strings, xy and yz, which share an overlap y, results in the string xyz. In this context, we investigate overlap assembly and its properties: closure properties of various language families under this operation, and related decision problems. A theoretical analysis of the possible use of iterated overlap assembly to generate combinatorial DNA libraries is also given. The third research project continues the exploration of the properties of the overlap assembly operation by investigating closure properties of various language classes under iterated overlap assembly, and the decidability of the completeness of a language. The problem of deciding whether a given string is terminal with respect to a language, and the problem of deciding whether a given language can be generated by an overlap assembly operation of two other given languages, are also investigated.
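
As an illustration of the overlap assembly operation described in this abstract (xy and yz yield xyz), here is a minimal Python sketch. The function name `overlap_assembly` is hypothetical, and the requirement that the overlap y be non-empty is an assumption made for the sketch; the abstract itself does not state it.

```python
def overlap_assembly(u: str, v: str) -> set[str]:
    """Return every string xyz with u = xy and v = yz.

    Assumption (not stated in the abstract): the shared overlap y
    must be non-empty.
    """
    results = set()
    # Try every possible overlap length k, from 1 up to the shorter string.
    for k in range(1, min(len(u), len(v)) + 1):
        if u[-k:] == v[:k]:          # suffix of u equals prefix of v
            results.add(u + v[k:])   # x + y + z
    return results


if __name__ == "__main__":
    print(overlap_assembly("ACGT", "GTTA"))
    print(overlap_assembly("AA", "AA"))
```

Because two strings can overlap in more than one way, the sketch returns a set of results rather than a single string.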

    Linear splicing and syntactic monoid

    Splicing systems were introduced by Head in 1987 as a formal counterpart of a biological mechanism of DNA recombination under the action of restriction and ligase enzymes. Despite intensive studies of linear splicing systems, some elementary questions about their computational power are still open. In particular, in this paper we address the problem of characterizing the proper subclass of regular languages which are generated by finite (Păun) linear splicing systems. We introduce the class of marker languages L, i.e., regular languages of the form L = L_1 [x]_1 L_2, where L_1, L_2 are regular languages, [x] is a syntactic congruence class satisfying special conditions, and [x]_1 is either equal to [x] or equal to [x] ∪ {1}, 1 being the empty word. Using classical properties of formal language theory, we give an algorithm which allows us to decide whether a regular language is a marker language. Furthermore, for each marker language L we exhibit a finite Păun linear splicing system and we prove that this system generates L.

    A C++-embedded Domain-Specific Language for programming the MORA soft processor array

    MORA is a novel platform for high-level FPGA programming of streaming vector and matrix operations, aimed at multimedia applications. It consists of a soft array of pipelined low-complexity SIMD processors-in-memory (PIM). We present a Domain-Specific Language (DSL) for high-level programming of the MORA soft processor array. The DSL is embedded in C++, providing designers with a familiar language framework and the ability to compile designs using a standard compiler for functional testing before generating the FPGA bitstream using the MORA toolchain. The paper discusses the MORA-C++ DSL and the compilation route into the assembly for the MORA machine, and provides examples to illustrate the programming model and performance.

    Finite Models of Splicing and Their Complexity

    Over the last two decades, a tight collaboration has emerged between computer scientists, biochemists and molecular biologists, which has spurred research into an area known as DNA computing (also biomolecular computing). The work in this thesis belongs to this field and studies a computational model called the splicing system. Splicing is the formal model of the cutting and recombination of DNA molecules under the influence of restriction enzymes. This thesis presents original work in the field of splicing systems, which, as the title already indicates, can be roughly divided into two parts: 'Finite models of splicing' on the one hand and 'their complexity' on the other. The main contribution of the first part is that it challenges the general assumption that a finite, more realistic definition of splicing is necessarily weak from a computational point of view. We propose and study various alternative models and show that in most cases they have more computational power, often reaching computational completeness. The second part explores other territory. Splicing research has mainly focused on computational power, while complexity considerations have hardly been addressed. Here we introduce notions of time and space complexity for splicing systems. These definitions are used to characterize splicing complexity classes in terms of well-known Turing machine classes. Then, using a new accepting variant of splicing systems, we show that they can also be used as problem solvers. Finally, we study descriptional complexity. We define measures of descriptional complexity for splicing systems and show that, for representing regular languages, they have good properties with respect to finite automata, especially in the accepting variant.

    Word Blending and Other Formal Models of Bio-operations

    As part of ongoing efforts to view biological processes as computations, several formal models of DNA-based processes have been proposed and studied in the formal language literature. In this thesis, we survey some classical formal language word and language operations, as well as several bio-operations, and we propose a new operation inspired by a DNA recombination lab protocol known as Cross-pairing Polymerase Chain Reaction, or XPCR. More precisely, we define and study a word operation called word blending, which models a special case of XPCR, where two words x w p and q w y sharing a non-empty overlap part w generate the word x w y. Properties of word blending that we study include closure properties of the Chomsky families of languages under this operation and its iterated version, existence of solutions to equations involving this operation, and its state complexity.
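
The word blending operation from this abstract (x w p and q w y, sharing a non-empty w, generate x w y) can be sketched as a brute-force enumeration in Python. The function name `word_blend` is hypothetical, and the sketch is only an illustration of the definition as worded in the abstract, not the thesis's own construction.

```python
def word_blend(u: str, v: str) -> set[str]:
    """Return every word x + w + y such that u = x w p, v = q w y,
    and the shared part w is non-empty."""
    results = set()
    for i in range(len(u)):
        for j in range(i + 1, len(u) + 1):
            x, w = u[:i], u[i:j]        # u = x w p, with p = u[j:]
            start = v.find(w)
            while start != -1:          # every occurrence of w in v
                y = v[start + len(w):]  # v = q w y, with q = v[:start]
                results.add(x + w + y)
                start = v.find(w, start + 1)
    return results


if __name__ == "__main__":
    print(word_blend("abc", "dbe"))
```

In the example, the only common factor is w = "b", so x = "a" from the first word and y = "e" from the second are joined around it.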

    The Tandem Duplication Distance Is NP-Hard

    In computational biology, tandem duplication is an important biological phenomenon which can occur either at the genome or at the DNA level. A tandem duplication takes a copy of a genome segment and inserts it right after the segment; this can be represented as the string operation AXB → AXXB. Tandem exon duplications have been found in many species such as human, fly or worm, and have been widely studied in computational biology. The Tandem Duplication (TD) distance problem we investigate in this paper is defined as follows: given two strings S and T over the same alphabet, compute the smallest sequence of tandem duplications required to convert S to T. The natural question of whether the TD distance can be computed in polynomial time was posed in 2004 by Leupold et al. and had remained open, despite the fact that tandem duplications have received much attention ever since. In this paper, we prove that this problem is NP-hard, settling the 16-year-old open problem. We further show that this hardness holds even if all characters of S are distinct. This is known as the exemplar TD distance, which is of special relevance in bioinformatics. One of the tools we develop for the reduction is a new problem called the Cost-Effective Subgraph, for which we obtain W[1]-hardness results that might be of independent interest. We finally show that computing the exemplar TD distance between S and T is fixed-parameter tractable. Our results open the door to many other questions, and we conclude with several open problems.
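
To make the tandem duplication operation and the TD distance concrete, the following Python sketch enumerates single duplications and runs a breadth-first search for the fewest duplications turning S into T. The function names are hypothetical, and the exhaustive search is a swapped-in illustration that is only feasible for very short strings; it is not the paper's method, which concerns hardness and fixed-parameter tractability.

```python
from collections import deque


def tandem_duplications(s: str) -> set[str]:
    """All strings obtained from s by one duplication AXB -> AXXB,
    where X is a non-empty substring of s."""
    out = set()
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            out.add(s[:j] + s[i:j] + s[j:])  # copy X = s[i:j] right after itself
    return out


def td_distance(s: str, t: str):
    """Fewest tandem duplications converting s into t, or None if
    t is unreachable. Exhaustive BFS: exponential in general."""
    if s == t:
        return 0
    seen = {s}
    queue = deque([(s, 0)])
    while queue:
        cur, dist = queue.popleft()
        for nxt in tandem_duplications(cur):
            # Duplications only lengthen a string, so prune overshoots.
            if len(nxt) > len(t) or nxt in seen:
                continue
            if nxt == t:
                return dist + 1
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


if __name__ == "__main__":
    print(td_distance("ab", "aabb"))
```

Since every duplication strictly increases the length, the search space is finite for a fixed target T, which is what lets the sketch terminate.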

    GRAPHICAL USER INTERFACE FOR BOUNDED-ADDITION FUZZY SPLICING SYSTEMS AND THEIR VARIANTS

    A splicing system is one of the early theoretical proposals for a DNA-based computation device. The splicing operation starts when two DNA molecules are cut at specific subsequences in the presence of restriction enzymes: the first part of one molecule is then connected to the second part of the other molecule, or vice versa, to produce splicing languages. Fuzzy splicing with the bounded-addition operation has been introduced as a restriction in splicing systems to increase the generative power of the languages generated. In this research, a graphical user interface is developed to generate all the splicing languages generated by bounded-addition fuzzy splicing systems and their variants. An algorithm is developed using Java and Visual Studio Code in order to replace the time-consuming manual computation of the languages generated by bounded-addition fuzzy DNA splicing systems and their variants.

    Acta Cybernetica : Volume 12. Number 4.
