150 research outputs found

    Active Self-Assembly of Simple Units Using an Insertion Primitive

    While computer science has given us a framework for determining the complexity and difficulty of solving computational problems, we do not yet have a theoretical framework for knowing what actions, behaviors, and life-like qualities can emerge from a given set of simple modular units. There has been much interest in developing models for programming active self-assembly processes in both the reconfigurable robotics community and the nanotechnology community. With respect to materials science and nanotechnology, the models proposed to date are either not yet implementable with our current understanding of synthetic chemistry or, where implementable, limited to a set of features that do not capture the power of active components. Prior implementable models of molecular assembly only considered the passive behaviors of attaching and detaching from a complex. Inspired by the algorithmic tile assembly model [Winfree, 1996] and the graph grammar assembly model [Klavins et al., 2004], we describe a formal model for studying the complexity of self-assembled structures with active molecular components. In particular, we add an insertion primitive and we show a direct mapping of our model to a molecular implementation using DNA. We show that the expressive power of this language is stronger than regular languages, but at most as strong as context-free grammars. Here, we explore the trade-off between the complexity of the system (in terms of the number of unit types), and the behavior of the system and speed of its assembly. We find that we can grow a line of any given length n in expected time O(log^3 n) using O(log^2 n) monomers. If we grow a line with k insertion rules, either the expected final length is infinite or the expected length at time t is at most (2t+2)^k^2, which is polynomial in t.

    Parallel Pairwise Operations on Data Stored in DNA: Sorting, Shifting, and Searching

    Prior research has introduced the Single-Instruction-Multiple-Data paradigm for DNA computing (SIMD DNA). It offers the potential for storing information and performing in-memory computations on DNA, with massive parallelism. This paper introduces three new SIMD DNA operations: sorting, shifting, and searching. Each is a fundamental operation in computer science. Our implementations demonstrate the effectiveness of parallel pairwise operations with this new paradigm.

    Formal Design and Analysis for DNA Implementations of Chemical Reaction Networks

    In molecular programming, the Chemical Reaction Network model is often used to describe systems of interacting molecules. This model can describe either real systems, allowing us to analyze and determine their computational function; or describe hypothetical systems, with known computational function but perhaps no known physical example. One significant breakthrough in the field is that any Chemical Reaction Network can be approximated by a system using DNA Strand Displacement mechanisms. This allows the Chemical Reaction Network model to be treated like a programming language, where programs can be written in the abstract and then compiled into physical molecules. Given a programming language and a proof-of-concept compiler, one would want to take the compiler from the proof-of-concept stage into a more reliable, more systematic, and better understood process. This thesis is made up of my contributions to that effort. First, given a programming language and a compiler, it would be useful to formally verify that the compiler is correct. My collaborators, Qing Dong and Erik Winfree, and I defined a Chemical Reaction Network-specific form of bisimulation equivalence, which can compare two such networks and verify that one is (or is not) a correct implementation of the other. For example, the compiler-produced DNA circuit can be verified as an implementation of its abstract program, although this is not the only possible use. After defining this concept of equivalence, we show that it can be checked by algorithm; although various parts of the problem are NP-complete or PSPACE-complete, we give algorithms that meet these lower bounds. We also prove a number of interesting properties of Chemical Reaction Network bisimulation equivalence, including transitivity and modularity properties which are particularly useful for stepwise checking of large systems. 
We further extend this bisimulation method to linear Polymer Reaction Networks, a strictly more powerful abstraction which has been occasionally used in molecular programming. Again we prove complexity hardness results, which in this case are as expected uncomputable in the general case; however, many practical systems can still be verified, and we give one such example. Finally, we use bisimulation to identify a class of single-locus networks that are practical to implement. Thus we show a method of verification which can simplify use of the above-mentioned compiler by proving general statements of correctness about its results. Second, given a programming language and a concept of compiling it, it would be useful to optimize the result of the compilation. One particular area of optimization is the number of DNA strands per prepared complex; some experiments suggest that systems with no more than 2 strands per complex are more robust. Lulu Qian and I developed some proposed DNA Strand Displacement schemes for general Chemical Reaction Network implementations with no more than 2 strands per complex, and a number of other desirable properties. Meanwhile, having been shown to be useful for many reasons, the mechanisms of DNA Strand Displacement have recently been formalized, abstracted, and analyzed. I show that this formalization, combined with the bisimulation methods above, can prove various statements about the limits of DNA Strand Displacement systems. For example, a set of desirable conditions including the 2-strand limit cannot be achieved by any general Chemical Reaction Network implementation scheme. I also observe that two of the new schemes we discovered, each meeting all but one condition of the impossible set, were found in the process of coming up with this proof. 
I thus argue that through formalization of DNA Strand Displacement we can have a more systematic method of finding and designing molecular programs, and of knowing when the programs we want do not exist.

    Fully-Functional Suffix Trees and Optimal Text Searching in BWT-runs Bounded Space

    Indexing highly repetitive texts, such as genomic databases, software repositories and versioned text collections, has become an important problem since the turn of the millennium. A relevant compressibility measure for repetitive texts is r, the number of runs in their Burrows-Wheeler Transforms (BWTs). One of the earliest indexes for repetitive collections, the Run-Length FM-index, used O(r) space and was able to efficiently count the number of occurrences of a pattern of length m in the text (in loglogarithmic time per pattern symbol, with current techniques). However, it was unable to locate the positions of those occurrences efficiently within a space bounded in terms of r. In this paper we close this long-standing problem, showing how to extend the Run-Length FM-index so that it can locate the occ occurrences efficiently within O(r) space (in loglogarithmic time each), and reaching optimal time, O(m + occ), within O(r log log_w(σ + n/r)) space, for a text of length n over an alphabet of size σ on a RAM machine with words of w = Ω(log n) bits. Within that space, our index can also count in optimal time, O(m). Multiplying the space by O(w / log σ), we support count and locate in O(⌈m log(σ)/w⌉) and O(⌈m log(σ)/w⌉ + occ) time, which is optimal in the packed setting and had not been obtained before in compressed space. We also describe a structure using O(r log(n/r)) space that replaces the text and extracts any text substring of length ℓ in almost-optimal time O(log(n/r) + ℓ log(σ)/w). Within that space, we similarly provide direct access to suffix array, inverse suffix array, and longest common prefix array cells, and extend these capabilities to full suffix tree functionality, typically in O(log(n/r)) time per operation. Comment: submitted version; optimal count and locate in smaller space: O(r log log_w(n/r + σ))
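
To make the measure r concrete, the following minimal sketch computes a BWT and counts its runs. It is not the index described in the abstract: it assumes a naive sorted-rotations construction (quadratic, for illustration only) and a `$` sentinel that does not occur in the text. The function names `bwt` and `count_runs` are hypothetical.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform via sorted rotations.

    Assumption: the sentinel is absent from the text and sorts before
    every other character. This naive construction is for illustration
    and is far slower than what a real index would use.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters in s (the measure r)."""
    if not s:
        return 0
    # A new run starts wherever two adjacent characters differ.
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed, count_runs(transformed))  # annb$aa 5
```

On highly repetitive inputs r tends to be much smaller than n, which is why space bounds stated in terms of r are attractive.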

    Modular design and analysis of synthetic biochemical networks


    Scaling up genetic circuit design for cellular computing: advances and prospects


     


    DNA Computing: Modelling in Formal Languages and Combinatorics on Words, and Complexity Estimation

    DNA computing, an essential area of unconventional computing research, encodes problems using DNA molecules and solves them using biological processes. This thesis contributes to the theoretical research in DNA computing by modelling biological processes as computations and by studying formal language and combinatorics on words concepts motivated by DNA processes. It also contributes to the experimental research in DNA computing by a scaling comparison between DNA computing and other models of computation. First, for theoretical DNA computing research, we propose a new word operation inspired by a DNA wet lab protocol called cross-pairing polymerase chain reaction (XPCR). We define and study a word operation called word blending that models and generalizes an unexpected outcome of XPCR. The input words are uwx and ywv that share a non-empty overlap w, and the output is the word uwv. Closure properties of the Chomsky families of languages under this operation and its iterated version, the existence of a solution to equations involving this operation, and its state complexity are studied. To follow the XPCR experimental requirement closely, a new word operation called conjugate word blending is defined, where the subwords x and y are required to be identical. Closure properties of the Chomsky families of languages under this operation and the XPCR experiments that motivate and implement it are presented. Second, we generalize the sequence of Fibonacci words inspired by biological concepts on DNA. The sequence of Fibonacci words is an infinite sequence of words obtained from two initial letters f(1) = a and f(2)= b, by the recursive definition f(n+2) = f(n+1)*f(n), for all positive integers n, where * denotes word concatenation. 
After we propose a unified terminology for different types of Fibonacci words and corresponding results in the extensive literature on the topic, we define and explore involutive Fibonacci words motivated by ideas stemming from theoretical studies of DNA computing. The relationship between different involutive Fibonacci words and their borderedness and primitivity are studied. Third, we analyze the practicability of DNA computing experiments since DNA computing and other unconventional computing methods that solve computationally challenging problems often have the limitation that the space of potential solutions grows exponentially with their sizes. For such problems, DNA computing algorithms may achieve a linear time complexity with an exponential space complexity as a trade-off. Using the subset sum problem as the benchmark problem, we present a scaling comparison of the DNA computing (DNA-C) approach with the network biocomputing (NB-C) and the electronic computing (E-C) approaches, where the volume, computing time, and energy required, relative to the input size, are compared. Our analysis shows that E-C uses a tiny volume compared to that required by DNA-C and NB-C, at the cost of the E-C computing time being outperformed first by DNA-C and then by NB-C. In addition, NB-C appears to be more energy efficient than DNA-C for some input sets, and E-C is always an order of magnitude less energy efficient than DNA-C.
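
The two word-level definitions stated in this abstract can be illustrated with a short sketch. It follows only what the abstract gives: the recursion f(1) = a, f(2) = b, f(n+2) = f(n+1)*f(n), and blending of uwx with ywv into uwv over a non-empty overlap w. The function names `fibonacci_word` and `blend` are hypothetical, and the sketch assumes that u, x, y and v may be empty, which the abstract does not specify.

```python
def fibonacci_word(n: int) -> str:
    """n-th Fibonacci word with f(1) = "a", f(2) = "b", f(n+2) = f(n+1) + f(n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    prev, curr = "a", "b"  # f(1), f(2)
    if n == 1:
        return prev
    for _ in range(n - 2):
        # Concatenate the newer word in front of the older one.
        prev, curr = curr, curr + prev
    return curr


def blend(alpha: str, beta: str) -> set[str]:
    """All words u+w+v such that alpha = u+w+x and beta = y+w+v with w non-empty.

    Assumption (not stated in the abstract): u, x, y, v may be empty.
    """
    results = set()
    for i in range(len(alpha)):
        for j in range(i + 1, len(alpha) + 1):
            w = alpha[i:j]                # candidate overlap, never empty
            k = beta.find(w)
            while k != -1:                # every occurrence of w in beta
                # Keep alpha up to the end of w, then the rest of beta after w.
                results.add(alpha[:j] + beta[k + len(w):])
                k = beta.find(w, k + 1)
    return results


if __name__ == "__main__":
    print([fibonacci_word(n) for n in range(1, 6)])  # ['a', 'b', 'ba', 'bab', 'babba']
    print(blend("ab", "bc"))                         # {'abc'}
```

Brute-force enumeration of overlaps is enough for small examples; it says nothing about the closure or state-complexity results the thesis studies.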