
    Polyominoes Simulating Arbitrary-Neighborhood Zippers and Tilings

    This paper provides a bridge between classical tiling theory and the complex-neighborhood self-assembly situations that exist in practice. The neighborhood of a position in the plane is the set of coordinates that are considered adjacent to it. This includes classical neighborhoods of size four, as well as arbitrarily complex neighborhoods. A generalized tile system consists of a set of tiles, a neighborhood, and a relation that dictates which tiles are "admissible" neighbors of a given tile. Thus, in correctly formed assemblies, tiles are assigned to positions of the plane in accordance with this relation. We prove that any validly tiled path defined in a given but arbitrary neighborhood (a zipper) can be simulated by a simple "ribbon" of microtiles. A ribbon is a special kind of polyomino, consisting of a non-self-crossing sequence of tiles on the plane, in which successive tiles stick along their adjacent edge. Finally, we extend this construction to the case of traditional tilings, proving that we can simulate arbitrary-neighborhood tilings by simple-neighborhood tilings while preserving some of their essential properties. Comment: Submitted to Theoretical Computer Science
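
The definition of a generalized tile system in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's formalism: encoding the neighborhood as a set of coordinate offsets and the relation as a set of `(tile, offset, neighbor_tile)` triples is an assumption made here for illustration, as are the names `VON_NEUMANN`, `KNIGHT` and `is_valid_assembly`.

```python
from typing import Dict, Set, Tuple

Pos = Tuple[int, int]
Offset = Tuple[int, int]
Rule = Tuple[str, Offset, str]  # (tile, offset, tile allowed at that offset)

# Classical size-four neighborhood, written as coordinate offsets.
VON_NEUMANN: Set[Offset] = {(1, 0), (-1, 0), (0, 1), (0, -1)}

# A hypothetical "complex" neighborhood: positions a knight's move apart.
KNIGHT: Set[Offset] = {(2, 1), (-2, -1)}


def is_valid_assembly(
    assembly: Dict[Pos, str],
    neighborhood: Set[Offset],
    admissible: Set[Rule],
) -> bool:
    """Check that every pair of neighboring placed tiles is admissible.

    Empty neighboring positions are ignored (assumption: partial
    assemblies are allowed).
    """
    for (x, y), tile in assembly.items():
        for dx, dy in neighborhood:
            other = assembly.get((x + dx, y + dy))
            if other is not None and (tile, (dx, dy), other) not in admissible:
                return False
    return True


# Tiles "a" and "b" must alternate along a horizontal row.
alternating: Set[Rule] = {
    ("a", (1, 0), "b"), ("b", (-1, 0), "a"),
    ("b", (1, 0), "a"), ("a", (-1, 0), "b"),
}
row_ok = {(0, 0): "a", (1, 0): "b", (2, 0): "a"}
row_bad = {(0, 0): "a", (1, 0): "a"}
```

With these definitions, `is_valid_assembly(row_ok, VON_NEUMANN, alternating)` holds while the same call on `row_bad` does not. Swapping in `KNIGHT` changes which positions count as adjacent without touching the checking logic, which is the sense in which the neighborhood is a parameter of the system.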

    QuASeR -- Quantum Accelerated De Novo DNA Sequence Reconstruction

    In this article, we present QuASeR, a reference-free DNA sequence reconstruction implementation via de novo assembly on both gate-based and quantum annealing platforms. Each of the four steps of the implementation (TSP, QUBO, Hamiltonians and QAOA) is explained with simple proof-of-concept examples, to address both the genomics research community and quantum application developers in a self-contained manner. The details of the implementation are discussed for the various layers of the quantum full-stack accelerator design. We also highlight the limitations of current classical simulation and available quantum hardware systems. The implementation is open-source and is available at https://github.com/prince-ph0en1x/QuASeR. Comment: 24 pages
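
The first two steps named in the abstract above (TSP, then QUBO) can be sketched in plain Python. This is an illustrative toy and not code from the QuASeR repository: the path (rather than cycle) formulation, the penalty weight, the three example reads, and every function name are assumptions, and an exhaustive search stands in for the annealer or QAOA step.

```python
from itertools import product


def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of a that is a prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_qubo(reads):
    """Encode 'order the reads to maximize total overlap' as a QUBO.

    Binary variable index i * n + t means 'read i sits at position t'.
    Constant energy offsets are dropped.
    """
    n = len(reads)
    w = [[overlap(a, b) if i != j else 0 for j, b in enumerate(reads)]
         for i, a in enumerate(reads)]
    # Deliberately loose penalty: larger than any achievable overlap gain.
    penalty = 1 + (n - 1) * sum(map(sum, w))
    Q = {}

    def add(u, v, c):
        key = (min(u, v), max(u, v))
        Q[key] = Q.get(key, 0) + c

    for i in range(n):
        for t in range(n):
            u = i * n + t
            add(u, u, -2 * penalty)              # linear part of both constraints
            for t2 in range(t + 1, n):
                add(u, i * n + t2, 2 * penalty)  # read i used at two positions
            for j in range(i + 1, n):
                add(u, j * n + t, 2 * penalty)   # two reads at one position
            if t + 1 < n:
                for j in range(n):
                    if j != i:
                        add(u, j * n + t + 1, -w[i][j])  # reward overlap i -> j
    return Q, w


def solve_brute_force(Q, num_vars):
    """Exhaustive minimization; only feasible for a handful of variables."""
    best_x, best_e = None, float("inf")
    for x in product((0, 1), repeat=num_vars):
        e = sum(c * x[u] * x[v] for (u, v), c in Q.items())
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


def decode(x, n):
    """Turn a feasible bit assignment into a list of read indices by position."""
    order = [None] * n
    for i in range(n):
        for t in range(n):
            if x[i * n + t]:
                order[t] = i
    return order


def merge(reads, order, w):
    """Concatenate reads in the given order, collapsing the overlaps."""
    seq = reads[order[0]]
    for a, b in zip(order, order[1:]):
        seq += reads[b][w[a][b]:]
    return seq


reads = ["ATGGC", "GGCTA", "CTAAT"]  # hypothetical toy reads
Q, w = build_qubo(reads)
x, _ = solve_brute_force(Q, len(reads) ** 2)
order = decode(x, len(reads))
assembled = merge(reads, order, w)
```

The number of binary variables grows as the square of the number of reads, so the exhaustive search above is only a correctness check for tiny inputs. That growth is one concrete way to see why simulation and hardware limits matter for this formulation.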

    Production and analysis of synthetic Cascade variants

    CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) is an adaptive immune system of Archaea and Bacteria. It targets and destroys foreign genetic material using ribonucleoprotein complexes consisting of CRISPR RNAs (crRNAs) and certain Cas proteins. CRISPR-Cas systems are classified into two major classes and multiple types, according to the Cas proteins involved. In type I systems, a ribonucleoprotein complex called Cascade (CRISPR-associated complex for antiviral defence) scans for invading viral DNA during a recurring infection and binds the sequence complementary to the incorporated crRNA. After target recognition, the nuclease/helicase Cas3 is recruited and subsequently destroys the viral DNA in a step termed interference. Multiple subtypes of type I exist that differ in the composition of Cascade. This work focuses on a minimal Cascade variant found in Shewanella putrefaciens CN-32. In comparison to the well-studied type I-E Cascade from Escherichia coli, this complex lacks two subunits usually required for target recognition, yet it is still able to provide immunity. Recombinant I-Fv Cascade was previously purified from E. coli, and it was possible to modify the complex by extending or shortening the backbone, resulting in synthetic variants with altered protein stoichiometry. In the present study, I-Fv Cascade was further analyzed by in vitro methods. Target DNA binding was observed, and the 3D structure revealed structural variations that replace the missing subunits, potentially to evade viral anti-CRISPR proteins. The nuclease/helicase of this system, Cas2/3fv, is a fusion of the Cas3 protein with the interference-unrelated protein Cas2. A standalone Cas3fv was purified without the Cas2 domain, and in vitro cleavage assays showed that Cas3fv degrades both free ssDNA and Cascade-bound substrates. The complete Cas2/3fv protein forms a complex with the protein Cas1 and shows reduced cleavage of free ssDNA, potentially as a regulatory mechanism against unspecific cleavage. Furthermore, we established a process termed "RNA wrapping". Synthetic Cascade assemblies can be created by directing the general RNA-binding ability of the characteristic Cas7fv backbone protein toward an RNA of choice, such as reporter gene transcripts. Specific complex formation can be initiated in vivo by including a repeat sequence from the crRNA upstream of a given target sequence and by binding of the Cas5fv protein. The resulting complexes contain the initial 100 nt of the tagged RNA, which can be isolated afterwards. While incorporated in complexes, the RNA is stabilized and protected from degradation by RNases. Complex formation can also be used to silence reporter gene transcripts. In addition, we provide initial indications that the backbone of synthetic complexes can be modified by fusion with reporter proteins.

    Machine learning for crystal identification and discovery

    As computers get faster, researchers -- not hardware or algorithms -- become the bottleneck in scientific discovery. Computational study of colloidal self-assembly is one area that is keenly affected: even after computers generate massive amounts of raw data, performing an exhaustive search to determine what (if any) ordered structures occur in a large parameter space of many simulations can be excruciating. We demonstrate how machine learning can be applied to discover interesting areas of parameter space in colloidal self-assembly. We create numerical fingerprints -- inspired by bond orientational order diagrams -- of structures found in self-assembly studies and use these descriptors both to find interesting regions in a phase diagram and to identify characteristic local environments in simulations in an automated manner, for simple and complex crystal structures. These methods allow analysis to keep up with the data generation ability of modern high-throughput computing environments. Comment: Fixed typo, added missing acknowledgment, added supplementary information

    Optimal Assembly for High Throughput Shotgun Sequencing

    We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier works, we design a de Bruijn graph based assembly algorithm that comes very close to achieving the lower bound for the repeat statistics of a wide range of sequenced genomes, including the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. The conditions can be viewed as the shotgun sequencing analogue of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by Hybridization. Comment: 26 pages, 18 figures
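
For readers unfamiliar with de Bruijn graph assembly, the following Python sketch shows the textbook construction the abstract above builds on. It is not the algorithm from the paper: it ignores k-mer multiplicity, read errors and repeat resolution, and the function names, the choice `k = 4`, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer in the reads adds one edge."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path or cycle exists."""
    out_deg = {u: len(vs) for u, vs in graph.items()}
    in_deg = defaultdict(int)
    for vs in graph.values():
        for v in vs:
            in_deg[v] += 1
    nodes = set(out_deg) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (u for u in sorted(nodes)
         if out_deg.get(u, 0) - in_deg.get(u, 0) == 1),
        min(nodes),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along an Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


toy_reads = ["ATGGCG", "GCGTGC", "GTGCA"]  # hypothetical error-free reads
genome = assemble(toy_reads, k=4)
```

In this toy input every 3-mer of the underlying sequence is unique, so the Eulerian path is unique and the reconstruction is unambiguous. Repeats of length k-1 or longer break that uniqueness, which gives an intuition for why read length and repeat statistics enter the conditions for complete reconstruction.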