Polyominoes Simulating Arbitrary-Neighborhood Zippers and Tilings
This paper provides a bridge between the classical tiling theory and the
complex neighborhood self-assembling situations that exist in practice. The
neighborhood of a position in the plane is the set of coordinates which are
considered adjacent to it. This includes classical neighborhoods of size four,
as well as arbitrarily complex neighborhoods. A generalized tile system
consists of a set of tiles, a neighborhood, and a relation which dictates which
are the "admissible" neighboring tiles of a given tile. Thus, in correctly
formed assemblies, tiles are assigned positions of the plane in accordance with
this relation. We prove that any validly tiled path defined in a given but
arbitrary neighborhood (a zipper) can be simulated by a simple "ribbon" of
microtiles. A ribbon is a special kind of polyomino, consisting of a
non-self-crossing sequence of tiles on the plane, in which successive tiles
stick along their adjacent edge. Finally, we extend this construction to the
case of traditional tilings, proving that we can simulate
arbitrary-neighborhood tilings by simple-neighborhood tilings, while preserving
some of their essential properties.
Comment: Submitted to Theoretical Computer Science
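The geometric part of the ribbon definition above can be illustrated with a minimal sketch. It assumes tiles are unit squares identified by integer grid coordinates and reads "non-self-crossing" as "no position is visited twice"; the function name `is_ribbon` is hypothetical, and the glue (sticking) condition on shared edges is deliberately left out.

```python
def is_ribbon(cells):
    """Check the geometric ribbon conditions for a sequence of grid cells.

    Assumption: each cell is an (x, y) pair of integers, and the glue
    condition on shared edges is not modelled here.
    """
    # Non-self-crossing: no grid position may appear twice in the sequence.
    if len(set(cells)) != len(cells):
        return False
    # Successive tiles must share an edge, i.e. differ by one unit step.
    for (x0, y0), (x1, y1) in zip(cells, cells[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            return False
    return True
```

For example, `is_ribbon([(0, 0), (1, 0), (1, 1)])` holds, while a diagonal step or a revisited cell makes the check fail.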
QuASeR -- Quantum Accelerated De Novo DNA Sequence Reconstruction
In this article, we present QuASeR, a reference-free DNA sequence
reconstruction implementation via de novo assembly on both gate-based and
quantum annealing platforms. Each one of the four steps of the implementation
(TSP, QUBO, Hamiltonians and QAOA) is explained with simple proof-of-concept
examples to target both the genomics research community and quantum application
developers in a self-contained manner. The details of the implementation are
discussed for the various layers of the quantum full-stack accelerator design.
We also highlight the limitations of current classical simulation and available
quantum hardware systems. The implementation is open-source and can be found on
https://github.com/prince-ph0en1x/QuASeR.
Comment: 24 pages
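The TSP view of de novo assembly mentioned above can be sketched classically: reads are treated as cities, suffix-prefix overlaps as edge weights, and the best ordering is the one with the largest total overlap. The code below is a brute-force stand-in for illustration only, not QuASeR's QUBO or QAOA formulation; the function names `overlap` and `assemble` are hypothetical, and it searches for an open path rather than a closed tour.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(reads):
    """Brute-force the read ordering with maximal total overlap and merge it.

    Assumption: the read set is tiny, since all orderings are enumerated.
    """
    best_order, best_score = None, -1
    for order in permutations(reads):
        score = sum(overlap(x, y) for x, y in zip(order, order[1:]))
        if score > best_score:
            best_order, best_score = order, score
    # Merge consecutive reads, dropping the overlapping prefix each time.
    sequence = best_order[0]
    for prev, nxt in zip(best_order, best_order[1:]):
        sequence += nxt[overlap(prev, nxt):]
    return sequence
```

The factorial cost of this enumeration is the reason such orderings are usually posed as optimization problems in the first place.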
Production and analysis of synthetic Cascade variants
CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR associated) is an
adaptive immune system of Archaea and Bacteria. It is able to target and destroy foreign genetic
material with ribonucleoprotein complexes consisting of CRISPR RNAs (crRNAs) and certain Cas proteins.
CRISPR-Cas systems are classified in two major classes and multiple types, according to the involved Cas
proteins. In type I systems, a ribonucleoprotein complex called Cascade (CRISPR associated complex for
antiviral defence) scans for invading viral DNA during a recurring infection and binds the sequence
complementary to the incorporated crRNA. After target recognition, the nuclease/helicase Cas3 is
recruited and subsequently destroys the viral DNA in a step termed interference.
Multiple subtypes of type I exist that show differences in the Cascade composition. This work focuses on
a minimal Cascade variant found in Shewanella putrefaciens CN-32. In comparison to the well-studied
type I-E Cascade from Escherichia coli, this complex is missing two proteins usually required for target
recognition, yet it is still able to provide immunity. Recombinant I-Fv Cascade was previously purified
from E. coli and it was possible to modulate the complex by extending or shortening the backbone,
resulting in synthetic variants with altered protein stoichiometry.
In the present study, I-Fv Cascade was further analyzed by in vitro methods. Target binding was
observed and the 3D structure revealed structural variations that replace the missing subunits,
potentially to evade viral anti-CRISPR proteins. The nuclease/helicase of this system, Cas2/3fv, is a fusion
of the Cas3 protein with the interference-unrelated protein Cas2. A standalone Cas3fv was purified
without the Cas2 domain and in vitro cleavage assays showed that Cas3fv degrades both free ssDNA and
Cascade-bound substrates. The complete Cas2/3fv protein forms a complex with the protein
Cas1 and showed reduced cleavage of free ssDNA, potentially as a regulatory mechanism against
unspecific cleavage.
Furthermore, we established a process termed “RNA wrapping”. Synthetic Cascade assemblies can be
created by directing the general RNA-binding ability of the characteristic Cas7fv backbone protein toward an
RNA of choice such as reporter gene transcripts. Specific complex formation can be initiated in vivo by
including a repeat sequence from the crRNA upstream of a given target sequence and binding of the
Cas5fv protein. The created complexes contain the initial 100 nt of the tagged RNA which can be
isolated afterwards. While incorporated in complexes, RNA is stabilized and protected from degradation
by RNases. Complex formation can be used to silence reporter gene transcripts. Furthermore, we
provided initial indications that the backbone of synthetic complexes can be modified by addition of
reporter proteins.
Machine learning for crystal identification and discovery
As computers get faster, researchers -- not hardware or algorithms -- become
the bottleneck in scientific discovery. Computational study of colloidal
self-assembly is one area that is keenly affected: even after computers
generate massive amounts of raw data, performing an exhaustive search to
determine what (if any) ordered structures occur in a large parameter space of
many simulations can be excruciating. We demonstrate how machine learning can
be applied to discover interesting areas of parameter space in colloidal
self-assembly. We create numerical fingerprints -- inspired by bond orientational
order diagrams -- of structures found in self-assembly studies and use these
descriptors to both find interesting regions in a phase diagram and identify
characteristic local environments in simulations in an automated manner for
simple and complex crystal structures. Utilizing these methods allows analysis
methods to keep up with the data generation ability of modern high-throughput
computing environments.
Comment: Fixed typo, added missing acknowledgment, added supplementary information
Optimal Assembly for High Throughput Shotgun Sequencing
We present a framework for the design of optimal assembly algorithms for
shotgun sequencing under the criterion of complete reconstruction. We derive a
lower bound on the read length and the coverage depth required for
reconstruction in terms of the repeat statistics of the genome. Building on
earlier works, we design a de Bruijn graph-based assembly algorithm which can
achieve very close to the lower bound for repeat statistics of a wide range of
sequenced genomes, including the GAGE datasets. The results are based on a set
of necessary and sufficient conditions on the DNA sequence and the reads for
reconstruction. The conditions can be viewed as the shotgun sequencing analogue
of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by
Hybridization.
Comment: 26 pages, 18 figures
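For readers unfamiliar with the data structure this abstract builds on, here is a minimal sketch of de Bruijn graph construction from reads. It shows only the textbook construction, with (k-1)-mers as nodes and k-mers as edges, and not the paper's assembly algorithm or its optimality conditions; the function name `de_bruijn` and the choice of `k` are assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Returns a dict mapping each (k-1)-mer to the set of (k-1)-mers that
    follow it in at least one read.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)
```

Calling `de_bruijn(["ACGT"], 3)` yields the edges AC to CG and CG to GT. Repeats longer than k-1 make distinct genome positions collapse onto the same node, which is where repeat statistics constrain reconstruction.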