57,128 research outputs found
Experimental Progress in Computation by Self-Assembly of DNA Tilings
Approaches to DNA-based computing by self-assembly require the
use of D. T A nanostructures, called tiles, that have efficient chemistries, expressive
computational power: and convenient input and output (I/O) mechanisms.
We have designed two new classes of DNA tiles: TAO and TAE, both
of which contain three double-helices linked by strand exchange. Structural
analysis of a TAO molecule has shown that the molecule assembles efficiently
from its four component strands. Here we demonstrate a novel method for
I/O whereby multiple tiles assemble around a single-stranded (input) scaffold
strand. Computation by tiling theoretically results in the formation of structures
that contain single-stranded (output) reported strands, which can then
be isolated for subsequent steps of computation if necessary. We illustrate the
advantages of TAO and TAE designs by detailing two examples of massively
parallel arithmetic: construction of complete XOR and addition tables by linear
assemblies of DNA tiles. The three helix structures provide flexibility for
topological routing of strands in the computation: allowing the implementation
of string tile models
Boosting Haplotype Inference with Local Search
Abstract. A very challenging problem in the genetics domain is to infer haplotypes from genotypes. This process is expected to identify genes affecting health, disease and response to drugs. One of the approaches to haplotype inference aims to minimise the number of different haplotypes used, and is known as haplotype inference by pure parsimony (HIPP). The HIPP problem is computationally difficult, being NP-hard. Recently, a SAT-based method (SHIPs) has been proposed to solve the HIPP problem. This method iteratively considers an increasing number of haplotypes, starting from an initial lower bound. Hence, one important aspect of SHIPs is the lower bounding procedure, which reduces the number of iterations of the basic algorithm, and also indirectly simplifies the resulting SAT model. This paper describes the use of local search to improve existing lower bounding procedures. The new lower bounding procedure is guaranteed to be as tight as the existing procedures. In practice the new procedure is in most cases considerably tighter, allowing significant improvement of performance on challenging problem instances.
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
ΟBUSS:a parallel BEAST/BEAGLE utility for sequence simulation under complex evolutionary scenarios
Background: Simulated nucleotide or amino acid sequences are frequently used
to assess the performance of phylogenetic reconstruction methods. BEAST, a
Bayesian statistical framework that focuses on reconstructing time-calibrated
molecular evolutionary processes, supports a wide array of evolutionary models,
but lacked matching machinery for simulation of character evolution along
phylogenies.
Results: We present a flexible Monte Carlo simulation tool, called piBUSS,
that employs the BEAGLE high performance library for phylogenetic computations
within BEAST to rapidly generate large sequence alignments under complex
evolutionary models. piBUSS sports a user-friendly graphical user interface
(GUI) that allows combining a rich array of models across an arbitrary number
of partitions. A command-line interface mirrors the options available through
the GUI and facilitates scripting in large-scale simulation studies. Analogous
to BEAST model and analysis setup, more advanced simulation options are
supported through an extensible markup language (XML) specification, which in
addition to generating sequence output, also allows users to combine simulation
and analysis in a single BEAST run.
Conclusions: piBUSS offers a unique combination of flexibility and
ease-of-use for sequence simulation under realistic evolutionary scenarios.
Through different interfaces, piBUSS supports simulation studies ranging from
modest endeavors for illustrative purposes to complex and large-scale
assessments of evolutionary inference procedures. The software aims at
implementing new models and data types that are continuously being developed as
part of BEAST/BEAGLE.Comment: 13 pages, 2 figures, 1 tabl
SOAP3-dp: Fast, Accurate and Sensitive GPU-based Short Read Aligner
To tackle the exponentially increasing throughput of Next-Generation
Sequencing (NGS), most of the existing short-read aligners can be configured to
favor speed in trade of accuracy and sensitivity. SOAP3-dp, through leveraging
the computational power of both CPU and GPU with optimized algorithms, delivers
high speed and sensitivity simultaneously. Compared with widely adopted
aligners including BWA, Bowtie2, SeqAlto, GEM and GPU-based aligners including
BarraCUDA and CUSHAW, SOAP3-dp is two to tens of times faster, while
maintaining the highest sensitivity and lowest false discovery rate (FDR) on
Illumina reads with different lengths. Transcending its predecessor SOAP3,
which does not allow gapped alignment, SOAP3-dp by default tolerates alignment
similarity as low as 60 percent. Real data evaluation using human genome
demonstrates SOAP3-dp's power to enable more authentic variants and longer
Indels to be discovered. Fosmid sequencing shows a 9.1 percent FDR on newly
discovered deletions. SOAP3-dp natively supports BAM file format and provides a
scoring scheme same as BWA, which enables it to be integrated into existing
analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and
Tianhe-1A.Comment: 21 pages, 6 figures, submitted to PLoS ONE, additional files
available at "https://www.dropbox.com/sh/bhclhxpoiubh371/O5CO_CkXQE".
Comments most welcom
- β¦