Computational Strategies for Scalable Genomics Analysis.
The revolution in next-generation DNA sequencing technologies is leading to explosive data growth in genomics, posing a significant challenge to the computing infrastructure and software algorithms for genomics analysis. Various big data technologies have been explored to scale up/out current bioinformatics solutions to mine the big genomics data. In this review, we survey some of these exciting developments in the applications of parallel distributed computing and special hardware to genomics. We comment on the pros and cons of each strategy in the context of ease of development, robustness, scalability, and efficiency. Although this review is written for an audience from the genomics and bioinformatics fields, it may also be informative for a computer science audience with interests in genomics applications.
Counting absolute number of molecules using unique molecular identifiers
Advances in molecular biology have made it easy to identify different DNA or RNA species and to copy them. Identification of nucleic acid species can be accomplished by reading the DNA sequence; currently, millions of molecules can be sequenced in a single day using massively parallel sequencing. Efficient copying of DNA molecules of arbitrary sequence was made possible by molecular cloning and the polymerase chain reaction. Differences in the relative abundance of a large number of different sequences between two or more samples can in turn be measured using microarray hybridization and/or tag sequencing. However, determining the relative abundance of two different species and/or the absolute number of molecules present in a single sample has proven much more challenging. This is because it is hard to detect individual molecules without copying them, and even harder to make a defined number of copies of molecules. We show here that this limitation can be overcome by using unique molecular identifiers (UMIs), which make each molecule in the sample distinct.
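The counting idea in this abstract reduces to a simple rule: reads that share the same (species, UMI) pair are amplification copies of one original molecule, so the absolute molecule count is the number of distinct UMIs per species. A minimal sketch, assuming reads arrive as (species, UMI) pairs; the function name and toy data are illustrative, not from the paper:

```python
from collections import defaultdict

def count_molecules(reads):
    """Count absolute molecule numbers per species by collapsing
    PCR duplicates: reads sharing a (species, UMI) pair are taken
    to derive from the same original molecule."""
    umis_per_species = defaultdict(set)
    for species, umi in reads:
        umis_per_species[species].add(umi)
    return {species: len(umis) for species, umis in umis_per_species.items()}

# Toy data: geneA was amplified unevenly, but only two distinct
# UMIs were attached before amplification, so its true count is 2.
reads = [("geneA", "ACGT"), ("geneA", "ACGT"), ("geneA", "TTGC"),
         ("geneB", "GGAA")]
print(count_molecules(reads))  # {'geneA': 2, 'geneB': 1}
```

Real pipelines additionally tolerate sequencing errors in the UMI itself (e.g. by merging UMIs within one edit distance), which this sketch omits.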
Large Eddy Simulations of gaseous flames in gas turbine combustion chambers
Recent developments in numerical schemes and turbulent combustion models, together with the regular increase of computing power, allow Large Eddy Simulation (LES) to be applied to real industrial burners. In this paper, two types of LES in complex-geometry combustors of specific interest for aeronautical gas turbine burners are reviewed: (1) laboratory-scale combustors, without compressor or turbine, in which advanced measurements are possible, and (2) combustion chambers of existing engines operated in realistic operating conditions. Laboratory-scale burners are designed to assess modeling and fundamental flow aspects in controlled configurations. They are necessary to gauge LES strategies and identify potential limitations. In specific circumstances, they even offer near model-free or DNS-like LES computations. LES in real engines illustrate the potential of the approach in the context of industrial burners but are more difficult to validate due to the limited set of available measurements. Usual approaches for turbulence and combustion sub-grid models, including chemistry modeling, are first recalled. Limiting cases and the range of validity of the models are specifically recalled before a discussion on the numerical breakthroughs which have allowed LES to be applied to these complex cases. Specific issues linked to real gas turbine chambers are discussed: multi-perforation, complex acoustic impedances at inlet and outlet, annular chambers, etc. Examples are provided for mean flow predictions (velocity, temperature and species) as well as unsteady mechanisms (quenching, ignition, combustion instabilities). Finally, potential perspectives are proposed to further improve the use of LES for real gas turbine combustor designs.
ToPoliNano: Nanoarchitectures Design Made Real
Many facts about emerging nanotechnologies are yet to be assessed. There are still major concerns, for instance, about the maximum achievable device density, or about which architecture is best fit for a specific application. Growing complexity requires taking into account many aspects of technology, application and architecture at the same time. Researchers face problems that are not new per se, but are now subject to very different constraints that need to be captured by design tools. Among the emerging nanotechnologies, two-dimensional nanowire-based arrays represent promising nanostructures, especially for massively parallel computing architectures. Few attempts have been made to enable the exploration of architectural solutions by deriving information from extensive and reliable nanoarray characterization. Moreover, in the nanotechnology arena there is still no clear winner, so it is important to be able to target different technologies, so as not to miss the next big thing. We present a tool, ToPoliNano, that enables such a multi-technological characterization in terms of logic behavior, power and timing performance, area and layout constraints, on the basis of specific technological and topological descriptions. This tool can aid the design process, besides providing a comprehensive simulation framework for DC and timing simulations and detailed power analysis. Design and simulation results are shown for nanoarray-based circuits. ToPoliNano is the first real design tool that tackles the top-down design of a circuit based on emerging technologies.
Accurate Profiling of Microbial Communities from Massively Parallel Sequencing using Convex Optimization
We describe the Microbial Community Reconstruction (MCR) Problem, which is fundamental for microbiome analysis. In this problem, the goal is to reconstruct the identity and frequency of species comprising a microbial community, using short sequence reads from Massively Parallel Sequencing (MPS) data obtained for specified genomic regions. We formulate the problem mathematically as a convex optimization problem and provide sufficient conditions for identifiability, namely the ability to reconstruct species identity and frequency correctly when the data size (number of reads) grows to infinity. We discuss different metrics for assessing the quality of the reconstructed solution, including a novel phylogenetically-aware metric based on the Mahalanobis distance, and give upper bounds on the reconstruction error for a finite number of reads under different metrics. We propose a scalable divide-and-conquer algorithm for the problem using convex optimization, which enables us to handle large problems (with species). We show using numerical simulations that for realistic scenarios, where the microbial communities are sparse, our algorithm gives solutions with high accuracy, both in terms of obtaining accurate frequency, and in terms of species phylogenetic resolution. Comment: To appear in SPIRE 1
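The core reconstruction step described above is a constrained least-squares fit of species frequencies to observed read counts. A minimal sketch of that convex problem, minimizing ||Af - y||^2 over the probability simplex: the projected-gradient solver, function names, and toy mixing matrix below are illustrative stand-ins, not the paper's actual algorithm or data:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex
    {f : f >= 0, sum(f) = 1} (sort-based algorithm)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def reconstruct_frequencies(A, y, steps=2000, lr=0.1):
    """Minimize ||A f - y||^2 subject to f on the simplex,
    by projected gradient descent. A's columns are per-species
    read signatures; y is the observed read distribution."""
    f = np.full(A.shape[1], 1.0 / A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ f - y)          # gradient of the quadratic loss
        f = project_simplex(f - lr * grad)  # keep f a valid frequency vector
    return f

# Toy identifiable mixing matrix (columns = species signatures).
A = np.array([[1.0, 0.0, 0.2],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
true_f = np.array([0.6, 0.3, 0.1])
y = A @ true_f                # noiseless "infinite reads" case
f_hat = reconstruct_frequencies(A, y)
```

When the columns of A are linearly independent, as here, the minimizer is unique and `f_hat` recovers `true_f`; the paper's identifiability conditions formalize when this holds at genomic scale.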
Experimental Progress in Computation by Self-Assembly of DNA Tilings
Approaches to DNA-based computing by self-assembly require the use of DNA nanostructures, called tiles, that have efficient chemistries, expressive computational power, and convenient input and output (I/O) mechanisms. We have designed two new classes of DNA tiles, TAO and TAE, both of which contain three double helices linked by strand exchange. Structural analysis of a TAO molecule has shown that the molecule assembles efficiently from its four component strands. Here we demonstrate a novel method for I/O whereby multiple tiles assemble around a single-stranded (input) scaffold strand. Computation by tiling theoretically results in the formation of structures that contain single-stranded (output) reporter strands, which can then be isolated for subsequent steps of computation if necessary. We illustrate the advantages of the TAO and TAE designs by detailing two examples of massively parallel arithmetic: construction of complete XOR and addition tables by linear assemblies of DNA tiles. The three-helix structures provide flexibility for topological routing of strands in the computation, allowing the implementation of string tile models.
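The XOR example mentioned above can be emulated in software: in a linear tile assembly, each tile binds to its neighbour's running value and one input bit, reporting the cumulative XOR on its output strand. A toy sketch of that logical behaviour; `xor_tile_assembly` is a hypothetical name for this software analogue, not part of the molecular protocol:

```python
def xor_tile_assembly(input_bits):
    """Emulate the cumulative-XOR computation of a linear DNA tile
    assembly: tile i reads input bit x[i] and its neighbour's value
    y[i-1], and reports y[i] = y[i-1] XOR x[i]."""
    outputs = []
    running = 0  # the value carried along the assembly (y[-1] = 0)
    for bit in input_bits:
        running ^= bit       # one tile-binding event encodes this XOR
        outputs.append(running)
    return outputs

print(xor_tile_assembly([1, 0, 1, 1]))  # [1, 1, 0, 1]
```

The massive parallelism in the molecular setting comes from every possible input scaffold assembling simultaneously in one tube, which a sequential loop like this cannot capture.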
A new tool for the performance analysis of massively parallel computer systems
We present a new tool, GPA, that can generate key performance measures for very large systems. Based on solving systems of ordinary differential equations (ODEs), this method of performance analysis is far more scalable than stochastic simulation. The GPA tool is the first to produce higher-moment analysis from differential equation approximation, which is essential, in many cases, to obtain an accurate performance prediction. We identify so-called switch points as the source of error in the ODE approximation. We investigate the switch point behaviour in several large models and observe that as the scale of the model is increased, in general the ODE performance prediction improves in accuracy. In the case of the variance measure, we are able to justify theoretically that in the limit of model scale, the ODE approximation can be expected to tend to the actual variance of the model.
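The fluid idea behind this kind of ODE-based analysis can be shown on a minimal one-transition example: a population of identical clients leaving state A at rate r has mean-field ODE dA/dt = -rA, which is integrated instead of simulating every client. The model, function name, and parameters below are illustrative assumptions, not GPA's actual interface, and this sketch covers only the first moment (the mean), not the higher moments GPA produces:

```python
import math

def ode_mean(n0, rate, t_end, dt=1e-3):
    """Forward-Euler integration of the fluid ODE dA/dt = -rate * A,
    approximating the mean number of clients still in state A at
    time t_end, starting from n0 clients."""
    a = float(n0)
    for _ in range(round(t_end / dt)):
        a += dt * (-rate * a)  # one Euler step of the fluid equation
    return a

# For this linear model the exact mean is n0 * exp(-rate * t),
# so the fluid approximation can be checked directly.
approx = ode_mean(1000, 0.5, 2.0)
exact = 1000 * math.exp(-0.5 * 2.0)
```

The cost of this integration is independent of the population size n0, which is exactly why the ODE approach scales where discrete-event simulation of each client does not.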