Extreme Scale De Novo Metagenome Assembly
Metagenome assembly is the process of transforming a set of short,
overlapping, and potentially erroneous DNA segments from environmental samples
into an accurate representation of the underlying microbiomes' genomes.
State-of-the-art tools require large shared-memory machines and cannot handle
contemporary metagenome datasets that reach terabytes in size. In this paper,
we introduce the MetaHipMer pipeline, a high-quality and high-performance
metagenome assembler that employs an iterative de Bruijn graph approach.
MetaHipMer leverages a specialized scaffolding algorithm that produces long
scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is
end-to-end parallelized using the Unified Parallel C language and therefore can
run seamlessly on shared and distributed-memory systems. Experimental results
show that MetaHipMer matches or outperforms the state-of-the-art tools in terms
of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and
is able to assemble previously intractable grand challenge metagenomes. We
demonstrate the unprecedented capability of MetaHipMer by computing the first
full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion
reads, 2.6 TB in size.
Comment: Accepted to SC1
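The de Bruijn graph approach mentioned above can be illustrated with a minimal sketch (this is an illustration of the general technique, not MetaHipMer's parallel UPC implementation): every k-mer in a read contributes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix, and assembled sequences correspond to paths through the resulting graph.

```python
def debruijn_edges(reads, k):
    """Build de Bruijn graph edges: each k-mer links its
    (k-1)-mer prefix node to its (k-1)-mer suffix node."""
    edges = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            edges.setdefault(prefix, set()).add(suffix)
    return edges

# Two overlapping reads; with k=4 the graph forms a cycle of four nodes.
reads = ["ACGTAC", "CGTACG"]
graph = debruijn_edges(reads, 4)
```

An iterative assembler in this style repeats the construction for several values of k, using contigs from smaller k to resolve ambiguities at larger k.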
MSPKmerCounter: A Fast and Memory Efficient Approach for K-mer Counting
A major challenge in next-generation genome sequencing (NGS) is to assemble
massive overlapping short reads that are randomly sampled from DNA fragments.
To complete assembly, one must first carry out a task fundamental to many leading
assembly algorithms: counting the number of occurrences of k-mers (length-k
substrings in sequences). The counting results are critical for many components
in assembly (e.g., variant detection and read error correction). For large
genomes, the k-mer counting task can easily consume a huge amount of memory,
making it impossible for large-scale parallel assembly on commodity servers.
In this paper, we develop MSPKmerCounter, a disk-based approach, to
efficiently perform k-mer counting for large genomes using a small amount of
memory. Our approach is based on a novel technique called Minimum Substring
Partitioning (MSP). MSP breaks short reads into multiple disjoint partitions
such that each partition can be loaded into memory and processed individually.
By leveraging the overlaps among the k-mers derived from the same short read,
MSP can achieve a high compression ratio, so that the I/O cost can be
significantly reduced. For the task of k-mer counting, MSPKmerCounter offers a
very fast and memory-efficient solution. Experimental results on large real-life
short-read data sets demonstrate that MSPKmerCounter can achieve better
overall performance than state-of-the-art k-mer counting approaches.
MSPKmerCounter is available at http://www.cs.ucsb.edu/~yangli/MSPKmerCounte
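The partitioning idea can be sketched as follows (an illustrative reconstruction of Minimum Substring Partitioning, not the released MSPKmerCounter code): each k-mer is assigned to the partition named by its lexicographically smallest length-p substring, and because consecutive k-mers in a read usually share that minimum substring, a run of them can be written to disk once as a single superstring, which is the source of the compression.

```python
def min_substring(kmer, p):
    """Lexicographically smallest length-p substring: the k-mer's partition key."""
    return min(kmer[i:i + p] for i in range(len(kmer) - p + 1))

def msp_partition(read, k, p):
    """Cut a read into superstrings, one per maximal run of consecutive
    k-mers sharing the same minimum substring; overlapping k-mers in a
    run are stored once instead of individually."""
    partitions = {}
    start, cur = 0, min_substring(read[0:k], p)
    for i in range(1, len(read) - k + 1):
        m = min_substring(read[i:i + k], p)
        if m != cur:
            # Close the current run: it covers k-mers start..i-1.
            partitions.setdefault(cur, []).append(read[start:i + k - 1])
            start, cur = i, m
    partitions.setdefault(cur, []).append(read[start:])
    return partitions

parts = msp_partition("ACGTACGT", k=4, p=2)
# {'AC': ['ACGT', 'GTACGT'], 'CG': ['CGTA']}
```

Each partition file can then be loaded and counted independently within a fixed memory budget; every original k-mer is recoverable from exactly one superstring.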
Research in the effective implementation of guidance computers with large scale arrays Interim report
Functional logic character implementation in breadboard design of NASA modular compute
T-infinity: The Dependency Inversion Principle for Rapid and Sustainable Multidisciplinary Software Development
The CFD Vision 2030 Study recommends that NASA should develop and maintain an integrated simulation and software development infrastructure to enable rapid CFD technology maturation.... [S]oftware standards and interfaces must be emphasized and supported whenever possible, and open source models for noncritical technology components should be adopted. The current paper presents an approach to an open source development architecture, named T-infinity, for accelerated research in CFD, leveraging the Dependency Inversion Principle to realize plugins that communicate through collections of functions without exposing internal data structures. Steady-state flow visualization, mesh adaptation, fluid-structure interaction, and overset domain capabilities are demonstrated through compositions of plugins via standardized abstract interfaces, without the need for source-code dependencies between disciplines. Plugins interact through abstract interfaces, thereby avoiding N^2 direct code-to-code data structure coupling, where N is the number of codes. This plugin architecture enhances sustainable development by controlling the interaction between components to limit software complexity growth. The use of T-infinity abstract interfaces enables multidisciplinary application developers to leverage legacy applications alongside newly developed capabilities. A description of interface details is deferred until the interfaces are more thoroughly tested and can be closed to modification.
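The Dependency Inversion Principle described above can be sketched in a few lines (all names here are hypothetical illustrations, not T-infinity's actual interfaces): the driver depends only on an abstract interface, and each concrete plugin implements it without exposing its internal data structures, so plugins couple to one interface rather than to each other.

```python
from abc import ABC, abstractmethod

class MeshAdapter(ABC):
    """Abstract interface the driver depends on; concrete plugins implement it."""
    @abstractmethod
    def refine(self, metric):
        """Adapt the mesh to a metric field; internals stay hidden."""

class UniformRefinement(MeshAdapter):
    """One interchangeable plugin among many possible implementations."""
    def refine(self, metric):
        return f"refined with {len(metric)} metric entries"

def run_adaptation(adapter: MeshAdapter, metric):
    # The driver sees only the abstraction, never plugin data structures,
    # so adding an N+1st plugin requires no changes to existing codes.
    return adapter.refine(metric)
```

Swapping `UniformRefinement` for any other `MeshAdapter` subclass requires no change to `run_adaptation`, which is the coupling reduction the abstract-interface approach is after.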
Unified Framework for Finite Element Assembly
At the heart of any finite element simulation is the assembly of matrices and
vectors from discrete variational forms. We propose a general interface between
problem-specific and general-purpose components of finite element programs.
This interface is called Unified Form-assembly Code (UFC). A wide range of
finite element problems is covered, including mixed finite elements and
discontinuous Galerkin methods. We discuss how the UFC interface enables
implementations of variational form evaluation to be independent of mesh and
linear algebra components. UFC does not depend on any external libraries, and
is released into the public domain.
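The separation the abstract describes, between problem-specific element tensors and a general-purpose assembly loop, can be sketched as follows (a generic illustration of finite element assembly, not the UFC API itself): the assembler scatters local matrices, produced by a problem-specific callback, into a global operator without knowing anything about the variational form.

```python
def assemble(n_dofs, elements, element_tensor):
    """Scatter local element matrices into a dense global matrix.
    elements: tuples of global dof indices per element;
    element_tensor(e): problem-specific local matrix for element e."""
    A = [[0.0] * n_dofs for _ in range(n_dofs)]
    for e in elements:
        Ae = element_tensor(e)
        for i, gi in enumerate(e):
            for j, gj in enumerate(e):
                A[gi][gj] += Ae[i][j]
    return A

# 1D Poisson stiffness on two unit elements: each local matrix is
# [[1, -1], [-1, 1]], and shared dof 1 accumulates contributions.
elements = [(0, 1), (1, 2)]
A = assemble(3, elements, lambda e: [[1.0, -1.0], [-1.0, 1.0]])
```

Only `element_tensor` changes between problems (mixed elements, discontinuous Galerkin, and so on); the assembly loop, and hence the mesh and linear algebra backends, stay fixed, which is the independence the interface aims for.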
The finite element machine: An experiment in parallel processing
The finite element machine is a prototype computer designed to support parallel solutions to structural analysis problems. The hardware architecture and support software for the machine, initial solution algorithms and test applications, and preliminary results are described.
Using Rapid Prototyping in Computer Architecture Design Laboratories
This paper describes the undergraduate computer architecture courses and laboratories introduced at Georgia Tech during the past two years. A core sequence of six required courses for computer engineering students has been developed. Emphasis is placed upon the new core laboratories, which utilize commercial CAD tools, FPGAs, hardware emulators, and a VHDL-based rapid prototyping approach to simulate, synthesize, and implement prototype computer hardware.