
3 research outputs found

Searching for SNPs with cloud computing

Authors: Langmead Ben, Lin Jimmy, Pop Mihai, Salzberg Steven L, Schatz Michael C
Publication venue: BioMed Central
Publication date: 01/01/2009

Novel software utilizing cloud computing technology to cost-effectively align and map SNPs from a human genome in three

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Parallel methods for short read assembly

Author: Jackson Benjamin Grant
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2009

This work is on the parallel de novo assembly of genomic sequences from short sequence reads. With short reads eliminating the reliability of read overlaps in predicting genomic co-location, a revival of graph-based methods has underpinned the development of short-read assemblers. While these methods predate short-read technology, their reach has not extended significantly beyond bacterial genomes due to the memory resources required in their use. These memory limitations are exacerbated by the high coverage needed to compensate for shorter read lengths. As a result, prior to our work, short-read de novo assembly had been demonstrated only on relatively small genomes of a few million bases. In our work, we advance the field of short sequence assembly in a number of ways. First, we extend models and ideas proposed and tested with small genomes on serial machines to large-scale distributed memory parallel machines. Second, we present ideas for assembly that are especially suited to the reconstruction of very large genomes on these machines. Additionally, we present the first assembler that specifically takes advantage of a variable number of fragment sizes or insert lengths concurrently when making assembly decisions, while still working well for data with one insertion length.

Digital Repository @ Iowa State University (ISU)

High Performance Computing for DNA Sequence Alignment and Assembly

Author: Schatz Michael Christopher
Publication date: 01/01/2010

Recent advances in DNA sequencing technology have dramatically increased the scale and scope of DNA sequencing. These data are used for a wide variety of important biological analyses, including genome sequencing, comparative genomics, transcriptome analysis, and personalized medicine, but are complicated by the volume and complexity of the data involved. Given the massive size of these datasets, computational biology must draw on the advances of high performance computing. Two fundamental computations in computational biology are read alignment and genome assembly. Read alignment maps short DNA sequences to a reference genome to discover conserved and polymorphic regions of the genome. Genome assembly computes the sequence of a genome from many short DNA sequences. Both computations benefit from recent advances in high performance computing to efficiently process the huge datasets involved, including using highly parallel graphics processing units (GPUs) as high performance desktop processors, and using the MapReduce framework coupled with cloud computing to parallelize computation to large compute grids. This dissertation demonstrates how these technologies can be used to accelerate these computations by orders of magnitude, and that they have the potential to make otherwise infeasible computations practical.
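To make the read-alignment step described in this abstract concrete, here is a minimal single-machine sketch in Python. It is not the dissertation's method and uses no GPU or MapReduce code; the function names (`build_kmer_index`, `align_read`), the seed length `k`, and the `max_mismatches` threshold are all illustrative assumptions. It indexes the reference by k-mers, uses each k-mer of a read as a seed, and reports placements together with the mismatching positions, which is where candidate polymorphisms would show up.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def align_read(read, reference, index, k, max_mismatches=1):
    """Return sorted (start, mismatch_offsets) pairs for each accepted placement.

    Hypothetical helper: every k-mer of the read is tried as a seed, so a
    mismatch inside one seed does not hide the placement.
    """
    hits = {}
    for offset in range(len(read) - k + 1):
        for seed_pos in index.get(read[offset:offset + k], []):
            start = seed_pos - offset
            # Skip placements that fall off the reference or were already accepted
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            mismatches = [i for i, base in enumerate(read)
                          if reference[start + i] != base]
            if len(mismatches) <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


reference = "ACGTACGTTAGCCGATAGGC"
index = build_kmer_index(reference, 4)
# The read matches reference[7:17] except at offset 5 (A instead of C)
placements = align_read("TTAGCAGATA", reference, index, 4)
```

In a MapReduce setting, each read could in principle be aligned independently in the map phase, since `align_read` needs only the read and the shared index; that framing is an assumption here, not a description of the software listed above.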

Digital Repository at the University of Maryland