69 research outputs found

    GPU-accelerated subgraph enumeration on partitioned graphs

    Ministry of Education, Singapore under its Academic Research Funding Tier

    Homology sequence analysis using GPU acceleration

    A number of problems in the bioinformatics, systems biology and computational biology fields require abstracting physical entities into mathematical or computational models. In such studies, the computational paradigms often involve algorithms executed on the Central Processing Unit (CPU). Historically, those algorithms benefited from advances in the serial processing capabilities of individual CPU cores. However, that growth has slowed in recent years, as scaling out CPUs has proven both cost-prohibitive and insecure. To overcome this problem, parallel computing approaches that employ the Graphics Processing Unit (GPU) have gained attention as complements to, or replacements for, traditional CPU approaches. The premise of this research is to investigate the applicability of various parallel computing platforms to several problems in the detection and analysis of homology in biological sequences. I hypothesize that, by exploiting the sheer amount of computational power and sequencing data available, it is possible to deduce information from raw sequences without supplying the prior knowledge that would otherwise be needed to arrive at an answer. I have developed tools to perform analysis at scales that are traditionally unattainable on general-purpose CPU platforms. I developed a method to accelerate sequence alignment on the GPU, and used it to investigate whether the Operational Taxonomic Unit (OTU) classification problem can be improved with such computational power. I also developed a method to accelerate pairwise k-mer comparison on the GPU, and used it to further develop PolyHomology, a framework that scaffolds shared sequence motifs across large numbers of genomes to illuminate the structure of the regulatory network in yeasts. The results suggest that such an approach to heterogeneous computing can help answer questions in biology and is a viable path to new discoveries. Includes bibliographical references.
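
    The pairwise k-mer comparison step lends itself naturally to one-thread-per-pair GPU parallelism. The following is a minimal CUDA sketch of that idea only, not the thesis' actual tool: it assumes k-mers are already encoded as sorted 64-bit integers and laid out per sequence with CSR-style offsets, and all names are illustrative.

```cuda
// Hedged sketch only (not the thesis' tool): one thread per sequence pair
// counts k-mers shared between two pre-sorted, integer-encoded k-mer lists.
// Assumed layout: all k-mers concatenated, with CSR-style per-sequence offsets.
#include <cstdint>
#include <cuda_runtime.h>

__global__ void shared_kmer_counts(const uint64_t* kmers,   // all k-mers, concatenated
                                   const int* offsets,      // sequence s owns [offsets[s], offsets[s+1])
                                   const int2* pairs,       // (seqA, seqB) pairs to compare
                                   int* shared_count, int n_pairs)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= n_pairs) return;

    int ia = offsets[pairs[p].x], ea = offsets[pairs[p].x + 1];
    int ib = offsets[pairs[p].y], eb = offsets[pairs[p].y + 1];
    int count = 0;
    // Merge-style intersection of the two sorted k-mer lists.
    while (ia < ea && ib < eb) {
        uint64_t ka = kmers[ia], kb = kmers[ib];
        if (ka == kb)     { ++count; ++ia; ++ib; }
        else if (ka < kb) { ++ia; }
        else              { ++ib; }
    }
    shared_count[p] = count;
}
```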

    DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI

    Background: Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for de novo assembly in terms of assembly quality and scalability for large-scale short read datasets. Results: We present DecGPU, the first parallel and distributed error correction algorithm for high-throughput short reads (HTSRs) using a hybrid combination of CUDA and MPI parallel programming models. DecGPU provides CPU-based and GPU-based versions, where the CPU-based version employs coarse-grained and fine-grained parallelism using the MPI and OpenMP parallel programming models, and the GPU-based version takes advantage of the CUDA and MPI parallel programming models and employs a hybrid CPU+GPU computing model to maximize the performance by overlapping the CPU and GPU computation. The distributed feature of our algorithm makes it feasible and flexible for the error correction of large-scale HTSR datasets. Using simulated and real datasets, our algorithm demonstrates superior performance, in terms of error correction quality and execution speed, to the existing error correction algorithms. Furthermore, when combined with Velvet and ABySS, the resulting DecGPU-Velvet and DecGPU-ABySS assemblers demonstrate the potential of our algorithm to improve de novo assembly quality for de-Bruijn-graph-based assemblers. Conclusions: DecGPU is publicly available open-source software, written in CUDA C++ and MPI. The experimental results suggest that DecGPU is an effective and feasible error correction algorithm to tackle the flood of short reads produced by next-generation sequencing technologies.
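
    The "overlapping the CPU and GPU computation" idea can be illustrated with double-buffered CUDA streams. The sketch below shows that pattern only and is not DecGPU's code: the kernel body, batch layout, and all names are placeholders, and the real tool additionally partitions reads across MPI ranks.

```cuda
// Hedged sketch of the overlap pattern only (not DecGPU's code): double-buffered
// CUDA streams keep the GPU busy on batch i while the host prepares batch i+1.
#include <cuda_runtime.h>
#include <vector>

__global__ void correct_batch(char* reads, int n_bytes)
{
    // the actual error-correction kernel would operate on `reads` here
}

void process_batches(const std::vector<std::vector<char>>& batches)
{
    const size_t cap = 1 << 20;               // assumes each batch fits in 1 MiB
    cudaStream_t stream[2];
    char* d_buf[2];
    for (int s = 0; s < 2; ++s) {
        cudaStreamCreate(&stream[s]);
        cudaMalloc(&d_buf[s], cap);
    }
    for (size_t i = 0; i < batches.size(); ++i) {
        int s = static_cast<int>(i % 2);      // alternate buffers/streams
        cudaStreamSynchronize(stream[s]);     // wait until this buffer is free again
        cudaMemcpyAsync(d_buf[s], batches[i].data(), batches[i].size(),
                        cudaMemcpyHostToDevice, stream[s]);
        correct_batch<<<256, 256, 0, stream[s]>>>(d_buf[s], (int)batches[i].size());
        // Host-side work (I/O, filtering, MPI exchange) for the next batch can
        // run here while stream s is busy; pinned host buffers (cudaHostAlloc)
        // would be used in practice to make the copy truly asynchronous.
    }
    for (int s = 0; s < 2; ++s) {
        cudaStreamSynchronize(stream[s]);
        cudaStreamDestroy(stream[s]);
        cudaFree(d_buf[s]);
    }
}
```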

    FisherMP: fully parallel algorithm for detecting combinatorial motifs from large ChIP-seq datasets.

    Detecting binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene regulation. Although a number of motif-finding algorithms have been presented, most are either time-consuming or have sub-optimal accuracy on large-scale datasets. In this article, we present FisherMP, a fully parallelized algorithm for detecting combinatorial motifs from ChIP-seq datasets using the Fisher combined probability method and an OpenMP parallel design. Large-scale validation on both synthetic data and 350 ChIP-seq datasets from the ENCODE database showed that FisherMP is not only very fast on large datasets but also highly accurate compared with multiple popular methods. Using FisherMP, we successfully detected combinatorial motifs of CTCF, YY1, MAZ, STAT3 and USF2 on chromosome X, suggesting that they are functional co-players in gene regulation and chromosomal organization. Integrative and statistical analysis of these TF-binding peaks demonstrates that they are not only highly coordinated with each other but also correlated with histone modifications. FisherMP can be applied to the integrative analysis of binding motifs and to predicting cis-regulatory modules from large numbers of ChIP-seq datasets.
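
    At its core, Fisher's method combines the k per-motif p-values of a candidate region into the statistic X^2 = -2 * sum(ln p_i), which follows a chi-squared distribution with 2k degrees of freedom under the null, and the loop over candidates is embarrassingly parallel. The host-side sketch below illustrates only this statistic with an OpenMP loop; it is not FisherMP's code, and the function and variable names are assumptions.

```cuda
// Hedged, host-side sketch of the core statistic only (not FisherMP's API):
// Fisher's method combines the k per-motif p-values of a candidate site into
// X^2 = -2 * sum(ln p_i), chi-squared with 2k degrees of freedom under the null.
// The loop over candidates is embarrassingly parallel, here via OpenMP.
#include <cmath>
#include <vector>
#include <omp.h>

// p_values[c] holds the k per-motif p-values of candidate site c.
std::vector<double> fisher_statistics(const std::vector<std::vector<double>>& p_values)
{
    std::vector<double> stat(p_values.size());
    #pragma omp parallel for schedule(dynamic)
    for (long c = 0; c < static_cast<long>(p_values.size()); ++c) {
        double sum_log = 0.0;
        for (double p : p_values[c])
            sum_log += std::log(std::max(p, 1e-300));  // guard against log(0)
        stat[c] = -2.0 * sum_log;   // compare against chi^2 with 2*k d.o.f.
    }
    return stat;
}
```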

    Extracting the Structure and Conformations of Biological Entities from Large Datasets

    In biology, structure determines function, which often proceeds via changes in conformation. Efficient means for determining structure exist, but mapping conformations continues to present a serious challenge. Single-particle approaches, such as cryogenic electron microscopy (cryo-EM) and emerging diffract-and-destroy X-ray techniques, are, in principle, ideally positioned to overcome these challenges. But the algorithmic ability to extract information from large heterogeneous datasets consisting of unsorted snapshots - each emanating from an unknown orientation of an object in an unknown conformation - remains elusive. It is the objective of this thesis to describe and validate a powerful suite of manifold-based algorithms able to extract structural and conformational information from large datasets. These computationally efficient algorithms offer a new approach to determining the structure and conformations of viruses and macromolecules. After an introduction, we demonstrate a distributed, exact k-Nearest Neighbor Graph (k-NNG) construction method, in order to establish a firm algorithmic basis for manifold-based analysis. The proposed algorithm uses Graphics Processing Units (GPUs) and exploits multiple levels of parallelism in a distributed computing environment; it scales across different cluster sizes, with each compute node in the cluster containing multiple GPUs. Next, we present applications of manifold-based analysis to determining structure and conformational variability. Using the Diffusion Map algorithm, a new approach is presented that is capable of determining the structure of symmetric objects, such as viruses, to 1/100th of the object diameter, using low-signal diffraction snapshots. This is demonstrated by means of a successful 3D reconstruction of the Satellite Tobacco Necrosis Virus (STNV) to atomic resolution from simulated diffraction snapshots with and without noise. We next present a new approach for determining discrete conformational changes of the enzyme Adenylate kinase (ADK) from very large datasets of up to 20 million snapshots, each with ~10^4 pixels. This exceeds by an order of magnitude the largest dataset previously analyzed. Finally, we present a theoretical framework and an algorithmic pipeline for capturing continuous conformational changes of the ribosome from ultralow-signal (−12 dB) experimental cryo-EM data. Our analysis shows a smooth, concerted change in molecular structure in two-dimensional projection, which might be indicative of the way the ribosome functions as a molecular machine. The thesis ends with a summary and future prospects.
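
    Exact k-NNG construction rests on a brute-force building block: for each snapshot, compute distances to all others and keep the k smallest. The CUDA sketch below shows only that single-GPU building block under assumed names and layout; the distributed method described in the thesis would additionally tile the reference set across nodes and GPUs and merge the per-tile neighbour lists.

```cuda
// Hedged sketch of the brute-force building block behind an exact k-NN graph
// (not the thesis pipeline): one thread scans all snapshots for one query and
// keeps the K smallest squared Euclidean distances.
#include <cfloat>
#include <cuda_runtime.h>

#define K 8   // illustrative neighbour count

__global__ void knn_bruteforce(const float* data, int n, int dim,
                               int* nn_idx, float* nn_dist)
{
    int q = blockIdx.x * blockDim.x + threadIdx.x;
    if (q >= n) return;

    float best_d[K];
    int   best_i[K];
    for (int j = 0; j < K; ++j) { best_d[j] = FLT_MAX; best_i[j] = -1; }

    for (int r = 0; r < n; ++r) {
        if (r == q) continue;
        float d = 0.0f;
        for (int f = 0; f < dim; ++f) {
            float diff = data[q * dim + f] - data[r * dim + f];
            d += diff * diff;
        }
        if (d < best_d[K - 1]) {           // insertion into the sorted top-K list
            int j = K - 1;
            while (j > 0 && best_d[j - 1] > d) {
                best_d[j] = best_d[j - 1];
                best_i[j] = best_i[j - 1];
                --j;
            }
            best_d[j] = d;
            best_i[j] = r;
        }
    }
    for (int j = 0; j < K; ++j) {
        nn_idx[q * K + j]  = best_i[j];
        nn_dist[q * K + j] = best_d[j];
    }
}
```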

    GPrimer: a fast GPU-based pipeline for primer design for qPCR experiments

    Background: Design of valid, high-quality primers is essential for qPCR experiments. MRPrimer is a powerful MapReduce-based pipeline that combines primer design for target sequences with homology tests on off-target sequences. It takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB. Owing to the effectiveness of primers designed by MRPrimer in qPCR analysis, it has been widely used for developing online design tools and building primer databases. However, the computational speed of MRPrimer is too slow to keep up with sequence DBs whose sizes grow exponentially, and thus must be improved. Results: We develop GPrimer, a fast GPU-based pipeline for primer design that takes the same input and returns the same output as MRPrimer. MRPrimer consists of seven MapReduce steps, two of which are very time-consuming. GPrimer significantly improves the speed of those two steps by exploiting the computational power of GPUs. In particular, it uses data structures designed for coalesced memory access on the GPU and for workload balancing among GPU threads, and it copies these data structures between main memory and GPU memory in a streaming fashion. For the human RefSeq DB, GPrimer achieves a speedup of 57 times over the entire pipeline and of 557 times for the most time-consuming step, using a single machine with four GPUs, compared with MRPrimer running on a cluster of six machines. Conclusions: We propose a GPU-based pipeline for primer design that takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB at once, without an additional step using BLAST-like tools. The software is available at https://github.com/qhtjrmin/GPrimer.git.
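
    The coalesced-memory-access point can be illustrated with a structure-of-arrays layout, in which consecutive threads read consecutive addresses of each attribute array. The sketch below is an illustrative single-filtering kernel under assumed field names and thresholds, not GPrimer's actual data structures.

```cuda
// Hedged sketch of the coalesced-access idea only (field names and thresholds
// are illustrative, not GPrimer's data structures): a structure-of-arrays layout
// lets consecutive threads read consecutive addresses, so global loads coalesce.
#include <cuda_runtime.h>

struct PrimerSoA {        // one contiguous array per attribute, each of length n
    const float* tm;      // melting temperature
    const float* gc;      // GC content (fraction)
    const int*   len;     // primer length
};

__global__ void single_filter(PrimerSoA p, unsigned char* pass, int n,
                              float tm_lo, float tm_hi,
                              float gc_lo, float gc_hi,
                              int len_lo, int len_hi)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Thread i touches element i of each contiguous array, so every load
    // below is coalesced across the warp.
    bool ok = p.tm[i]  >= tm_lo && p.tm[i]  <= tm_hi &&
              p.gc[i]  >= gc_lo && p.gc[i]  <= gc_hi &&
              p.len[i] >= len_lo && p.len[i] <= len_hi;
    pass[i] = ok ? 1 : 0;
}
```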

    High Performance Computing Algorithms for Accelerating Peptide Identification from Mass-Spectrometry Data Using Heterogeneous Supercomputers

    Fast and accurate identification of peptides and proteins from mass spectrometry (MS) data is a critical problem in modern systems biology. Database peptide search is the most commonly used computational method to identify peptide sequences from MS data. In this method, gigabytes of experimentally generated MS data are compared against terabyte-sized databases of theoretically simulated MS data, resulting in a compute- and data-intensive problem requiring days or weeks of computational time on desktop machines. Existing serial and high-performance computing (HPC) algorithms strive to accelerate and improve the computational efficiency of the search, but exhibit sub-optimal performance due to inefficient parallelization models, low resource utilization and high overhead costs.
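
    The innermost operation of database peptide search is scoring one experimental spectrum against many theoretical spectra. A minimal CUDA sketch of a shared-peak-count score is shown below, assuming pre-binned m/z values and a CSR-style layout for the theoretical peak lists; it illustrates the general technique, not the thesis' scoring function.

```cuda
// Hedged sketch of the innermost database-search operation (not the thesis'
// scoring function): one thread scores one theoretical spectrum against a
// binned experimental spectrum by counting shared fragment-ion peaks.
#include <cuda_runtime.h>

__global__ void shared_peak_score(const unsigned char* query_bins, // 0/1 per m/z bin of the query
                                  const int* peak_bins,            // theoretical peaks, concatenated
                                  const int* offsets,              // spectrum s owns [offsets[s], offsets[s+1])
                                  int* score, int n_spectra)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= n_spectra) return;
    int hits = 0;
    for (int j = offsets[s]; j < offsets[s + 1]; ++j)
        hits += query_bins[peak_bins[j]];  // 1 if the experimental spectrum has a peak in that bin
    score[s] = hits;
}
```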