
    A heterogeneous compute solution for optimized genomic selection analysis

    This paper presents a heterogeneous computing solution for an optimized genetic selection analysis tool, GenSel. GenSel can be used to efficiently infer the effects of genetic markers on a desired trait or to determine the genomic estimated breeding values (GEBV) of genotyped individuals. To predict which genetic markers are informational, GenSel performs Bayesian inference using Gibbs sampling, a Markov Chain Monte Carlo (MCMC) algorithm. Parallelizing this algorithm is technically challenging because there is a loop-carried dependence between iterations of the Markov chain. The approach presented in this paper exploits both the task-level parallelism (TLP) and the data-level parallelism (DLP) that exist within each iteration of the Markov chain. More specifically, a combination of CPU threads using OpenMP and GPU threads using NVIDIA's CUDA paradigm is implemented to speed up the sampling of each genetic marker used in creating the model. The performance speedup will allow this algorithm to accommodate the expected increase in observations on animals and genetic markers per observation. The current implementation executes 1.84 times faster than the optimized CPU implementation.

    ReGen: Optimizing Genetic Selection Algorithms for Heterogeneous Computing

    GenSel is a genetic selection analysis tool used to determine which genetic markers are informational for a given trait. Genetic selection analyses are time-consuming and computationally expensive. Because the number of genotyped individuals is expected to increase, analysis times will increase dramatically. Therefore, optimization efforts must be made to keep analysis times reasonable. This thesis focuses on optimizing one of GenSel’s underlying algorithms for heterogeneous computing. The resulting algorithm exposes task-level parallelism and data-level parallelism that are present but inaccessible in the original algorithm. The heterogeneous computing solution, ReGen, outperforms the optimized CPU implementation, achieving a 1.84 times speedup.

    Packaging signals in single-stranded RNA viruses: nature’s alternative to a purely electrostatic assembly mechanism

    The formation of a protective protein container is an essential step in the life-cycle of most viruses. In the case of single-stranded (ss)RNA viruses, this step occurs in parallel with genome packaging in a co-assembly process. Previously, it had been thought that this process can be explained entirely by electrostatics. Inspired by recent single-molecule fluorescence experiments that recapitulate the RNA packaging specificity seen in vivo for two model viruses, we present an alternative theory, which recognizes the important cooperative roles played by RNA–coat protein interactions, at sites we have termed packaging signals. The hypothesis is that multiple copies of packaging signals, repeated according to capsid symmetry, aid formation of the required capsid protein conformers at defined positions, resulting in significantly enhanced assembly efficiency. The precise mechanistic roles of packaging signal interactions may vary between viruses, as we have demonstrated for MS2 and STNV. We quantify the impact of packaging signals on capsid assembly efficiency using a dodecahedral model system, showing that heterogeneous affinity distributions of packaging signals for capsid protein out-compete those of homogeneous affinities. These insights pave the way to a new anti-viral therapy that reduces capsid assembly efficiency by targeting the vital roles of the packaging signals, and open up new avenues for the efficient construction of protein nanocontainers in bionanotechnology.

    A Multi-GPU Compute Solution for Optimized Genomic Selection Analysis

    Many modern-day bioinformatics algorithms rely heavily on statistical models to analyze their biological data. Some of these statistical models lend themselves nicely to standard high-performance computing optimizations such as parallelism, while others do not. One such algorithm is Markov Chain Monte Carlo (MCMC). In this thesis, we present a heterogeneous compute solution for optimizing GenSel, a genetic selection analysis tool. GenSel uses an MCMC algorithm to perform Bayesian inference using Gibbs sampling. Optimizing an MCMC algorithm is a difficult problem because it is inherently sequential, containing a loop-carried dependence between Markov chain iterations. The optimization presented in this thesis uses GPU computing to exploit the data-level parallelism within each of these iterations. In addition, it allows for the efficient management of memory, the pipelining of CUDA kernels, and the use of multiple GPUs. The optimizations presented show performance improvements of up to 1.84 times that of the original algorithm.

    From access and integration to mining of secure genomic data sets across the grid

    The UK Department of Trade and Industry (DTI) funded BRIDGES project (Biomedical Research Informatics Delivered by Grid Enabled Services) has developed a Grid infrastructure to support cardiovascular research. This includes the provision of a compute Grid and a data Grid infrastructure with security at its heart. In this paper we focus on the BRIDGES data Grid. A primary aim of the BRIDGES data Grid is to help control the complexity of accessing and integrating a myriad of genomic data sets through simple Grid-based tools. We outline these tools and how they are delivered to end-user scientists. We also describe how these tools are to be extended in the BBSRC-funded Grid Enabled Microarray Expression Profile Search (GEMEPS) to provide a richer vocabulary of search capabilities for mining microarray data sets. As with BRIDGES, fine-grained Grid security underpins GEMEPS.