
1,149 research outputs found

High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.

Author: Abedini Mani
Goudey Benjamin
Hopper John L
Inouye Michael
Makalic Enes
Reumann Matthias
Schmidt Daniel F
Wagner John
Zhou Zeyu
Zobel Justin
Publication venue: Health Inf Sci Syst
Publication date: 01/01/2015

Genome-wide association studies (GWAS) are a common approach for the systematic discovery of single nucleotide polymorphisms (SNPs) that are associated with a given disease. The univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate that a three-way interaction analysis on 1.1 million SNP GWAS data would require over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU-based methods and four times faster than runtimes estimated for GPU methods, indicating how improvements in the hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer, "Sequoia" at the Lawrence Livermore National Laboratory, assuming linear scaling is maintained, as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS. This research was partially funded by NHMRC grant 1033452 and was supported by Victorian Life Sciences Computation Initiative (VLSCI) grant number 0126 on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government, Australia.
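The scale behind these estimates can be checked with a short calculation. The sketch below counts the SNP triples an exhaustive three-way scan must visit and applies the abstract's linear-scaling assumption to the 5.8-year figure. The rack counts (4 for Avoca, 96 for Sequoia) are assumptions supplied here for illustration and are not stated in the abstract; the function names are hypothetical.

```python
from math import comb


def n_way_tests(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given order."""
    return comb(n_snps, order)


def scaled_runtime_years(base_years: float, base_racks: int, target_racks: int) -> float:
    """Runtime on a larger machine, assuming perfectly linear scaling."""
    return base_years * base_racks / target_racks


n_snps = 1_100_000                 # from the abstract
pairs = n_way_tests(n_snps, 2)     # two-way interactions
triples = n_way_tests(n_snps, 3)   # three-way interactions

# Assumed rack counts (not given in the abstract): Avoca 4, Sequoia 96.
sequoia_months = scaled_runtime_years(5.8, base_racks=4, target_racks=96) * 12

print(f"pairs:   {pairs:.3e}")
print(f"triples: {triples:.3e}")
print(f"Sequoia estimate: {sequoia_months:.1f} months")
```

Under these assumptions the three-way search space is roughly 2 x 10^17 combinations, and a 24-fold larger machine brings the 5.8-year lower bound down to about 2.9 months, which is consistent with the "under 3 months" figure quoted in the abstract.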

PubMed Central

Apollo (Cambridge)

High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies

Publication venue: BioMed Central

Springer - Publisher Connector

Discovering Higher-order SNP Interactions in High-dimensional Genomic Data

Author: Uppu Suneetha
Publication venue: Curtin University
Publication date: 01/01/2018

In this thesis, a multifactor dimensionality reduction based associative classification method is employed to identify higher-order SNP interactions, with the aim of improving understanding of the genetic architecture of complex diseases. The thesis also explores the application of deep learning techniques, which provide new clues for interaction analysis. The performance of the deep learning method is maximized by combining deep neural networks with a random forest to identify reliable interactions in the presence of noise.

espace@Curtin


Association Analysis of Additive Effects and Epistasis Between Human Candidate Malaria Protective Genes

Author: Ndia Carolyne Mukami
Publication date: 01/01/2015

Malaria is a major cause of childhood death in Africa and host genetic factors play a key role in determining survival from this disease. Although many candidate loci have been identified, there have been difficulties in confirming the significance of some of these loci. To some extent this might be explained by the added complexity of epistasis, or gene-gene interactions. Through this thesis I aimed: (1) to re-appraise a range of candidate malaria-association genes using a large-scale case-control study of severe malaria (SM) in Kilifi, Kenya; (2) to compare different approaches to detecting epistatic interactions; (3) to look for evidence of epistasis between candidate genes in my data set; (4) to examine the haplotype structure and linkage disequilibrium (LD) patterns for two such implicated variants (HbS and α+thalassaemia) and their gene regions, that coexist in the Kilifi population, and (5) to use these exemplars as a starting point for investigating the process of detecting epistasis in SM in a genome-wide association study (GWAS). Out of 71 candidate genes investigated, I observed that polymorphisms affecting various aspects of red blood cells (including HBB, HBA, G6PD, FREM3, INPP4B, ATP2B4 and ABO) were among those associated with the strongest signals of differential susceptibility to SM. Because of their prominence in malaria, HbS and α+thalassaemia were used to illustrate interaction analysis at the GWAS level. This included looking at the structure of the genomic regions surrounding the genes. As expected, a single haplotype of approximately 200kb was seen surrounding HbS, which then diverged into 2 major haplotypes spanning a further 1Mb either side, an observation that was largely explained by ethnicity. In contrast, no marked LD/haplotype structure was observed in the genomic region surrounding the α+thalassaemia deletion, suggesting that this is a very old polymorphism. 
Through this study, I confirmed the negative epistasis seen between HbS and α+thalassaemia using a study design (case-control) that was different to that used previously (cohort), although this was not among the most significant of the interactions I detected. I searched for pairwise interactions between these two polymorphisms at a genome-wide level using heterozygous and additive models for HbS and α+thalassaemia respectively. For each scan, a single region reaching a significance level of 10^-7 was found (STX18 for HbS and MYEOV for α+thalassaemia), and several other novel signals were identified in the 10^-6 to 10^-7 significance region. Further work will be required to validate these signals, and the challenge will be to understand their biological relevance. This is now becoming possible as datasets in many diseases, including malaria, are released into the public domain. As this Kenyan study has shown, with large group sizes and high-quality clinical and genetic data it is possible to begin to explore genetic interactions in a disease setting.

Open Research Online (The Open University)

SynTView — an interactive multi-view genome browser for next-generation comparative microorganism genomics

Author: A Herbig
A Louis
A Petkau
A Sboner
AE Darling
AU Sinha
BD Ondov
BS Pedersen
CB Nielsen
CT Lopes
CU Köser
DR Riley
Erika Souche
H Thorvaldsdottir
H Wang
I Uchiyama
Ivan Moszer
JR Grant
KA Frazer
KV Revanna
M Fiume
M Krzywinski
M Meyer
MJ Sullivan
NF Alikhan
NJ Croucher
NJ Loman
P Lechat
Pierre Lechat
R Kerkhoven
S Baker
SF Altschul
SR Harris
T Abeel
T Carver
T Vesth
T Yamada
X Pan
Publication venue: Springer Science and Business Media LLC

Crossref

Design and Implementation of a Computational Platform and a Parallelized Interaction Analysis for Large Scale Genomics Data in Multiple Sclerosis

Author: Daniel Uvehag
Publication date: 24/04/2020

Abstract: The multiple sclerosis (MS) genetics research group led by Professor Jan Hillert at Karolinska Institutet focuses on investigating the aetiology of the disease. Samples have been collected routinely from patients visiting the clinic for decades. From these samples, large amounts of genetic data are being generated. The traditional methods of analyzing the data are becoming increasingly inefficient as data sets grow larger. New approaches are needed to perform the analyses. This thesis gives an introduction to the relevant genetics and discusses possible approaches for enabling more efficient execution of legacy analysis tools, as well as improving a gene-environment and gene-gene interaction analysis. Different computational paradigms are presented, followed by the implementation of a computational platform to support the researchers' existing, and possibly future, analysis needs. The improved interaction analysis application is then implemented and executed in a virtual instance of this platform. The performance of the analysis application is then evaluated with respect to the original reference application.

Summary (Referat): Design and implementation of a computational platform and parallelized interaction analysis for large-scale genetic data in multiple sclerosis. Professor Jan Hillert at Karolinska Institutet leads a research group that focuses on the aetiology of multiple sclerosis (MS). Over several decades, patient samples have been collected from the clinic, and large amounts of genetic data have been generated from these samples. The traditional analysis methods are becoming increasingly inefficient as the data volumes grow. There is a great need for new approaches and methods to analyze these data. This thesis gives an introduction to the relevant genetics and discusses different approaches for enabling more efficient execution of existing analysis tools, as well as improving a gene-environment and gene-gene interaction analysis. Different established computational paradigms are presented, followed by an implementation of a computational platform to support the research group's current and possible future needs. The improved interaction analysis is then implemented and executed in a virtual instance of the platform. The performance of the interaction analysis is then evaluated and compared with the original implementation.

CiteSeerX