176 research outputs found

GenEpi: gene-based epistasis discovery using machine learning.

Author: Alzheimer’s Disease Neuroimaging Initiative
Chang Yu-Chuan
Chen Chien-Yu
Giacomini Kathleen M
Hong Ming-Yi
Hsieh Ping-Han
Oyang Yen-Jen
Tung Yi-An
Wu June-Tai
Yee Sook Wah
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

Background: Genome-wide association studies (GWAS) provide a powerful means to identify associations between genetic variants and phenotypes. However, GWAS techniques for detecting epistasis, the interactions between genetic variants associated with phenotypes, are still limited. We believe that developing an efficient and effective GWAS method to detect epistasis will be key to discovering sophisticated pathogenesis, which is especially important for complex diseases such as Alzheimer's disease (AD).

Results: This study presents GenEpi, a computational package that uncovers epistasis associated with phenotypes using the proposed machine learning approach. GenEpi identifies both within-gene and cross-gene epistasis through a two-stage modeling workflow. In both stages, GenEpi adopts two-element combinatorial encoding when producing features and constructs the prediction models by L1-regularized regression with stability selection. Results on simulated data showed that GenEpi outperforms other widely used methods in detecting the ground-truth epistasis. For real data, this study uses AD as an example to demonstrate the capability of GenEpi in finding disease-related variants and variant interactions that show both biological meaning and predictive power.

Conclusions: The results on simulated data and AD demonstrated that GenEpi can detect the epistasis associated with phenotypes effectively and efficiently. The released package can be generalized to facilitate the study of many complex diseases in the near future.
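The abstract does not spell out how the two-element combinatorial encoding is built. As a minimal sketch of one plausible reading, the Python below one-hot encodes each SNP over its genotype levels and then forms a binary feature for every pair of indicators drawn from two different SNPs. The genotype coding (0, 1, 2), the function name `encode_pairs`, and the SNP identifiers are assumptions made for illustration, not GenEpi's actual API. The L1-regularized regression and stability selection steps are not shown.

```python
from itertools import combinations

# Assumption: genotypes are coded as minor-allele counts 0, 1, 2.
GENOTYPE_LEVELS = (0, 1, 2)


def encode_pairs(sample, snp_names):
    """Map one sample's genotypes to binary pairwise features.

    Each SNP is first one-hot encoded over its genotype levels; every pair
    of indicators taken from two different SNPs is then combined with a
    logical AND (a product of 0/1 values).
    """
    indicators = {}
    for name, genotype in zip(snp_names, sample):
        for level in GENOTYPE_LEVELS:
            indicators[(name, level)] = int(genotype == level)

    features = {}
    for left, right in combinations(indicators, 2):
        if left[0] == right[0]:
            continue  # skip pairs of indicators from the same SNP
        label = f"{left[0]}={left[1]}&{right[0]}={right[1]}"
        features[label] = indicators[left] * indicators[right]
    return features


snps = ["snpA", "snpB", "snpC"]   # hypothetical SNP identifiers
sample = [0, 2, 1]                # hypothetical genotypes for one individual
features = encode_pairs(sample, snps)
active = [name for name, value in features.items() if value == 1]
print(len(features), active)
```

Under this reading, each pair of SNPs contributes nine candidate features, and exactly one of them is active for any given individual. A sparse model fitted on such features would then select specific genotype combinations rather than whole SNPs.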

eScholarship - University of California

Bioinformatics challenges for genome-wide association studies

Author: Ahmed
Altshuler
Amundadottir
Askland
Bureau
Bush
Calle
Chang
Chanock
Cook
Culverhouse
Donnelly
Easton
Eiberg
Elbers
Emily
F. W. Asselbergs
Greene
Hahn
Hahn
Hirschhorn
Holmans
Infante
J. H. Moore
Jakobsdottir
Kooperberg
Kraft
Lewontin
Lou
Lunetta
Manolio
Manolio
Marchini
McKinney
McKinney
Mei
Millstein
Moore
Moore
Moore
Moore
Moore
Moore
Moore
Moore
Moore
Moore
Motsinger
Namkung
Nelson
Pan
Pattin
Reich
Reif
Ripperger
Ritchie
Ritchie
Ritchie
S. M. Williams
Schork
Sinnott-Armstrong
Spencer
Thornton-Wells
Torkamani
Velez
Wang
Wilke
Williams
Wongseree
Yu
Yu
Zhang
Publication venue: Oxford University Press
Publication date: 15/02/2010

Motivation: The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype–phenotype relationship that is characterized by significant heterogeneity and gene–gene and gene–environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods

CiteSeerX

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

UCL Discovery

Dissertations of the University of Groningen

The choice of null distributions for detecting gene-gene interactions in genome-wide association studies

Author: A Niu
B Efron
B Med
C Greene
C Greene
C Herold
C Yang
C Yang
Can Yang
D Balding
D Evans
E Eichler
H Cordell
Hong Xue
J Marchini
J Moore
J Moore
K Kira
L Wiskott
M Nelson
M Park
M Ritchie
PC Phillips
Qiang Yang
R Culverhouse
R Klein
R Tibshirani
S Dudoit
S Dudoit
S Purcell
T Hastie
T Hastie
T Wu
T Zheng
W Li
Weichuan Yu
WTCCC
X Chen
X Wan
X Wan
Xiang Wan
Y Benjamini
Y Zhang
Zengyou He
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

Identification of gene-gene interactions for Alzheimer's disease using co-operative game theory

Author: Vardarajan Badri N.
Publication venue: Boston University
Publication date: 01/01/2013

Thesis (Ph.D.)--Boston University

The multifactorial nature of Alzheimer's Disease suggests that complex gene-gene interactions are present in AD pathways. Contemporary approaches to detect such interactions in genome-wide data are mathematically and computationally challenging. We investigated gene-gene interactions for AD using a novel algorithm based on cooperative game theory in 15 genome-wide association study (GWAS) datasets comprising a total of 11,840 AD cases and 10,931 cognitively normal elderly controls from the Alzheimer Disease Genetics Consortium (ADGC). We adapted this approach, which was developed originally for solving multi-dimensional problems in economics and social sciences, to compute a Shapley value statistic to identify genetic markers that contribute most to coalitions of SNPs in predicting AD risk. Treating each GWAS dataset as an independent discovery set, markers were ranked according to their contribution to coalitions formed with other markers. Using a backward elimination strategy, markers with low Shapley values were eliminated and the statistic was recalculated iteratively. We tested all two-way interactions between top Shapley markers in regression models which included the two SNPs (main effects) and a term for their interaction. Models yielding a p-value<0.05 for the interaction term were evaluated in each of the other datasets and the results from all datasets were combined by meta-analysis. Statistically significant interactions were observed with multiple marker combinations in the APOE regions. My analyses also revealed statistically strong interactions between markers in six regions: CTNNA3-ATP11A (p=4.1e-07), CSMD1-PRKCQ (p=3.5e-08), DCC-UNC5CL (p=5.9e-08), CNTNAP2-RFC3 (p=1.16e-07), AACS-TSHZ3 (p=2.64e-07) and CAMK4-MMD (p=3.3e-07). The Shapley value algorithm outperformed Chi-Square and ReliefF in detecting known interactions between APOE and GAB2 in a previously published GWAS dataset. It was also more accurate than competing filtering methods in identifying simulated epistatic SNPs that are additive in nature, but its accuracy was low in identifying non-linear interactions. The game theory algorithm revealed strong interactions between markers in novel genes with weak main effects, which would have been overlooked if only markers with strong marginal association with AD were tested. This method will be a valuable tool for identifying gene-gene interactions for complex diseases and other traits.
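To make the Shapley value statistic concrete, here is a minimal Python sketch of the textbook definition: a player's Shapley value is its marginal contribution to a coalition, averaged over every order in which the players can join. This is not the thesis's algorithm, which the abstract does not detail and which would need an approximation at genome-wide scale. The payoff function `toy_payoff` and the marker names `m1`, `m2`, `m3` are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    `value` maps a frozenset of players to the payoff of that coalition.
    The cost grows factorially, so this only suits a handful of players.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            grown = coalition | {player}
            totals[player] += value(grown) - value(coalition)
            coalition = grown
    return {p: totals[p] / len(orderings) for p in players}


def toy_payoff(coalition):
    """Hypothetical payoff: m1 and m2 matter only jointly, m3 acts alone."""
    joint = 1.0 if {"m1", "m2"} <= coalition else 0.0
    solo = 0.25 if "m3" in coalition else 0.0
    return joint + solo


markers = ["m1", "m2", "m3"]
phi = shapley_values(markers, toy_payoff)
ranking = sorted(markers, key=lambda m: phi[m], reverse=True)
print(phi, ranking)
```

In this toy game, `m1` and `m2` contribute nothing on their own yet rank above `m3`, because the Shapley value credits each with half of their joint payoff. That is the property the abstract relies on when it reports interactions between markers with weak main effects.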

Boston University Institutional Repository (OpenBU)

Uncovering metabolic pathways relevant to phenotypic traits of microbial genomes

Author: Gasteiger Johann
Kastenmüller Gabi
Mewes Hans-Werner
Schenk Maria Elisabeth
Publication venue: BioMed Central
Publication date: 01/01/2009

A new machine learning-based method is presented here for the identification of metabolic pathways related to specific phenotypes in multiple microbial genomes

Crossref

PubMed Central

PuSH