Springer - Publisher Connector

Wageningen University & Research Publications

Serial Testing for Detection of Multilocus Genetic Interactions

Author: Al-Khaledi Zaid T.
Publication venue: UKnowledge
Publication date: 01/01/2019

A method to detect relationships between disease susceptibility and multilocus genetic interactions is the Multifactor-Dimensionality Reduction (MDR) technique pioneered by Ritchie et al. (2001). Since its introduction, many extensions have been pursued to deal with non-binary outcomes and/or account for multiple interactions simultaneously. Studying the effects of multilocus genetic interactions on continuous traits (blood pressure, weight, etc.) is one case that MDR does not handle. Culverhouse et al. (2004) and Gui et al. (2013) proposed two different methods to analyze such a case. In their research, Gui et al. (2013) introduced the Quantitative Multifactor-Dimensionality Reduction (QMDR), which uses the overall average of the response variable to classify individuals into risk groups. The classification mechanism may not be efficient under some circumstances, especially when the overall mean is close to some multilocus means. To address such difficulties, we propose a new algorithm, the Ordered Combinatorial Quantitative Multifactor-Dimensionality Reduction (OQMDR), that uses a series of tests, based on the ascending order of multilocus means, to identify the best interactions of different orders with risk patterns that minimize the prediction error. Ten-fold cross-validation is used to choose from among the resulting models. Regular permutation tests are used to assess the significance of the selected model. The assessment procedure is also modified by utilizing the Generalized Extreme-Value distribution to enhance the efficiency of the evaluation process. We present results from a simulation study to illustrate the performance of the algorithm. The proposed algorithm is also applied to a genetic data set associated with Alzheimer's Disease.
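The QMDR classification step described in the abstract (comparing each multilocus mean with the overall mean of the response) can be illustrated with a minimal sketch. The function name `qmdr_risk_labels`, the toy data, and the choice to label a genotype combination "high" only when its mean is strictly above the overall mean are assumptions made for illustration; they are not taken from Gui et al. (2013) or from the OQMDR work.

```python
from collections import defaultdict
from statistics import mean


def qmdr_risk_labels(genotypes, trait):
    """Label each multilocus genotype combination 'high' or 'low'.

    A combination is 'high' when the mean trait value of the individuals
    carrying it exceeds the overall trait mean (tie handling is an assumption).
    """
    overall = mean(trait)
    groups = defaultdict(list)
    for combo, y in zip(genotypes, trait):
        groups[tuple(combo)].append(y)
    return {
        combo: ("high" if mean(values) > overall else "low")
        for combo, values in groups.items()
    }


# Hypothetical toy data: two loci per individual and one continuous trait.
genotypes = [(0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (0, 0)]
trait = [1.0, 2.0, 3.0, 9.0, 11.0, 4.0]
labels = qmdr_risk_labels(genotypes, trait)
```

In this toy data the overall mean is 5.0, so only the (1, 1) combination (mean 10.0) is labeled high. A combination whose mean sits very close to 5.0 would flip label under small perturbations of the data, which is the weakness the abstract attributes to classifying against the overall mean.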

University of Kentucky

Information-theoretic gene-gene and gene-environment interaction analysis of quantitative traits

Author: Chanda Pritam
Liu Song
Ramanathan Murali
Sucheston Lara
Zhang Aidong
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The purpose of this research was to develop a novel information theoretic method and an efficient algorithm for analyzing the gene-gene (GGI) and gene-environmental interactions (GEI) associated with quantitative traits (QT). The method is built on two information-theoretic metrics, the k-way interaction information (KWII) and phenotype-associated information (PAI). The PAI is a novel information theoretic metric that is obtained from the total information correlation (TCI) information theoretic metric by removing the contributions for inter-variable dependencies (resulting from factors such as linkage disequilibrium and common sources of environmental pollutants). Results: The KWII and the PAI were critically evaluated and incorporated within an algorithm called CHORUS for analyzing QT. The combinations with the highest values of KWII and PAI identified each known GEI associated with the QT in the simulated data sets. The CHORUS algorithm was tested using the simulated GAW15 data set and two real GGI data sets from QTL mapping studies of high-density lipoprotein levels/atherosclerotic lesion size and ultra-violet light-induced immunosuppression. The KWII and PAI were found to have excellent sensitivity for identifying the key GEI simulated to affect the two quantitative trait variables in the GAW15 data set. In addition, both metrics showed strong concordance with the results of the two different QTL mapping data sets. Conclusion: The KWII and PAI are promising metrics for analyzing the GEI of QT.
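As a rough illustration of the k-way interaction information named in the abstract, the sketch below uses the standard alternating-sum-of-entropies definition, KWII(S) = -Σ over non-empty subsets T of S of (-1)^(|S|-|T|) H(T), estimated from observed frequencies. It assumes every variable, including the phenotype, has already been discretized; the function names and the XOR toy data are hypothetical, and this is not the CHORUS algorithm itself.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(columns):
    """Plug-in joint entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def kwii(variables):
    """k-way interaction information via the alternating sum of subset entropies."""
    k = len(variables)
    total = 0.0
    for size in range(1, k + 1):
        sign = (-1) ** (k - size)
        for subset in combinations(variables, size):
            total += sign * entropy(subset)
    return -total


# Hypothetical toy data: the phenotype is the XOR of two binary factors.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

pairwise = kwii([a, phenotype])      # reduces to mutual information for k = 2
three_way = kwii([a, b, phenotype])  # positive when the factors act jointly
```

With the XOR data, neither factor alone carries information about the phenotype (the pairwise value is 0 bits), while the three-variable KWII is 1 bit, which is the kind of purely joint effect that interaction metrics are meant to surface.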

Springer - Publisher Connector

Public Library of Science (PLOS)

High-Order SNP Combinations Associated with Complex Diseases: Efficient Discovery, Statistical Power and Functional Interactions

Author: A Gao
A Motsinger-Reif
A Subramanian
B Maher
B Van Ness
Brian Van Ness
C Greene
C Herold
C Huttenhower
D Anastassiou
D Brinza
D Evans
D Goldstein
D Rabinowitz
D Stram
E Bey
E Eichler
E Schadt
G Dong
G Fang
G Grahne
G Thorisson
Gang Fang
H Cordell
H He
H Wang
Haoyu Yu
J Hirschhorn
J Huang
J Lehár
J Marchini
J Moore
J Storey
K Christensen
K Pattin
K Small
K Van Steen
K Wang
L Cardon
L Ma
L Tentori
M Ashburner
M Carrasquillo
M Costanzo
M Nelson
M Norris
M Ritchie
M Steinbach
M Van Der Deen
Majda Haznadar
Michael Steinbach
N Yosef
P Kraft
R Agrawal
R Bayardo
R Cantor
R Dowell
R Gupta
S Baranzini
S Bay
S Purcell
S Vicent
T Church
T Howard
T Kam-Thong
T Manolio
Timothy R. Church
V Varadan
Vipin Kumar
W Zhang
Wen Wang
William S. Oetting
X Hua
X Lou
X Wan
X Zhang
Y Oji
Y Zhang
Yu Zhang
Z Wang
Publication venue: Public Library of Science
Publication date: 19/04/2012

There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets. However, for high-order combinations, existing methods either adopt a brute-force search, which only handles a small number of SNPs (up to a few hundred), or use a heuristic search that may miss informative combinations. In addition, existing approaches lack statistical power because of the use of statistics with high degrees of freedom and the huge number of hypotheses tested during combinatorial search. Due to these challenges, functional interactions in high-order combinations have not been systematically explored. We leverage discriminative-pattern-mining algorithms from the data-mining community to search for high-order combinations in case-control datasets. The substantially improved efficiency and scalability demonstrated on synthetic and real datasets with several thousands of SNPs allow the study of several important mathematical and statistical properties of SNP combinations with order as high as eleven. We further explore functional interactions in high-order combinations and reveal a general connection between the increase in discriminative power of a combination over its subsets and the functional coherence among the genes comprising the combination, supported by multiple datasets. Finally, we study several significant high-order combinations discovered from a lung-cancer dataset and a kidney-transplant-rejection dataset in detail to provide novel insights on the complex diseases. Interestingly, many of these associations involve combinations of common variations that occur in small fractions of the population. Thus, our approach is an alternative methodology for exploring the genetics of rare diseases for which the current focus is on individually rare variations.

Springer - Publisher Connector

FigShare

Multilocus analysis of SNP and metabolic data within a given pathway

Author: Børresen-Dale Anne-Lise
Faldaas Anne
Fjeldstad Ståle
Geisler Jurgen
Grenaker Grethe Irene
Kristensen Vessela N
Lingjærde Ole Christian
Lønning Per Eystein
Tsalenko Anya
Yakhini Zohar
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Complex traits, which are under the influence of multiple and possibly interacting genes, have become a subject of new statistical methodological research. One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common multifactorial diseases and their association to different quantitative phenotypic traits. RESULTS: Two types of data from the same metabolic pathway were used in the analysis: categorical measurements of 18 SNPs; and quantitative measurements of plasma levels of several steroids and their precursors. Using the combinatorial partitioning method we tested various thresholds for each metabolic trait and each individual SNP locus. One SNP in CYP19 (3'UTR), two SNPs in CYP1B1 (R48G and A119S) and one in CYP1A1 (T461N) were significantly differently distributed between the high and low level metabolic groups. The leave-one-out cross-validation method showed that 6 SNPs in concert give 65% correct prediction of phenotype. Further, we used pattern recognition, computing the p-value by Monte Carlo simulation, to identify sets of SNPs and physiological characteristics such as age and weight that contribute to a given metabolic level. Since the SNPs detected by both methods reside either in the same gene (CYP1B1) or in 3 different genes in immediate vicinity on chromosome 15 (CYP19, CYP11 and CYP1A1), we investigated the possibility that they form intragenic and intergenic haplotypes, which may jointly account for a higher activity in the pathway. We identified such haplotypes associated with metabolic levels. CONCLUSION: The methods reported here may enable the study of multiple low-penetrance genetic factors that together determine various quantitative phenotypic traits.
Our preliminary data suggest that several genes coding for proteins involved in a common pathway, that happen to be located on common chromosomal areas and may form intragenic haplotypes, together account for a higher activity of the whole pathway.

University of Bergen

NORA - Norwegian Open Research Archives

Clique-Finding for Heterogeneity and Multidimensionality in Biomarker Epidemiology Research: The CHAMBER Algorithm

Author: Aaron Kershenbaum
AS Foulkes
B Strom
D Brinza
D Erlenkotter
D Michie
DV Conti
E Lander
ER Hauser
H Thomas
I Kononenko
I Ruczinski
I Witten
J Chen
J Friedman
J Hoh
J Huang
J Lepre
J Moore
JG Liehr
Jonatan R. Ruiz
JR Quinlan
K Kira
L Breiman
MD Ritchie
MR Nelson
MY Park
N Tahri-Daizadeh
NJ Schork
P Jaccard
R Mushlin
R Schapire
Richard A. Mushlin
Stephen Gallagher
TA Thornton-Wells
Timothy R. Rebbeck
TR Rebbeck
V Cortessis
V Vapnik
WD Shannon
Y Benjamini
Y Pavlov
Publication venue: Public Library of Science
Publication date: 16/03/2009

Commonly-occurring disease etiology may involve complex combinations of genes and exposures resulting in etiologic heterogeneity. We present a computational algorithm that employs clique-finding for heterogeneity and multidimensionality in biomedical and epidemiological research (the "CHAMBER" algorithm). This algorithm uses graph-building to (1) identify genetic variants that influence disease risk and (2) predict individuals at risk for disease based on inherited genotype. We use a set-covering algorithm to identify optimal cliques and a Boolean function that identifies etiologically heterogeneous groups of individuals. We evaluated this approach using simulated case-control genotype-disease associations involving two- and four-gene patterns. The CHAMBER algorithm correctly identified these simulated etiologies. We also used two population-based case-control studies of breast and endometrial cancer in African American and Caucasian women considering data on genotypes involved in steroid hormone metabolism. We identified novel patterns in both cancer sites that involved genes that sulfate or glucuronidate estrogens or catecholestrogens. These associations were consistent with the hypothesized biological functions of these genes. We also identified cliques representing the joint effect of multiple candidate genes in all groups, suggesting the existence of biologically plausible combinations of hormone metabolism genes in both breast and endometrial cancer in both races. The CHAMBER algorithm may have utility in exploring the multifactorial etiology and etiologic heterogeneity in complex disease.

Public Library of Science (PLOS)

Allele-specific network reveals combinatorial interaction that transcends small effects in psoriasis GWAS

Author: Climer Sharlee
Templeton Alan R
Zhang Weixiong
Publication venue: Digital Commons@Becker
Publication date: 01/01/2014

Hundreds of genetic markers have shown associations with various complex diseases, yet the “missing heritability” remains alarmingly elusive. Combinatorial interactions may account for a substantial portion of this missing heritability, but their discoveries have been impeded by computational complexity and genetic heterogeneity. We present BlocBuster, a novel systems-level approach that efficiently constructs genome-wide, allele-specific networks that accurately segregate homogenous combinations of genetic factors, tests the associations of these combinations with the given phenotype, and rigorously validates the results using a series of unbiased validation methods. BlocBuster employs a correlation measure that is customized for single nucleotide polymorphisms and returns a multi-faceted collection of values that captures genetic heterogeneity. We applied BlocBuster to analyze psoriasis, discovering a combinatorial pattern with an odds ratio of 3.64 and Bonferroni-corrected p-value of 5.01×10^−16. This pattern was replicated in independent data, reflecting robustness of the method. In addition to improving prediction of disease susceptibility and broadening our understanding of the pathogenesis underlying psoriasis, these results demonstrate BlocBuster's potential for discovering combinatorial genetic associations within heterogeneous genome-wide data, thereby transcending the limiting “small effects” produced by individual markers examined in isolation.

Digital Commons@Becker