Search CORE

185 research outputs found

AntEpiSeeker: detecting epistatic interactions for case-control studies using a two-stage ant colony optimization algorithm

Author: C Yang
CS Greene
CS Greene
CT Tsai
EG Talbi
HJ Cordell
HJ Cordell
J Marchini
J Millstein
Kelly Robbins
KR Robbins
M Dorigio
M Dorigo
M Dorigo
M Dorigo
MD Ritchie
MR Nelson
R Culverhouse
R Jiang
RJ Klein
RM Plenge
Romdhane Rekaya
S Raychaudhuri
T Zeng
The International HapMap Consortium
V Maniezzo
Wellcome Trust Case Control Consortium
X Wan
Xinyu Liu
Y Zhang
YM Cho
Yupeng Wang
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background Epistatic interactions of multiple single nucleotide polymorphisms (SNPs) are now believed to affect individual susceptibility to common diseases. The detection of such interactions, however, is a challenging task in large scale association studies. Ant colony optimization (ACO) algorithms have been shown to be useful in detecting epistatic interactions. Findings AntEpiSeeker, a new two-stage ant colony optimization algorithm, has been developed for detecting epistasis in a case-control design. Based on some practical epistatic models, AntEpiSeeker has performed very well. Conclusions AntEpiSeeker is a powerful and efficient tool for large-scale association studies and can be downloaded from <url>http://nce.ads.uga.edu/~romdhane/AntEpiSeeker/index.html</url>.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evaluation of Existing Methods for High-Order Epistasis Detection

Author: Carvajal-Rodriguez Antonio
González-Domínguez Jorge
Martín María J.
Ponte-Fernández Christian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/10/2020
Field of study

[Abstract] Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.10.13039/501100010801-Xunta de Galicia (Grant Number: ED431C2016-037, ED431C2017/04 and ED431G2019/01) 10.13039/501100003176-Ministerio de Educacion Cultura y Deporte (Grant Number: FPU16/01333) 10.13039/501100003329-Ministerio de Economia y Competitividad (Grant Number: CGL2016-75482-P, PID2019-104184RB-I00, AEI/FEDER/EU, 10.13039/50110 and TIN2016-75845-P)Xunta de Galicia; ED431C2016-037Xunta de Galicia; ED431G2019/01Xunta de Galicia; ED431C 2017/0

Repositorio da Universidade da Coruña

Performance analysis of novel methods for detecting epistasis

Author: Liu Dan
Shang Junliang
Sun Yan
Ye Daojun
Yin Yaling
Zhang Junying
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Springer - Publisher Connector

PubMed Central

High-Order Epistasis Detection in High Performance Computing Systems

Author: Ponte-Fernández Christian
Publication venue
Publication date: 01/01/2022
Field of study

Programa Oficial de Doutoramento en Investigación en Tecnoloxías da Información. 524V01[Resumo] Nos últimos anos, os estudos de asociación do xenoma completo (Genome-Wide Association Studies, GWAS) están a gañar moita popularidade de cara a buscar unha explicación xenética á presenza ou ausencia de certas enfermidades nos humanos.Hai un consenso nestes estudos sobre a existencia de interaccións xenéticas que condicionan a expresión de enfermidades complexas, un fenómeno coñecido como epistasia. Esta tese céntrase no estudo deste fenómeno empregando a computación de altas prestacións (High-Performance Computing, HPC) e dende a súa perspectiva estadística: a desviación da expresión dun fenotipo como a suma dos efectos individuais de múltiples variantes xenéticas. Con este obxectivo desenvolvemos unha primeira ferramenta, chamada MPI3SNP, que identifica interaccións de tres variantes a partir dun conxunto de datos de entrada. MPI3SNP implementa unha busca exhaustiva empregando un test de asociación baseado na Información Mutua, e explota os recursos de clústeres de CPUs ou GPUs para acelerar a busca. Coa axuda desta ferramenta avaliamos o estado da arte da detección de epistasia a través dun estudo que compara o rendemento de vintesete ferramentas. A conclusión máis importante desta comparativa é a incapacidade dos métodos non exhaustivos de atopar interacción ante a ausencia de efectos marxinais (pequenos efectos de asociación das variantes individuais que participan na epistasia). Por isto, esta tese continuou centrándose na optimización da busca exhaustiva de epistasia. Por unha parte, mellorouse a eficiencia do test de asociación a través dunha implantación vectorial do mesmo. Por outro lado, creouse un algoritmo distribuído que implementa unha busca exhaustiva capaz de atopar epistasia de calquera orden. Estes dous fitos lógranse en Fiuncho, unha ferramenta que integra toda a investigación realizada, obtendo un rendemento en clústeres de CPUs que supera a todas as súas alternativas no estado da arte. Adicionalmente, desenvolveuse unha libraría para simular escenarios biolóxicos con epistasia chamada Toxo. Esta libraría permite a simulación de epistasia seguindo modelos de interacción xenética existentes para orde alto.[Resumen] En los últimos años, los estudios de asociación del genoma completo (Genome- Wide Association Studies, GWAS) están ganando mucha popularidad de cara a buscar una explicación genética a la presencia o ausencia de ciertas enfermedades en los seres humanos. Existe un consenso entre estos estudios acerca de que muchas enfermedades complejas presentan interacciones entre los diferentes genes que intervienen en su expresión, un fenómeno conocido como epistasia. Esta tesis se centra en el estudio de este fenómeno empleando la computación de altas prestaciones (High-Performance Computing, HPC) y desde su perspectiva estadística: la desviación de la expresión de un fenotipo como suma de los efectos de múltiples variantes genéticas. Para ello se ha desarrollado una primera herramienta, MPI3SNP, que identifica interacciones de tres variantes a partir de un conjunto de datos de entrada. MPI3SNP implementa una búsqueda exhaustiva empleando un test de asociación basado en la Información Mutua, y explota los recursos de clústeres de CPUs o GPUs para acelerar la búsqueda. Con la ayuda de esta herramienta, hemos evaluado el estado del arte de la detección de epistasia a través de un estudio que compara el rendimiento de veintisiete herramientas. La conclusión más importante de esta comparativa es la incapacidad de los métodos no exhaustivos de localizar interacciones ante la ausencia de efectos marginales (pequeños efectos de asociación de variantes individuales pertenecientes a una relación epistática). Por ello, esta tesis continuó centrándose en la optimización de la búsqueda exhaustiva. Por un lado, se mejoró la eficiencia del test de asociación a través de una implementación vectorial del mismo. Por otra parte, se diseñó un algoritmo distribuido que implementa una búsqueda exhaustiva capaz de encontrar relaciones epistáticas de cualquier tamaño. Estos dos hitos se logran en Fiuncho, una herramienta que integra toda la investigación realizada, obteniendo un rendimiento en clústeres de CPUs que supera a todas sus alternativas del estado del arte. A mayores, también se ha desarrollado una librería para simular escenarios biológicos con epistasia llamada Toxo. Esta librería permite la simulación de epistasia siguiendomodelos de interacción existentes para orden alto.[Abstract] In recent years, Genome-Wide Association Studies (GWAS) have become more and more popular with the intent of finding a genetic explanation for the presence or absence of particular diseases in human studies. There is consensus about the presence of genetic interactions during the expression of complex diseases, a phenomenon called epistasis. This thesis focuses on the study of this phenomenon, employingHigh- Performance Computing (HPC) for this purpose and from a statistical definition of the problem: the deviation of the expression of a phenotype from the addition of the individual contributions of genetic variants. For this purpose, we first developedMPI3SNP, a programthat identifies interactions of three variants froman input dataset. MPI3SNP implements an exhaustive search of epistasis using an association test based on the Mutual Information and exploits the resources of clusters of CPUs or GPUs to speed up the search. Then, we evaluated the state-of-the-art methods with the help of MPI3SNP in a study that compares the performance of twenty-seven tools. The most important conclusion of this study is the inability of non-exhaustive approaches to locate epistasis in the absence of marginal effects (small association effects of individual variants that partake in an epistasis interaction). For this reason, this thesis continued focusing on the optimization of the exhaustive search. First, we improved the efficiency of the association test through a vector implementation of this procedure. Then, we developed a distributed algorithm capable of locating epistasis interactions of any order. These two milestones were achieved in Fiuncho, a program that incorporates all the research carried out, obtaining the best performance in CPU clusters out of all the alternatives of the state-of-the-art. In addition, we also developed a library to simulate particular scenarios with epistasis called Toxo. This library allows for the simulation of epistasis that follows existing interaction models for high-order interactions

Repositorio da Universidade da Coruña

Discovering Higher-order SNP Interactions in High-dimensional Genomic Data

Author: Uppu Suneetha
Publication venue: Curtin University
Publication date: 01/01/2018
Field of study

In this thesis, a multifactor dimensionality reduction based method on associative classification is employed to identify higher-order SNP interactions for enhancing the understanding of the genetic architecture of complex diseases. Further, this thesis explored the application of deep learning techniques by providing new clues into the interaction analysis. The performance of the deep learning method is maximized by unifying deep neural networks with a random forest for achieving reliable interactions in the presence of noise

espace@Curtin

Grammatical evolution decision trees for detecting gene-gene interactions

Author: AA Motsinger
AA Motsinger-Reif
AA Motsinger-Reif
Alison A Motsinger-Reif
BA Shepherd
BLG Miller
CS Greene
D Altshuler
DB Goldstein
DR Velez
E Alpaydin
E Cantu-Paz
HJ Cordell
IH Witten
J Koza
J Koza
JH Moore
JH Moore
JH Moore
JH Moore
JH Moore
JN Hirschhorn
JR Quinlan
JS Aguilar-Ruiz
L Brieman
LGL Devroy
M Hall
M O'Neill
M O'Neill
MD Ritchie
MR Nelson
Nicholas E Hardison
R Bellman
R Culverhouse
RJ Neuman
SM Dudek
Stacey J Winham
Sushamna Deodhar
TJ Hastie
W Li
X Yao
Publication venue: BioMed Central
Publication date: 01/11/2010
Field of study

Abstract Background A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases are due to a myriad of factors including environmental exposures and complex genetic risk models, including gene-gene interactions. Such epistatic models present an important analytical challenge, requiring that methods perform not only statistical modeling, but also variable selection to generate testable genetic model hypotheses. This challenge is amplified by recent advances in genotyping technology, as the number of potential predictor variables is rapidly increasing. Methods Decision trees are a highly successful, easily interpretable data-mining method that are typically optimized with a hierarchical model building approach, which limits their potential to identify interacting effects. To overcome this limitation, we utilize evolutionary computation, specifically grammatical evolution, to build decision trees to detect and model gene-gene interactions. In the current study, we introduce the Grammatical Evolution Decision Trees (GEDT) method and software and evaluate this approach on simulated data representing gene-gene interaction models of a range of effect sizes. We compare the performance of the method to a traditional decision tree algorithm and a random search approach and demonstrate the improved performance of the method to detect purely epistatic interactions. Results The results of our simulations demonstrate that GEDT has high power to detect even very moderate genetic risk models. GEDT has high power to detect interactions with and without main effects. Conclusions GEDT, while still in its initial stages of development, is a promising new approach for identifying gene-gene interactions in genetic association studies.</p

Crossref

Directory of Open Access Journals

PubMed Central

Spatially Uniform ReliefF (SURF) for computationally-efficient filtering of gene-gene interactions

Author: A Motsinger
AL Tyler
B McKinney
BA McKinney
Casey S Greene
CS Greene
CS Greene
CS Greene
I Kononenko
J Hardy
J Jakobsdottir
Jason H Moore
Jeff Kiralis
JH Moore
JH Moore
JH Moore
JH Moore
JH Moore
JH Moore
JN Hirschhorn
K Kira
L Beretta
M Robnik-Sikonja
M Robnik-Sikonja
MI McCarthy
MM Iles
Nadia M Penrod
P Kraft
RR Sokal
U Finckh
Publication venue: BioMed Central
Publication date: 01/01/2009
Field of study

Abstract Background Genome-wide association studies are becoming the de facto standard in the genetic analysis of common human diseases. Given the complexity and robustness of biological networks such diseases are unlikely to be the result of single points of failure but instead likely arise from the joint failure of two or more interacting components. The hope in genome-wide screens is that these points of failure can be linked to single nucleotide polymorphisms (SNPs) which confer disease susceptibility. Detecting interacting variants that lead to disease in the absence of single-gene effects is difficult however, and methods to exhaustively analyze sets of these variants for interactions are combinatorial in nature thus making them computationally infeasible. Efficient algorithms which can detect interacting SNPs are needed. ReliefF is one such promising algorithm, although it has low success rate for noisy datasets when the interaction effect is small. ReliefF has been paired with an iterative approach, Tuned ReliefF (TuRF), which improves the estimation of weights in noisy data but does not fundamentally change the underlying ReliefF algorithm. To improve the sensitivity of studies using these methods to detect small effects we introduce Spatially Uniform ReliefF (SURF). Results SURF's ability to detect interactions in this domain is significantly greater than that of ReliefF. Similarly SURF, in combination with the TuRF strategy significantly outperforms TuRF alone for SNP selection under an epistasis model. It is important to note that this success rate increase does not require an increase in algorithmic complexity and allows for increased success rate, even with the removal of a nuisance parameter from the algorithm. Conclusion Researchers performing genetic association studies and aiming to discover gene-gene interactions associated with increased disease susceptibility should use SURF in place of ReliefF. For instance, SURF should be used instead of ReliefF to filter a dataset before an exhaustive MDR analysis. This change increases the ability of a study to detect gene-gene interactions. The SURF algorithm is implemented in the open source Multifactor Dimensionality Reduction (MDR) software package available from <url>http://www.epistasis.org</url>.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Ant Colony Optimisation for Exploring Logical Gene-Gene Associations in Genome Wide Association Studies.

Author: Frayling TM
Keedwell E
Sapin E
Publication venue: Copicentro Editorial
Publication date: 31/03/2016
Field of study

In this paper a search for the logical variants of gene-gene interactions in genome-wide association study (GWAS) data using ant colony optimisation is proposed. The method based on stochastic algorithms is tested on a large established database from the Wellcome Trust Case Control Consortium and is shown to discover logical operations between combinations of single nucleotide polymorphisms that can discriminate Type II diabetes. A variety of logical combinations are explored and the best discovered associations are found within reasonable computational time and are shown to be statistically significantThis study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113. The work contained in this paper was funded by an EPSRC First Grant (EP/J007439/1) and we acknowledge their kind support

Open Research Exeter

Ant colony algorithm for analysis of gene interaction in high-dimensional association data

Author: BARENDSE W.
BENJAMINI Y.
COUTINHO A.M.
DING Y.P.
DORIGIO M.
GONZALEZ J.R.
HUGOT J.P.
Kelly Robbins
KOOPERBURGE C.
KREIGER M.J.B.
MARCHINI J.
PICKRELL J.
RESSOM H.W.
ROBBINS K.R.
Romdhane Rekaya
SHYMYGELSKA A.
SINNWELL J.P.
WOON P.Y.
Publication venue: 'FapUNIFESP (SciELO)'
Publication date
Field of study

Crossref