Mining Pure, Strict Epistatic Interactions from High-Dimensional Datasets: Ameliorating the Curse of Dimensionality
Authors
Richard E. Neapolitan
Xia Jiang
Xiaofeng Wang
Publication date
12 October 2012
Publisher
Public Library of Science (PLoS)
Abstract
Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets.
Methodology/Findings: A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects.
Conclusions/Significance: We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets.
© 2012 Jiang, Neapolitan
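The abstract describes the Bayesian score as the probability of the data given a SNP pattern but does not state the formula. As a rough illustration only, the sketch below computes a standard Bayesian-Dirichlet marginal likelihood for a binary disease node whose parents are the chosen SNPs. The uniform (K2-style) prior with all hyperparameters equal to 1, the function name `log_k2_score`, and the toy XOR dataset are assumptions made for this example and are not taken from the paper. The paper's bound is not reproduced here, since the abstract does not specify it.

```python
from collections import Counter
from math import lgamma

def log_k2_score(genotypes, phenotype, snps, n_states=2):
    """Log marginal likelihood of the phenotype given the selected SNPs as parents.

    Assumes a Dirichlet prior with every hyperparameter set to 1 (K2-style);
    the paper may use a different prior.
    """
    counts = Counter()   # (parent configuration, disease state) -> count
    totals = Counter()   # parent configuration -> count
    for row, d in zip(genotypes, phenotype):
        config = tuple(row[i] for i in snps)
        counts[(config, d)] += 1
        totals[config] += 1

    score = 0.0
    for config, n_j in totals.items():
        # Gamma(r) / Gamma(r + N_j), with r = number of disease states
        score += lgamma(n_states) - lgamma(n_states + n_j)
        for d in range(n_states):
            # Gamma(1 + N_jk) / Gamma(1); the denominator is 1
            score += lgamma(counts[(config, d)] + 1)
    return score

# Hypothetical toy data: disease status is the XOR of two binary loci, so
# neither locus shows a marginal effect on its own (strict epistasis).
genotypes = [(0, 0), (0, 1), (1, 0), (1, 1)] * 4
phenotype = [a ^ b for a, b in genotypes]

score_null = log_k2_score(genotypes, phenotype, ())
score_one = log_k2_score(genotypes, phenotype, (0,))
score_two = log_k2_score(genotypes, phenotype, (0, 1))
print(score_null, score_one, score_two)
```

On this toy dataset the 1-SNP pattern scores below the pattern with no SNPs at all, while the 2-SNP pattern scores highest. This is the situation the abstract describes: a greedy search guided only by 1-SNP scores would see no reason to expand either locus, which is why a bound on the score of expanded patterns is of interest.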