Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso
Authors
C Hoggart
C Yang
Can Yang
D Balding
D Velez
H Cordell
HJ Cordell
Hong Xue
J Friedman
J Marchini
J Millstein
J Moore
J Zhao
K Wang
L Meier
M Nelson
M Ritchie
M Yuan
PK Gregersen
Qiang Yang
R Culverhouse
R Tibshirani
S Dudek
S Mori
T Wu
W Li
Weichuan Yu
WTCCC
Xiang Wan
Y Cho
Y Zhang
Publication date
1 January 2010
Publisher
BioMed Central
Abstract
Background: Single nucleotide polymorphism (SNP) based association studies aim to identify SNPs associated with phenotypes such as complex diseases. The associated SNPs may influence disease risk individually (main effects) or jointly (epistatic interactions). For the analysis of high-throughput data, the main difficulty is that the number of SNPs far exceeds the number of samples. This difficulty is amplified when identifying interactions.

Results: In this paper, we propose an Adaptive Group Lasso (AGL) model for large-scale association studies. Our model enables us to analyze SNPs and their interactions simultaneously. We achieve this by introducing a sparsity constraint in our model, based on the fact that only a small fraction of SNPs is disease-associated. To reduce the number of false positive findings, we develop an adaptive reweighting scheme to enhance sparsity. In addition, our method treats SNPs and their interactions as factors and identifies them in a grouped manner. This makes it flexible enough to analyze various disease models, especially for interaction detection. However, because model fitting is computationally intensive when millions of interaction terms need to be searched, our method needs to be combined with filtering methods when applied to genome-wide data for detecting interactions.

Conclusion: By using a wide range of simulated datasets and a real dataset from WTCCC, we demonstrate the advantages of our method. © 2010 Yang et al; licensee BioMed Central Ltd
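The three ingredients named in the abstract (a sparsity penalty, selection of whole groups, and adaptive reweighting) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a squared-error loss, a plain proximal gradient solver, and the reweighting rule w_g = 1 / (||beta_g|| + eps), none of which are specified in the abstract. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def group_soft_threshold(z, threshold):
    """Shrink the group vector z toward zero; zero it out if its norm <= threshold."""
    norm = np.linalg.norm(z)
    if norm <= threshold:
        return np.zeros_like(z)
    return (1.0 - threshold / norm) * z


def fit_group_lasso(X, y, groups, lam, weights, n_iter=500):
    """Proximal gradient descent for the weighted group lasso objective
    (1 / 2n) * ||y - X b||^2 + lam * sum_g weights[g] * ||b_g||_2.
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # The penalty is separable over groups, so each group is thresholded on its own.
        for g, idx in enumerate(groups):
            beta[idx] = group_soft_threshold(z[idx], step * lam * weights[g])
    return beta


def fit_adaptive_group_lasso(X, y, groups, lam, n_rounds=3, eps=1e-3):
    """Refit several times, penalizing groups with small estimated norms more heavily.
    The reweighting rule is an assumption made for this sketch."""
    weights = np.ones(len(groups))
    beta = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        beta = fit_group_lasso(X, y, groups, lam, weights)
        norms = np.array([np.linalg.norm(beta[idx]) for idx in groups])
        weights = 1.0 / (norms + eps)
    return beta


# Synthetic example: 4 groups of 3 columns each; only the first group affects y.
# In a SNP setting, a group would hold the dummy-coded columns of one SNP
# or of one SNP-SNP interaction (an illustrative assumption, not real genotype data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12))
groups = [list(range(3 * g, 3 * g + 3)) for g in range(4)]
beta_true = np.zeros(12)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(200)

beta_hat = fit_adaptive_group_lasso(X, y, groups, lam=0.1)
selected = [g for g, idx in enumerate(groups) if np.linalg.norm(beta_hat[idx]) > 0]
print("selected groups:", selected)
```

Because the penalty acts on the Euclidean norm of each group, all coefficients of a group enter or leave the model together, which is what selecting SNPs and interactions "in a grouped manner" amounts to in this simplified setting. The sketch enumerates every group explicitly, so it does not address the genome-wide case, where the abstract notes that a filtering step is needed first.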
Available Versions
Crossref (last updated on 18/03/2019)
Hong Kong University of Science and Technology Institutional Repository, oai:repository.hkust.edu.hk:17... (last updated on 30/07/2022)
Springer - Publisher Connector (last updated on 28/04/2017)
Springer - Publisher Connector (last updated on 05/06/2019)