Machine learning in genome-wide association studies

Authors: Amos; Amos; Arshadi; Breiman; Breiman; Breiman; Croiseau; Cupples; D'Angelo; Dasarathy; Dietterich; Díaz-Uriarte; Easton; Frazer; Freund; Friedman; Friedman; García-Magariños; González-Recio; Hastie; Heckerman; Hoerl; Kim; Kraja; Lettre; Malo; Marchini; Meier; Mohlke; Park; Pearl; Plenge; Ripley; Rumelhart; Samani; Schwarz; Schwarz; Sebastiani; Stassen; Strobl; Sun; Sun; Sun; Tang; Tibshirani; Tomita; Vapnik; Wan; Wang; Wu; Yang; Yuan; Ziegler

Publication date: 1 January 2009
Publisher: Wiley

Abstract

Recently, genome-wide association studies have substantially expanded our knowledge about genetic variants that influence the susceptibility to complex diseases. Although standard statistical tests for each single-nucleotide polymorphism (SNP) separately are able to capture main genetic effects, different approaches are necessary to identify SNPs that influence disease risk jointly or in complex interactions. Experimental and simulated genome-wide SNP data provided by the Genetic Analysis Workshop 16 afforded an opportunity to analyze the applicability and benefit of several machine learning methods. Penalized regression, ensemble methods, and network analyses resulted in several new findings while known and simulated genetic risk variants were also identified. In conclusion, machine learning approaches are promising complements to standard single- and multi-SNP analysis methods for understanding the overall genetic architecture of complex human diseases. However, because they are not optimized for genome-wide SNP data, improved implementations and new variable selection procedures are required.

Genet. Epidemiol. 33 (Suppl. 1):S51–S57, 2009. © 2009 Wiley-Liss, Inc.

Peer Reviewed

http://deepblue.lib.umich.edu/bitstream/2027.42/64533/1/20473_ftp.pd

Available Versions

Deep Blue at the University of Michigan

oai:deepblue.lib.umich.edu:202...

Last time updated on 25/05/2012

Crossref

Last time updated on 25/03/2021