35 research outputs found

Different types of correlation coefficients when the variables X and Y are continuous, binary, or ordinal.

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

FigShare

The Q-Q plot of observed p-values versus expected p-values based on the multivariate test.

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

FigShare

Simulation results for the empirical power with varied correlations ρ and genetic effect sizes (e1, …, e6).

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

The Latent column is the power with the proposed method applied to 6 latent phenotypes. The Mixed column is the power with the proposed method applied to 6 observed phenotypes. The Dichotomous column is the power when the observed phenotypes are dichotomized. The Continuous column is the power when the phenotypes in mixed measurements are treated as continuous variables. The number of iterations is 10^6 when all genetic effect sizes are zero and 10^4 for other situations.

FigShare

The correlations among the 6 FTND items.

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

FigShare

The distributions of phenotypes: FTND 1, FTND 2, …, FTND 6, FTND total, and FTND total (Binary).

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

FTND total (Binary) is derived from FTND total according to whether the FTND total score is less than 6 or not.

FigShare

The relationship between the covariance cov[−2log(pu), −2log(pv)] and the correlation ρ.

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

The title in each panel indicates the types of data simulated. The solid curve in each panel corresponds to our covariance estimates using Eq (3) (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0169893#pone.0169893.e005). The dotted curves are the true covariances calculated from the simulated data.

FigShare

The Q-Q plots of observed p-values versus expected p-values based on the marginal tests.

Author: Anne Buu (3636628)
James J. Yang (283590)
L. Keoki Williams (156461)

FigShare

Comparison of empirical Power and Type-1 error rates of gene-based association tests for a quantitative trait simulated under models with interactions.

Author: Albert M. Levin (111384)
Badri Padhukasahasram (59074)
Chandan K. Reddy (749750)
Esteban G. Burchard (156540)
L. Keoki Williams (156461)

TAS denotes the number of trait-associated SNPs. The machine learning test is based on ensemble learning variation 1 with the following components: multiple linear regression, a support vector machine with a linear kernel, and random forests with mtry = 1 and ntree = 1000.

FigShare

Comparison of empirical power and Type-1 error rates of gene-based association tests for simulated datasets assuming linkage equilibrium.

Author: Albert M. Levin (111384)
Badri Padhukasahasram (59074)
Chandan K. Reddy (749750)
Esteban G. Burchard (156540)
L. Keoki Williams (156461)

DSL denotes the number of disease susceptibility markers. The machine learning test is based on ensemble learning variation 1 with the following components: logistic regression, a support vector machine with a linear kernel, and random forests with mtry = 1 and ntree = 1000.

FigShare

Powerful Tests for Multi-Marker Association Analysis Using Ensemble Learning

Author: Albert M. Levin (111384)
Badri Padhukasahasram (59074)
Chandan K. Reddy (749750)
Esteban G. Burchard (156540)
L. Keoki Williams (156461)
Publication date: 01/01/2015

Multi-marker approaches have received a lot of attention recently in genome-wide association studies and can enhance power to detect new associations under certain conditions. Gene-, gene-set- and pathway-based association tests are increasingly being viewed as useful supplements to the more widely used single-marker association analysis, which has successfully uncovered numerous disease variants. A major drawback of single-marker methods is that they do not look at the joint effects of multiple genetic variants, which individually may have weak or moderate signals. Here, we describe novel tests for multi-marker association analyses that are based on phenotype predictions obtained from machine learning algorithms. Instead of assuming a linear or logistic regression model, we propose the use of ensembles of diverse machine learning algorithms for prediction. We show that phenotype predictions obtained from ensemble learning algorithms provide a new framework for multi-marker association analysis. They can be used for constructing tests for the joint association of multiple variants, adjusting for covariates and testing for the presence of interactions. To demonstrate the power and utility of this new approach, we first apply our method to simulated SNP datasets. We show that the proposed method has the correct Type-1 error rates and can be considerably more powerful than alternative approaches in some situations. Then, we apply our method to previously studied asthma-related genes in 2 independent asthma cohorts to conduct association tests.

Crossref

Henry Ford Health System Scholarly Commons

Directory of Open Access Journals

PubMed Central

FigShare