Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)
Publication venue
Publication date: 01/01/2014
Field of study

Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, increasingly more markers are being imputed well or better, but at the same time even more markers with relatively low minor allele frequency (MAF) are being imputed with low imputation quality. Here, we propose new methods that incorporate imputation uncertainty for downstream association analysis, with improved power and/or computational efficiency. We consider two scenarios: I) when posterior probabilities of all potential genotypes are estimated; and II) when only the one-dimensional summary statistic, imputed dosage, is available. For scenario I, we have developed an expectation-maximization likelihood-ratio test (EM-LRT) for association based on posterior probabilities. When only imputed dosages are available (scenario II), we first sample the genotype probabilities from their posterior distribution given the dosages, and then apply the EM-LRT to the sampled probabilities. Our simulations show that the type I error of the proposed EM-LRT methods is protected under both scenarios. Compared with existing methods, EM-LRT-Prob (for scenario I) offers optimal statistical power across a wide spectrum of MAF and imputation quality. EM-LRT-Dose (for scenario II) achieves a level of statistical power similar to that of EM-LRT-Prob and outperforms the standard Dosage method, especially for markers with relatively low MAF or imputation quality. Applications to two real data sets, the Cebu Longitudinal Health and Nutrition Survey study and the Women’s Health Initiative Study, provide further support for the validity and efficiency of our proposed methods.
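The scenario I idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a quantitative trait following a normal linear model, y = b0 + b1·g + noise, with the unobserved genotype g in {0, 1, 2} weighted by its imputed posterior probabilities. The function names (`normal_pdf`, `log_lik`, `em_lrt`) and the stopping rule are hypothetical choices made for illustration.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_lik(y, probs, b0, b1, var):
    """Observed-data log-likelihood: each sample is a 3-component mixture
    over genotypes 0/1/2, weighted by its imputed posterior probabilities."""
    return sum(
        math.log(sum(p * normal_pdf(yi, b0 + b1 * g, var) for g, p in enumerate(pi)))
        for yi, pi in zip(y, probs)
    )

def em_lrt(y, probs, max_iter=500, tol=1e-10):
    """Return (LRT statistic, p-value, estimated effect b1).
    Assumes the trait is not fitted perfectly (residual variance > 0)."""
    n = len(y)
    sy = sum(y)
    ybar = sy / n
    var0 = sum((yi - ybar) ** 2 for yi in y) / n
    # Null model (b1 = 0): the mixture collapses to one normal, so the MLE is closed form.
    ll0 = log_lik(y, probs, ybar, 0.0, var0)
    b0, b1, var, ll = ybar, 0.0, var0, ll0  # start EM at the null fit
    for _ in range(max_iter):
        # E-step: posterior genotype weights given trait and current parameters.
        weights = []
        for yi, pi in zip(y, probs):
            w = [p * normal_pdf(yi, b0 + b1 * g, var) for g, p in enumerate(pi)]
            total = sum(w)
            weights.append([wg / total for wg in w])
        # M-step: weighted least squares over (sample, genotype) pairs.
        sx = sum(g * wg for w in weights for g, wg in enumerate(w))
        sxx = sum(g * g * wg for w in weights for g, wg in enumerate(w))
        sxy = sum(yi * g * wg for yi, w in zip(y, weights) for g, wg in enumerate(w))
        denom = n * sxx - sx * sx
        if denom < 1e-12:  # no genotype variation: slope is not identifiable
            break
        b1 = (n * sxy - sx * sy) / denom
        b0 = (sy - b1 * sx) / n
        var = sum(
            wg * (yi - b0 - b1 * g) ** 2
            for yi, w in zip(y, weights) for g, wg in enumerate(w)
        ) / n
        new_ll = log_lik(y, probs, b0, b1, var)
        converged = new_ll - ll < tol
        ll = new_ll
        if converged:
            break
    stat = max(0.0, 2.0 * (ll - ll0))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, b1
```

Starting the EM iterations at the null fit means the alternative log-likelihood can only increase from the null value, so the statistic is non-negative. When every probability vector is one-hot (genotypes known with certainty), the sketch reduces to the likelihood-ratio test for ordinary linear regression.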

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

FigShare

Rejection Sampling vs. Dosage Approximation for Estimation.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)

MAF: Minor allele frequency. MSE: Mean square error.

FigShare

One-sample T-test for Type I Error.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)

*: P-value < 5E-4.

FigShare

Associated Variants with R² ≤ 0.3 in the CLHNS Study.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)

*: Coordinates are in genome build 37. Bold with †: The most significant p-value among the four methods. Bold without †: The second most significant p-value among the four methods. #: Truth was established by regressing phenotype on true genotypes.

FigShare

Type I Error at Significance Level = 5E-02.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)


FigShare

Computing Time: Mixture Method vs EM-LRT-Prob.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)

The computing time of the Mixture method and our proposed EM-LRT-Prob method is displayed across a range of sample sizes. For each sample size, computing time is averaged across 2,000 simulated datasets.

FigShare

Q–Q Plot for Null Variants with Low Imputation Quality in the CLHNS Study.

Author: Karen L. Mohlke (143380)
Kuan-Chieh Huang (657376)
Leslie A. Lange (215336)
Mengjie Chen (483583)
Wei Sun (93580)
Ying Wu (19057)
Yun Li (692280)

The observed (Y-axis) vs. expected (X-axis) –log10[p-values] are shown for 1,135 SNPs in the CLHNS data set. These SNPs are considered to be under the null hypothesis (true p-value > 5×10⁻⁶), and all have low imputation quality (R² < 0.3).
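As a rough illustration of how the points of such a Q–Q plot can be computed, the sketch below pairs sorted observed –log10 p-values with expected values under a uniform null. The plotting position (i − 0.5)/n is one common convention and an assumption here, as is the helper name `qq_points`; the source does not state which convention was used.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values, in ascending order.

    Expected values assume p-values are uniform on (0, 1) under the null,
    using the plotting position (i - 0.5) / n for the i-th smallest p-value.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))
```

Points lying close to the diagonal indicate that the observed p-values behave as expected under the null; systematic departure above the diagonal suggests inflation.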

FigShare