A correlation between the variance of phenotype explained by the GxE kinship matrix () and the inflation factor on GEI statistics (<i>λ</i><sub><i>GC</i></sub>) for HMDP GxE GWAS data.
<p>The correlation is plotted for the three methods: OLS (A), One RE (B), and Two RE (C). Each dot represents a phenotype. The red line is a regression line between <i>λ</i><sub><i>GC</i></sub> and the variance of phenotype explained by the GxE kinship matrix, and the Pearson correlation coefficient is indicated at the bottom right of the plot.</p>
A distribution of inflation factors of GEI statistics on simulated 1000 Genomes data.
<p>We simulate genotype data using two populations (GBR and TSI), where genetic kinship (<i>K</i>) and GxE kinship (<i>K</i><sup><i>D</i></sup>) explain 40% and 20% of phenotypic variance, respectively. We generate 100 simulation replicates and measure the inflation factors of three methods for each replicate: OLS, One RE, and Two RE. The y-axis is the inflation factor, and a horizontal red line is drawn at <i>λ</i><sub><i>GC</i></sub> = 1. We assume a dichotomous environmental status where the two populations have the same number of exposed and unexposed samples (<b>A</b>) and where one population has more exposed samples than the other population (<b>B</b>).</p>
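The inflation factor measured for each replicate can be illustrated with a minimal sketch. It assumes the conventional definition of <i>λ</i><sub><i>GC</i></sub> (the median observed 1-df chi-square statistic divided by the median of the null 1-df chi-square distribution) and takes p-values as input; the function name `genomic_inflation_factor` is hypothetical and is not taken from the study's software.

```python
from statistics import NormalDist, median


def genomic_inflation_factor(p_values):
    """Return lambda_GC for p-values from 1-df tests.

    Assumption: each p-value lies in (0, 1]; a p-value of exactly 0
    cannot be converted to a finite statistic and will raise an error.
    """
    std_normal = NormalDist()
    # A 1-df chi-square quantile equals the square of a standard normal
    # quantile, so each p-value maps back to the statistic that produced it.
    chi2_stats = [std_normal.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # Median of the null 1-df chi-square distribution (about 0.455).
    expected_median = std_normal.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median
```

With well-calibrated (uniform) p-values the function returns a value close to 1, which corresponds to the horizontal red line in the plot; values above 1 indicate inflation.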
A distribution of inflation factors of GEI statistics on human eQTL GxE GWAS data.
<p>After filtering out probes whose expression values do not follow the normal distribution, 8,666 probes are tested for associations with about 500,000 SNPs. Gene expression of each individual was measured with and without the Ox-PAPC treatment, which corresponds to the environmental exposure. About half of the individuals were chosen to represent samples exposed to the environment, and the remaining individuals represent unexposed samples. We compute the inflation factor for each probe and for each of the three methods. Boxplots are drawn with outliers (<b>A</b>) and without outliers (<b>B</b>).</p>
Variance of phenotype explained by the genetic kinship matrix (), variance of phenotype explained by the GxE kinship matrix () and inflation factors for the three methods on GEI statistics for each phenotype of HMDP GxE GWAS data.
<p>The full name of each phenotype is given in the Materials and Methods section. The GCTA software is used to estimate the phenotypic variance and its standard error for each phenotype.</p>
A correlation between the variance of phenotype explained by the GxE kinship matrix () and the inflation factor on GEI statistics (<i>λ</i><sub><i>GC</i></sub>) for human eQTL GxE GWAS data.
<p>The correlation is plotted for the three methods: OLS (<b>A</b>), One RE (<b>B</b>), and Two RE (<b>C</b>). Each dot represents a probe; the x-axis is the variance of phenotype explained by the GxE kinship matrix and the y-axis is <i>λ</i><sub><i>GC</i></sub>. We estimate this variance using the GCTA software, and only probes with are shown in the plots. The red line is a regression line between <i>λ</i><sub><i>GC</i></sub> and the variance of phenotype explained by the GxE kinship matrix, and the Pearson correlation coefficient is indicated at the top right of the plot.</p>
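The two summaries drawn on these plots, the Pearson correlation coefficient and the regression line, can be sketched as follows. This is a generic ordinary least-squares illustration, not the authors' plotting code; the function name `pearson_and_fit` and the argument names are hypothetical.

```python
from math import sqrt


def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired observations x and y.

    x would hold the per-probe variance explained by the GxE kinship
    matrix and y the per-probe lambda_GC (both assumed inputs).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares slope of y on x
    intercept = mean_y - slope * mean_x  # line passes through the means
    return r, slope, intercept
```

The function requires at least two distinct values in each of `x` and `y`; otherwise the denominators are zero.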
HDL SNPs with high confidence for causality.
<p>SNPs with posterior probability causality for HDL phenotype across the 37 risk loci (Results for TG/TC/LDL in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004722#pgen.1004722.s015" target="_blank">Tables S5</a>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004722#pgen.1004722.s016" target="_blank">S6</a>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004722#pgen.1004722.s017" target="_blank">S7</a>).</p><p> denotes a non-synonymous variant.</p>
Thresholding on posterior probabilities provides a principled way to assess utility.
<p>We demonstrate how utility curves are optimized by selecting SNPs that achieve a minimum posterior probability threshold at various benefit-to-cost ratios (R). The total number of SNPs selected at the maximum utility for R = (1.25, 1.5, 2, 5, 10, 20) is (29.8, 39.2, 52.4, 119.1, 221.4, 405.4), which identifies approximately (29.8, 35.6, 43.4, 65.33, 79.9, 91.8) causal variants, respectively.</p>
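The thresholding idea can be made concrete with a small sketch. The caption does not define the utility function, so the form used here is an assumption: each selected SNP costs one unit, each causal variant found yields a benefit of R units, and the expected number of causal variants among the selected SNPs is the sum of their posterior probabilities. The function names are hypothetical.

```python
def expected_utility(posteriors, threshold, ratio):
    """Expected utility of selecting every SNP with posterior >= threshold.

    Assumed form (not taken from the source): ratio * expected number of
    causal variants selected, minus the number of SNPs selected.
    """
    selected = [p for p in posteriors if p >= threshold]
    return ratio * sum(selected) - len(selected)


def best_threshold(posteriors, ratio):
    """Return the observed posterior value that maximizes expected utility."""
    candidates = sorted(set(posteriors))
    return max(candidates, key=lambda t: expected_utility(posteriors, t, ratio))
```

Under this assumed utility, a SNP is worth selecting whenever R times its posterior probability exceeds 1, so a larger benefit-to-cost ratio lowers the optimal threshold and selects more SNPs, which matches the trend reported in the caption.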
PAINTOR outperforms existing methodologies for fine-mapping.
<p>We simulated datasets consisting of 10 K genotypes over one hundred 10 KB loci using three synthetic functional annotations randomly dispersed at fixed percentages (2.2%, 2.2%, 30.7%). SNPs falling within these annotations were enriched for causal variants by factors of (9.5, 5.7, 3.65) relative to unannotated SNPs. We fixed the variance explained by these loci to and repeated the simulation 500 times. The top figure corresponds to the overall performance at causal loci (64 loci), with PAINTOR clearly achieving the greatest overall accuracy. The bottom figures correspond to loci with a single causal variant (an average of 34 per simulation) (left) or multiple causal variants (an average of 30 per simulation) (right). At loci where there is one true causal variant, fgwas achieves greater accuracy than PAINTOR because fgwas assumes the correct number of causal variants. We note that the version of PAINTOR that assumes a single causal variant yields results very similar to those of fgwas at loci with a single true causal variant (both require 2.63 SNPs per locus to identify 90% of the causal variants). However, at loci with multiple causal variants, the power of methods that assume a single causal variant is greatly reduced, leading to PAINTOR's superior overall accuracy.</p>
Accuracy of enrichment estimation for a synthetic annotation that contains 8-fold depletion to 8-fold enrichment of causal variants across simulations of fine-mapping data sets over 100 loci.
<p>Using a background and a synthetic functional annotation at a frequency of 1/3 (), we simulated with annotation effect sizes such that, in expectation, we attained approximately 100 causal variants while maintaining enrichment at a fixed point. We used the standard simulation parameters, fixing the variance explained by these 100 loci to 0.25 and using genotypes. We discarded simulations where fgwas failed to converge (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004722#s4" target="_blank">Methods</a>). Displayed here are the mean inferred Log2 enrichment estimates (± 1 SD) computed over 500 independent simulations at each enrichment level.</p>
Reduction in the number of SNPs in the 90% Credible Set after incorporating functional annotations.
<p>Shown here is the number of SNPs in the 90% credible set for each of the lipid phenotypes as estimated using PAINTOR. After marginally running PAINTOR on the entire pool of annotations, we selected the top five annotations for each trait and fit full trait-specific models on each of the densely imputed data sets. We compared PAINTOR with and without integration of functional annotation data. The magnitude of the reduction in the size of the credible set approximately mirrors what we observe in simulations.</p>
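As a rough illustration of how the size of such a set can be computed from posterior probabilities, the sketch below ranks SNPs by posterior probability and counts how many are needed to capture 90% of the total posterior mass. This definition is an assumption made for illustration and may differ from the exact procedure used with PAINTOR; the function name `credible_set_size` is hypothetical.

```python
def credible_set_size(posteriors, level=0.9):
    """Number of top-ranked SNPs needed to reach `level` of the posterior mass.

    Assumption: the set is built greedily from the highest posterior
    probability downward until the cumulative sum reaches level * total.
    """
    target = level * sum(posteriors)
    running = 0.0
    for count, p in enumerate(sorted(posteriors, reverse=True), start=1):
        running += p
        if running >= target:
            return count
    # Reached only for an empty input, where no SNPs are needed.
    return len(posteriors)
```

When functional annotations concentrate posterior probability on fewer SNPs, the cumulative sum reaches the target sooner and the set shrinks, which is the effect the figure reports.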