22 research outputs found

    Simulation results for comparing polygenic risk prediction methods and different high priority SNP sets.

    Quantitative traits were simulated conditioning on the genotypes of LD-pruned SNPs in a lung cancer GWAS with 10,000 discovery samples and 1,924 validation samples. For each simulation, we used 5,000 causal SNPs and 9,940 high priority (HP) SNPs (either randomly selected or SNPs related to conserved regions). Δ denotes the enrichment fold change of the HP SNPs. On the x-axis, “1D” denotes 1D PRS without winner’s curse correction; “1D-LASSO(MLE)” denotes 1D PRS with lasso-type (MLE) correction; “2D-random” indicates 2D PRS with HP SNP sets randomly selected from the LD-pruned SNPs in the genome; “2D-CR” indicates 2D PRS using SNPs in conserved regions as HP SNPs.

    Summary of genes in the aromatic amine metabolism pathway used for pathway-based analysis of multi-study bladder cancer GWAS.

    1. Number of SNPs genotyped in the gene region (20 kb 5′ upstream and 10 kb 3′ downstream from the gene's coding region).
    2. The SNP representing the gene in the pathway analysis after the removal of SNPs with heterogeneous effects.
    3. The rank of the SNP among all SNPs in the gene's region based on their p-values.
    4. Minor allele frequency among controls.
    5. Per allele odds ratios and 95% confidence intervals from logistic regression models adjusting for age, sex, study center, DNA source, and smoking.
    6. 1 d.f. trend test.

    Theoretic investigation of prediction performance and optimal thresholds for SNP selection in 2D PRS.

    The theoretical calculation assumes M = 53,163 independent SNPs, of which 5,000 are causal for a binary trait, similar to the simulation studies. The high-prior (HP) SNP set has 5,000 SNPs and the low-prior (LP) SNP set has 48,163 SNPs. Δ is the enrichment fold of HP SNPs in the causal SNP set. (A) The prediction AUC for 1D PRS and 2D PRS. (B) The optimal P-value thresholds for including HP and LP SNPs in 2D PRS. For both plots, the x-coordinate is the discovery sample size, assuming equal numbers of cases and controls.

    Summary of genes in the NAD metabolism pathways used for pathway-based analysis of multi-study bladder cancer GWAS.

    1. Number of SNPs genotyped in the gene region (20 kb 5′ upstream and 10 kb 3′ downstream from the gene's coding region).
    2. The SNP representing the gene in the pathway analysis after the removal of SNPs with heterogeneous effects.
    3. The rank of the SNP among all SNPs in the gene's region based on their p-values.
    4. Minor allele frequency among controls.
    5. Per allele odds ratios and 95% confidence intervals from logistic regression models adjusting for age, sex, study center, DNA source, and smoking.
    6. 1 d.f. trend test.

    Genetic risk prediction for type-2 diabetes.

    PRS models were built based on the summary statistics from a meta-analysis of DIAGRAM consortium and GERA data (17,802 cases and 105,109 controls in total) and validated in an independent set of 1,500 cases and 1,500 controls in GERA. (A) Prediction R² (observational scale) for 1D PRS with or without winner’s curse correction. “NO”: no winner’s curse correction for association coefficients; “Lasso”: regression coefficients were modified by a lasso-type correction; “MLE”: association coefficients were modified by maximizing a likelihood function conditioning on selection. (B) Quantile-quantile plot of −log10(P) for high priority (HP) SNPs vs. low priority (LP) SNPs. SNPs were pruned to have pairwise r² ≤ 0.1. Here, the HP SNPs were eSNPs/meSNPs in adipose tissue or SNPs related to the H3K4me3 mark in a pancreatic islet cell line, with data downloaded from the ROADMAP project. The HP SNPs were strongly enriched in the discovery data. (C) Prediction R² for 2D PRS with lasso-type winner’s curse correction. The SNP set was the same as in (B). The best prediction (R² = 3.53%) was achieved when we included HP SNPs with P ≤ 0.03 and LP SNPs with P ≤ 0.005. (D) The prediction R², the area under the curve (AUC), and the significance levels for tests of whether an alternative PRS was better than the standard 1D PRS.
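The two-threshold selection described in (C) can be illustrated with a minimal sketch. It assumes a PRS is the sum of effect estimate times allele count over the selected SNPs, with one P-value cutoff for HP SNPs and another for LP SNPs. The function name `prs_2d`, its arguments, and the toy data are hypothetical and not taken from the source; no winner's curse correction is applied here.

```python
from typing import Sequence


def prs_2d(
    genotypes: Sequence[int],
    betas: Sequence[float],
    pvalues: Sequence[float],
    is_hp: Sequence[bool],
    p_hp: float = 0.03,
    p_lp: float = 0.005,
) -> float:
    """Score one individual with separate P-value cutoffs for HP and LP SNPs.

    Assumption: the score is sum(beta * allele_count) over SNPs whose
    discovery P-value passes the cutoff of their own set (HP or LP).
    """
    score = 0.0
    for g, b, p, hp in zip(genotypes, betas, pvalues, is_hp):
        threshold = p_hp if hp else p_lp  # pick the cutoff for this SNP's set
        if p <= threshold:
            score += b * g
    return score


# Toy data (hypothetical): four SNPs for a single individual.
genotypes = [1, 1, 2, 1]            # allele counts
betas = [0.10, -0.20, 0.05, 0.30]   # discovery effect estimates
pvalues = [0.02, 0.004, 0.01, 0.5]  # discovery P-values
is_hp = [True, False, False, True]  # high-priority membership

score = prs_2d(genotypes, betas, pvalues, is_hp)
```

In this toy case the first SNP passes the HP cutoff and the second passes the LP cutoff; the third would pass the HP cutoff but is an LP SNP, so it is excluded.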

    Summary of genes in the Clathrin-mediated vesicle pathways used for pathway-based analysis of multi-study bladder cancer GWAS.

    1. Number of SNPs genotyped in the gene region (20 kb 5′ upstream and 10 kb 3′ downstream from the gene's coding region).
    2. The SNP representing the gene in the pathway analysis after the removal of SNPs with heterogeneous effects.
    3. The rank of the SNP among all SNPs in the gene's region based on their p-values.
    4. Minor allele frequency among controls.
    5. Per allele odds ratios and 95% confidence intervals from logistic regression models adjusting for age, sex, study center, DNA source, and smoking.
    6. 1 d.f. trend test.

    Comparison of polygenic risk prediction methods for 13 complex diseases.

    For all figures, the y-coordinate is the prediction R² on the observational scale. “1D” denotes 1D PRS; “2D, blood eSNPs” denotes 2D PRS using blood eSNPs as the high-prior SNP set. On the x-axis, “NO” denotes PRS without winner’s curse correction; “LASSO” and “MLE” denote lasso-type and MLE winner’s curse correction, respectively. (A) Prediction R² values for six diseases in WTCCC data, estimated based on five-fold cross-validation. (B) Prediction R² values for three GWAS of cancers, estimated based on ten-fold cross-validation. (C) Prediction R² values for four complex diseases estimated based on independent validation samples.

    Pathways enriched with bladder cancer susceptibility loci at a P ≤ 0.01 level using GSEA and ARTP.

    Results of the top ranked pathways (P < 0.01) using GSEA and ARTP. Results in parentheses are those prior to the removal of SNPs displaying heterogeneous signals.
    1. The number of genes in the pathway.
    2. The number of genes underlying the enrichment signal in the pathway.
    3. P-value of the enrichment score based on 10,000 permutations.
    4. False-discovery rate calculated based on the normalized statistics of the permutation data to account for the variable sizes of genes and pathways.
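A permutation-based P-value of the kind mentioned in the third footnote can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by GSEA or ARTP: it assumes larger scores are more extreme and uses the common add-one convention, which the source does not specify. The function name `permutation_p_value` and the toy scores are hypothetical.

```python
from typing import Sequence


def permutation_p_value(observed: float, permuted: Sequence[float]) -> float:
    """Empirical P-value of an observed score against permutation scores.

    Assumptions: larger scores are more extreme, and the add-one
    convention (count + 1) / (n + 1) is used so the estimate is never zero.
    """
    exceed = sum(1 for s in permuted if s >= observed)
    return (exceed + 1) / (len(permuted) + 1)


# Toy scores (hypothetical); a real analysis would use e.g. 10,000 permutations.
permuted_scores = [0.1, 0.5, 0.9, 1.2]
p_value = permutation_p_value(1.0, permuted_scores)
```

With the add-one convention, the smallest attainable P-value from 10,000 permutations is 1/10,001.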

    Summary of genes in the Mitotic Metaphase/Anaphase Transition pathway used for pathway-based analysis of multi-study bladder cancer GWAS.

    1. Number of SNPs genotyped in the gene region (20 kb 5′ upstream and 10 kb 3′ downstream from the gene's coding region).
    2. The SNP representing the gene in the pathway analysis after the removal of SNPs with heterogeneous effects.
    3. The rank of the SNP among all SNPs in the gene's region based on their p-values.
    4. Minor allele frequency among controls.
    5. Per allele odds ratios and 95% confidence intervals from logistic regression models adjusting for age, sex, study center, DNA source, and smoking.
    6. 1 d.f. trend test.