17 research outputs found

    Accurate and robust genomic prediction of celiac disease using statistical learning.

    Get PDF
    Practical application of genomic-based risk stratification to clinical diagnosis is appealing yet performance varies widely depending on the disease and genomic risk score (GRS) method. Celiac disease (CD), a common immune-mediated illness, is strongly genetically determined and requires specific HLA haplotypes. HLA testing can exclude diagnosis but has low specificity, providing little information suitable for clinical risk stratification. Using six European cohorts, we provide a proof-of-concept that statistical learning approaches which simultaneously model all SNPs can generate robust and highly accurate predictive models of CD based on genome-wide SNP profiles. The high predictive capacity replicated both in cross-validation within each cohort (AUC of 0.87-0.89) and in independent replication across cohorts (AUC of 0.86-0.9), despite differences in ethnicity. The models explained 30-35% of disease variance and up to ∼43% of heritability. The GRS's utility was assessed in different clinically relevant settings. Comparable to HLA typing, the GRS can be used to identify individuals without CD with β‰₯99.6% negative predictive value however, unlike HLA typing, fine-scale stratification of individuals into categories of higher-risk for CD can identify those that would benefit from more invasive and costly definitive testing. The GRS is flexible and its performance can be adapted to the clinical situation by adjusting the threshold cut-off. Despite explaining a minority of disease heritability, our findings indicate a genomic risk score provides clinically relevant information to improve upon current diagnostic pathways for CD and support further studies evaluating the clinical utility of this approach in CD and other complex diseases

    Genomic prediction of coronary heart disease

    Get PDF
    Aims Genetics plays an important role in coronary heart disease (CHD) but the clinical utility of genomic risk scores (GRSs) relative to clinical risk scores, such as the Framingham Risk Score (FRS), is unclear. Our aim was to construct and externally validate a CHD GRS, in terms of lifetime CHD risk and relative to traditional clinical risk scores. Methods and results We generated a GRS of 49 310 SNPs based on a CARDIoGRAMplusC4D Consortium meta-analysis of CHD, then independently tested it using five prospective population cohorts (three FINRISK cohorts, combined n = 12 676, 757 incident CHD events; two Framingham Heart Study cohorts (FHS), combined n = 3406, 587 incident CHD events). The GRS was associated with incident CHD (FINRISK HR = 1.74, 95% confidence interval (CI) 1.61-1.86 per S.D. of GRS; Framingham HR = 1.28, 95% CI 1.18-1.38), and was largely unchanged by adjustment for known risk factors, including family history. Integration of the GRS with the FRS or ACC/AHA13 scores improved the 10 years risk prediction (meta-analysis C-index: +1.5-1.6%, P = 60 years old (meta-analysis C-index: +4.6-5.1%, P <0.001). Importantly, the GRS captured substantially different trajectories of absolute risk, with men in the top 20% of attaining 10% cumulative CHD risk 12-18 y earlier than those in the bottom 20%. High genomic risk was partially compensated for by low systolic blood pressure, low cholesterol level, and non-smoking. Conclusions A GRS based on a large number of SNPs improves CHD risk prediction and encodes different trajectories of lifetime risk not captured by traditional clinical risk scores.Peer reviewe

    Example clinical scenarios.

    No full text
    <p>The GRS can be employed in different clinical scenarios and tuned to optimize outcomes. The GRS can be employed in a comparable manner to HLA testing (left table) to confidently exclude CD. In this scenario, we selected a GRS threshold based on NPVβ€Š=β€Š99.6% however a range of thresholds can be selected to achieve a high NPV (see note below). The GRS can also stratify CD risk (right table). Confirmatory testing (such as small bowel biopsy) would be reserved for those at high-risk. In this example, we present two scenarios: optimization of PPV or of sensitivity. In comparison to the GRS, all HLA-susceptible patients will need to undergo further confirmatory testing for CD. For more information on GRS performance across a range of thresholds, see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004137#pgen.1004137.s007" target="_blank">Table S2</a>. Prospective validation of the GRS in local populations would enable the most appropriate settings for NPV, PPV and sensitivity to be identified which provide the optimal diagnostic outcomes. <sup>+</sup> The highest achievable NPV at 10% prevalence was 99.4%.</p

    Distribution of genomic risk scores in cases and controls.

    No full text
    <p>(a) Kernel density estimates of the risk scores predicted using models on UK2 and tested in the combined dataset Finn+NL+IT, for cases and controls. (b) Thresholds for risk scores in terms of population percent, with the top more likely to be a CD and the bottom more likely to be non-CD.</p

    Building genomic models predictive of celiac disease.

    No full text
    <p>LOESS-smoothed (a) AUC and (b) phenotypic variance explained, from 10Γ—10 cross-validation, with differing model sizes, within each celiac dataset. The grey bands represent 95% confidence intervals about the mean LOESS smooth.</p

    Performance of the genomic risk score in external validation, when compared to other approaches, and on other related diseases.

    No full text
    <p>ROC curves for models trained in the UK2 dataset and tested on (a) four other CD datasets, (b) the Immunochip CD dataset, comparing the GRS approach with that of Romanos et al. <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004137#pgen.1004137-Romanos1" target="_blank">[21]</a>, and (c) three other autoimmune diseases (Crohn's disease, Rheumatoid Arthritis, and Type 1 Diabetes). We did not re-tune the models on the test data. For (b) and (c), we used a reduced set of SNPs for training, from the intersection of the UK2 SNPs with the Immunochip or WTCCC SNPs (18,252 SNPs and 76,847 SNPs, respectively). In (c), the same reduced set of SNPs was used for the CD-Finn dataset, in order to maintain the same SNPs across all target datasets.</p

    Clinical interpretation as a function of threshold and prevalence.

    No full text
    <p>The number of non-CD cases β€œmisdiagnosed” (wrongly implicated by GRS) per true CD cases β€œdiagnosed” (correctly implicated by GRS), for different levels of sensitivity. The risk score is based on a model trained on the UK2 dataset, and tested on the combined Finn+NL+IT dataset. The results were threshold-averaged over 50 independent replications. Note that the curve for <i>K</i>β€Š=β€Š1% does not span the entire range due to averaging over a small number of cases in that dataset.</p
    corecore