223 research outputs found

    Heterogeneity in Prediction Research: methods and applications

    William Osler noted in 1893 that “If it were not for the great variability between individuals, medicine might as well be a science, not an art”. In contrast, this thesis is based on the scientific paradigm that prediction models have the potential to guide medical decisions by exploiting identifiable heterogeneity across individual patients. Prediction research focuses on the development of well-performing prediction models and on the assessment of their generalizability and applicability. Several methods to measure prediction model performance across clusters of patients are proposed in PART I of this thesis. PART II contains novel methods for development and validation of models that incorporate heterogeneity of treatment effect across patients. In PART III, methods for development and validation of prediction models are applied to several case studies in cardiovascular medicine, oncology, and public health.

    Weighted metrics are required when evaluating the performance of prediction models in nested case–control studies

    Background: Nested case–control (NCC) designs are efficient for developing and validating prediction models that use expensive or difficult-to-obtain predictors, especially when the outcome is rare. Previous research has focused on how to develop prediction models in this sampling design, but little attention has been given to model validation in this context. We therefore aimed to systematically characterize the key elements for the correct evaluation of the performance of prediction models in NCC data. Methods: We proposed how to correctly evaluate prediction models in NCC data, by adjusting performance metrics with sampling weights to account for the NCC sampling. We included in this study the C-index, threshold-based metrics, Observed-to-expected events ratio (O/E ratio), calibration slope, and decision curve analysis. We illustrated the proposed metrics with a validation of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA version 5) in data from the population-based Rotterdam study. We compared the metrics obtained in the full cohort with those obtained in NCC datasets sampled from the Rotterdam study, with and without a matched design. Results: Performance metrics without weight adjustment were biased: the unweighted C-index in NCC datasets was 0.61 (0.58–0.63) for the unmatched design, while the C-index in the full cohort and the weighted C-index in the NCC datasets were similar: 0.65 (0.62–0.69) and 0.65 (0.61–0.69), respectively. The unweighted O/E ratio was 18.38 (17.67–19.06) in the NCC datasets, while it was 1.69 (1.42–1.93) in the full cohort and its weighted version in the NCC datasets was 1.68 (1.53–1.84). Similarly, weighted adjustments of threshold-based metrics and net benefit for decision curves were unbiased estimates of the corresponding metrics in the full cohort, while the corresponding unweighted metrics were biased. 
In the matched design, the bias of the unweighted metrics was larger, but it could also be compensated by the weight adjustment. Conclusions: Nested case–control studies are an efficient solution for evaluating the performance of prediction models that use expensive or difficult-to-obtain biomarkers, especially when the outcome is rare, but the performance metrics need to be adjusted to the sampling procedure.
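
As a rough illustration of the weighting idea in the abstract above, the sketch below computes a weighted O/E ratio and a weighted C-index for a binary outcome. It is a simplified sketch, not the authors' estimators: it assumes each sampled subject carries a weight equal to the inverse of its sampling probability, it ignores censoring and follow-up time, and the function names and toy data are hypothetical.

```python
def weighted_oe_ratio(y, p, w):
    """Observed/expected events ratio, counting each subject w times.

    y: observed outcomes (0 or 1), p: predicted risks, w: sampling weights
    (assumed here to be inverse sampling probabilities).
    """
    observed = sum(wi * yi for yi, wi in zip(y, w))
    expected = sum(wi * pi for pi, wi in zip(p, w))
    return observed / expected


def weighted_c_index(y, p, w):
    """Concordance over case/non-case pairs, each pair weighted by w_i * w_j."""
    concordant = 0.0
    total = 0.0
    for yi, pi, wi in zip(y, p, w):
        if yi != 1:
            continue  # outer loop runs over cases only
        for yj, pj, wj in zip(y, p, w):
            if yj != 0:
                continue  # inner loop runs over non-cases only
            pair_weight = wi * wj
            total += pair_weight
            if pi > pj:
                concordant += pair_weight
            elif pi == pj:
                concordant += 0.5 * pair_weight  # ties count as half
    return concordant / total


if __name__ == "__main__":
    # Hypothetical toy sample: all cases kept (weight 1), controls subsampled.
    y = [1, 1, 0, 0, 0]
    p = [0.30, 0.10, 0.05, 0.20, 0.10]
    w = [1, 1, 5, 10, 15]
    ones = [1] * len(y)
    print("unweighted O/E:", weighted_oe_ratio(y, p, ones))
    print("weighted O/E:  ", weighted_oe_ratio(y, p, w))
    print("unweighted C:  ", weighted_c_index(y, p, ones))
    print("weighted C:    ", weighted_c_index(y, p, w))
```

In the toy data the unweighted O/E ratio is much larger than the weighted one, because cases are over-represented relative to the source cohort. Note that in this simplified binary setting the C-index only changes under weighting when weights differ within the cases or within the controls.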

    Geographic and temporal validity of prediction models: different approaches were useful to examine model performance

    Objective: Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting: We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random-effects meta-analysis methods. I² statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results: Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I² statistics and prediction intervals for c-statistics. Conclusion: This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods.
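
The pooling step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the study's code: it uses the DerSimonian-Laird estimator for the between-hospital variance (the abstract does not say which estimator was used), pools on the raw scale rather than a transformed scale, and approximates the 95% prediction interval with a normal quantile where a t-quantile is often preferred. The function name and inputs are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """Pool cluster-specific estimates with a DerSimonian-Laird random-effects model.

    estimates: one performance estimate per hospital (e.g. c-statistics)
    variances: the squared standard error of each estimate
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]            # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sum_w

    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)               # between-hospital variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Approximate 95% prediction interval for performance in a new hospital
    # (normal approximation; an assumption of this sketch).
    half_width = 1.96 * math.sqrt(tau2 + se ** 2)
    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "i2": i2,
        "prediction_interval": (pooled - half_width, pooled + half_width),
    }
```

A wide prediction interval relative to the confidence interval of the pooled estimate indicates that performance in a new hospital may differ noticeably from the average, which is the sense in which the abstract uses these quantities to describe geographic transportability.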

    Personalized decision-making for aneurysm treatment of aneurysmal subarachnoid hemorrhage: development and validation of a clinical prediction tool

    Background: In patients with aneurysmal subarachnoid hemorrhage suitable for endovascular coiling and neurosurgical clip-reconstruction, the aneurysm treatment decision-making process could be improved by considering heterogeneity of treatment effect and durability of treatment. We aimed to develop and validate a tool to predict individualized treatment benefit of endovascular coiling compared to neurosurgical clip-reconstruction. Methods: We used randomized data (International Subarachnoid Aneurysm Trial, n = 2143) to develop models to predict 2-month functional outcome and to predict time-to-rebleed-or-retreatment. We modeled for heterogeneity of treatment effect by adding interaction terms of treatment with prespecified predictors and with baseline risk of the outcome. We predicted outcome with both treatments and calculated absolute treatment benefit. We described the patient characteristics of patients with ≥ 5% point difference in the predicted probability of favorable functional outcome (modified Rankin Score 0–2) and of no rebleed or retreatment within 10 years. Model performance was expressed with the c-statistic and calibration plots. We performed bootstrapping and leave-one-cluster-out cross-validation and pooled cluster-specific c-statistics with random effects meta-analysis. Results: The pooled c-statistics were 0.72 (95% CI: 0.69–0.75) for the prediction of 2-month favorable functional outcome and 0.67 (95% CI: 0.63–0.71) for prediction of no rebleed or retreatment within 10 years. We found no significant interaction between predictors and treatment. The average predicted benefit in favorable functional outcome was 6% (95% CI: 3–10%) in favor of coiling, but 11% (95% CI: 9–13%) for no rebleed or retreatment in favor of clip-reconstruction. 134 patients (6%), young and in favorable clinical condition, had negligible functional outcome benefit of coiling but had a ≥ 5% point benefit of clip-reconstruction in terms of durability of treatment. 
Conclusions: We show that young patients in favorable clinical condition and without extensive vasospasm have a negligible benefit in functional outcome of endovascular coiling – compared to neurosurgical clip-reconstruction – while at the same time having a substantially lower probability of retreatment or rebleeding from neurosurgical clip-reconstruction – compared to endovascular coiling. The SHARP prediction tool (https://sharpmodels.shinyapps.io/sharpmodels/) could support and incentivize a multidisciplinary discussion about aneurysm treatment decision-making by providing individualized treatment benefit estimates.
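
The step "predict the outcome under both treatments and take the difference" can be illustrated with a small logistic model that includes a treatment-by-predictor interaction. This is only a sketch: the single predictor, the coefficient names, and the coefficient values are hypothetical and are not taken from the SHARP models.

```python
import math


def predict_prob(x, treated, coef):
    """Predicted probability of a favorable outcome from a logistic model.

    x: one baseline predictor (hypothetical), treated: 1 or 0,
    coef: dict with keys "intercept", "x", "treat", "treat_x".
    """
    linear_predictor = (
        coef["intercept"]
        + coef["x"] * x
        + coef["treat"] * treated
        + coef["treat_x"] * treated * x   # interaction: effect varies with x
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def absolute_benefit(x, coef):
    """Difference in predicted probability: treated minus untreated."""
    return predict_prob(x, 1, coef) - predict_prob(x, 0, coef)


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    coef = {"intercept": 0.2, "x": -0.8, "treat": 0.4, "treat_x": -0.3}
    for x in (-1.0, 0.0, 1.0):
        print(f"x={x:+.1f}  predicted benefit={absolute_benefit(x, coef):+.3f}")
```

Because the same patient is evaluated under both treatment arms, the predicted benefit is an individualized quantity even when, as reported above, the interaction terms are not statistically significant: absolute benefit still varies with baseline risk.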

    Integrated care in patients with atrial fibrillation: a predictive heterogeneous treatment effect analysis of the ALL-IN trial

    Introduction: Integrated care is effective in reducing all-cause mortality in patients with atrial fibrillation (AF) in primary care, though it is time and resource intensive. The aim of the current study was to assess whether integrated care should be directed at all AF patients equally. Methods: The ALL-IN trial (n = 1,240 patients, median age 77 years) was a cluster-randomized trial in which primary care practices were randomized to provide integrated care or usual care to AF patients aged 65 years and older. Integrated care comprised (i) anticoagulation monitoring, (ii) quarterly checkups and (iii) easy-access consultation with cardiologists. For the current analysis, Cox proportional hazards analysis with all clinical variables from the CHA2DS2-VASc score was used to predict all-cause mortality in the ALL-IN trial. Subsequently, the hazard ratio and absolute risk reduction were plotted as a function of this predicted mortality risk to explore treatment heterogeneity. Results: Under usual care, after a median of 2 years of follow-up, the absolute risk of all-cause mortality in the highest-risk quarter was 31.0%, compared to 4.6% in the lowest-risk quarter. On the relative scale, there was no evidence of treatment heterogeneity (p for interaction = 0.90). However, there was substantial treatment heterogeneity on the absolute scale: the risk reduction was 3.3% (95% CI -0.4% to 7.0%) in the lowest-risk quarter compared to 12.0% (95% CI 2.7% to 22.0%) in the highest-risk quarter. Conclusion: While the relative degree of benefit from integrated AF care is similar in all patients, patients with a high all-cause mortality risk have a greater benefit on an absolute scale and should therefore be prioritized when implementing integrated care.
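
Why a constant relative effect still yields different absolute benefits can be shown with a short calculation. Under proportional hazards, survival in the treated group equals survival in the control group raised to the power of the hazard ratio, so the treated risk is 1 − (1 − control risk)^HR. The sketch below applies this to the two usual-care risks quoted above; the hazard ratio of 0.5 is a hypothetical value chosen for illustration, not a result from the trial, and the function name is an assumption.

```python
def absolute_risk_reduction(control_risk, hazard_ratio):
    """Absolute risk reduction implied by a constant hazard ratio.

    Assumes proportional hazards: S_treated = S_control ** hazard_ratio.
    """
    treated_risk = 1.0 - (1.0 - control_risk) ** hazard_ratio
    return control_risk - treated_risk


if __name__ == "__main__":
    hypothetical_hr = 0.5  # illustration only, not taken from the ALL-IN trial
    # Usual-care mortality risks in the lowest- and highest-risk quarters.
    for label, risk in (("lowest-risk quarter", 0.046), ("highest-risk quarter", 0.310)):
        arr = absolute_risk_reduction(risk, hypothetical_hr)
        print(f"{label}: control risk {risk:.3f}, implied ARR {arr:.3f}")
```

The same hazard ratio removes a larger absolute share of risk when the baseline risk is higher, which is the pattern the abstract reports on the absolute scale.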

    Differences between Men and Women in Treatment and Outcome after Traumatic Brain Injury

    Traumatic brain injury (TBI) is a significant cause of disability, but little is known about sex and gender differences after TBI. We aimed to analyze the association between sex/gender and the broad range of care pathways, treatment characteristics, and outcomes following mild and moderate/severe TBI. We performed mixed-effects regression analyses in the prospective multi-center Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) study, stratified for injury severity and age, and adjusted for baseline characteristics. Outcomes were various care pathway and treatment variables, and 6-month measures of functional outcome, health-related quality of life (HRQoL), post-concussion symptoms (PCS), and mental health symptoms. The study included 2862 adults (36% women) with mild TBI (mTBI; Glasgow Coma Scale [GCS] score 13-15), and 1333 adults (26% women) with moderate/severe TBI (GCS score 3-12). Women were less likely to be admitted to the intensive care unit (ICU; odds ratio [OR] 0.6, 95% confidence interval [CI]: 0.4-0.8) following mTBI. Following moderate/severe TBI, women had a shorter median hospital stay (OR 0.7, 95% CI: 0.5-1.0). Following mTBI, women had poorer outcomes: lower Glasgow Outcome Scale Extended (GOSE; OR 1.4, 95% CI: 1.2-1.6), lower generic and disease-specific HRQoL, and more severe PCS, depression, and anxiety. Among them, women under age 45 and above age 65 years showed worse 6-month outcomes compared with men of the same age. Following moderate/severe TBI, there was no difference in GOSE (OR 0.9, 95% CI: 0.7-1.2), but women reported more severe PCS (OR 1.7, 95% CI: 1.1-2.6). Men and women differ in care pathways and outcomes following TBI. Women generally report worse 6-month outcomes, but the size of the differences depends on TBI severity and age. Future studies should examine factors that explain these differences.

    Prediction of Global Functional Outcome and Post-Concussive Symptoms after Mild Traumatic Brain Injury : External Validation of Prognostic Models in the Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) Study

    The majority of traumatic brain injuries (TBIs) are categorized as mild, according to a baseline Glasgow Coma Scale (GCS) score of 13-15. Prognostic models that were developed to predict functional outcome and persistent post-concussive symptoms (PPCS) after mild TBI have rarely been externally validated. We aimed to externally validate models predicting 3-12-month Glasgow Outcome Scale Extended (GOSE) or PPCS in adults with mild TBI. We analyzed data from the Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) project, which included 2862 adults with mild TBI, with 6-month GOSE available for 2374 and Rivermead Post-Concussion Symptoms Questionnaire (RPQ) results available for 1605 participants. Model performance was evaluated based on calibration (graphically and characterized by slope and intercept) and discrimination (C-index). We validated five published models for 6-month GOSE and three for 6-month PPCS scores. The models used different cutoffs for outcome and some included symptoms measured 2 weeks post-injury. Discriminative ability varied substantially (C-index between 0.58 and 0.79). The models developed in the Corticosteroid Randomisation After Significant Head Injury (CRASH) trial for prediction of GOSE …
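
The calibration summary mentioned above (slope and intercept) can be sketched for a binary outcome as a logistic regression of the observed outcome on the logit of the predicted risk. This is a minimal sketch under stated assumptions, not the study's analysis: it fits intercept and slope jointly with a hand-written Newton-Raphson loop, whereas the intercept is often estimated separately with the slope fixed at 1, and the abstract does not say which convention was used. Function names are hypothetical, and predicted risks must lie strictly between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def calibration_intercept_slope(y, p, iterations=50, tol=1e-10):
    """Fit y ~ intercept + slope * logit(p) by Newton-Raphson.

    y: observed binary outcomes, p: predicted risks from the model
    being validated. Returns (intercept, slope); perfect calibration
    corresponds to (0, 1).
    """
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        mu = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix entries.
        v = [mi * (1.0 - mi) for mi in mu]
        h00 = sum(v)
        h01 = sum(vi * xi for vi, xi in zip(v, x))
        h11 = sum(vi * xi * xi for vi, xi in zip(v, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b
```

A slope below 1 suggests predictions that are too extreme (typical of overfitted models), while a slope above 1 suggests predictions that are not extreme enough.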