
    Estimation of required sample size for external validation of risk models for binary outcomes

    Risk-prediction models for health outcomes are used in practice as part of clinical decision-making, and it is essential that their performance be externally validated. An important aspect of the design of a validation study is choosing an adequate sample size. In this paper, we investigate the sample size requirements for validation studies with binary outcomes to estimate measures of predictive performance (the C-statistic for discrimination; the calibration slope and calibration in the large for calibration). We aim for sufficient precision in the estimated measures. In addition, we investigate the sample size needed to achieve sufficient power to detect a difference from a target value. Under normality assumptions on the distribution of the linear predictor, we obtain simple estimators for sample size calculations based on the measures above. Simulation studies show that the estimators perform well for common values of the C-statistic and outcome prevalence when the linear predictor is marginally normal, and that their performance deteriorates only slightly when the normality assumptions are violated. We also propose estimators which do not require normality assumptions but do require specification of the marginal distribution of the linear predictor and the use of numerical integration; these also performed very well under marginal normality. Our sample size equations require a specified standard error (SE) together with the anticipated C-statistic and outcome prevalence. The sample size requirement varies with the prognostic strength of the model, the outcome prevalence, the choice of performance measure, and the study objective. For example, to achieve an SE < 0.025 for the C-statistic, 60-170 events are required if the true C-statistic and outcome prevalence are between 0.64-0.85 and 0.05-0.3, respectively. For the calibration slope and calibration in the large, achieving SE < 0.15 would require 40-280 and 50-100 events, respectively. Our estimators may also be used for survival outcomes when the proportion of censored observations is high.
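    The paper's estimators are closed-form; as a rough cross-check of the same idea, a minimal simulation sketch like the one below estimates the empirical SE of the C-statistic at candidate sample sizes under a marginally normal linear predictor. The mean, SD, and candidate sample sizes here are illustrative choices, not values from the paper.

```python
# Monte Carlo check of the SE of the C-statistic at candidate sample
# sizes, assuming a marginally normal linear predictor. Illustrative
# sketch only; the paper derives closed-form estimators for this.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def c_statistic_se(n, mu=-3.0, sigma=1.0, n_sims=500):
    """Empirical SE of the estimated C-statistic for samples of size n.

    The linear predictor is N(mu, sigma^2); mu and sigma jointly set
    the outcome prevalence and the true C-statistic.
    """
    cs = []
    for _ in range(n_sims):
        lp = rng.normal(mu, sigma, size=n)
        p = 1.0 / (1.0 + np.exp(-lp))       # true event probabilities
        y = rng.binomial(1, p)
        if 0 < y.sum() < n:                 # C undefined without both classes
            cs.append(roc_auc_score(y, p))
    return np.std(cs, ddof=1)

# Increase n until the empirical SE drops below the 0.025 target.
for n in (500, 1000, 2000, 4000):
    print(n, round(c_statistic_se(n), 4))
```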

    Predictive validity of the CriSTAL tool for short-term mortality in older people presenting at Emergency Departments: a prospective study

    Objective: To determine the validity of the Australian clinical prediction tool Criteria for Screening and Triaging to Appropriate aLternative care (CriSTAL), based on objective clinical criteria, for identifying risk of death within 3 months of admission among older patients. Methods: Prospective study of patients aged ≥ 65 years presenting at emergency departments in five Australian (Aus) and four Danish (DK) hospitals. Logistic regression analysis was used to model factors predicting death; sensitivity, specificity, area under the ROC curve (AUROC) and calibration with bootstrapping techniques were used to describe predictive accuracy. Results: 2493 patients were included, with median age 78–80 years (DK–Aus). The deceased had significantly higher mean CriSTAL scores: Australian mean 8.1 (95% CI 7.7–8.6) vs. 5.8 (95% CI 5.6–5.9), and Danish mean 7.1 (95% CI 6.6–7.5) vs. 5.5 (95% CI 5.4–5.6). The model with the Fried frailty score was optimal for the Australian cohort, but prediction with the Clinical Frailty Scale (CFS) was also good (AUROC 0.825 and 0.81, respectively). Values for the Danish cohort were AUROC 0.764 with Fried and 0.794 with CFS. The most significant independent predictors of short-term death in both cohorts were advanced malignancy, frailty, male gender and advanced age. CriSTAL's accuracy was only modest for in-hospital death prediction in either setting. Conclusions: The modified CriSTAL tool (with the CFS instead of Fried's frailty instrument) has good discriminant power to improve prognostic certainty of short-term mortality for ED physicians in both health systems. This shows promise in enhancing clinicians' confidence in initiating earlier end-of-life discussions.
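    The validation workflow reported here (logistic model, sensitivity/specificity at a cutoff, bootstrapped AUROC) can be sketched in a few lines. The data, predictor names, and cutoff below are placeholders, not the CriSTAL study data.

```python
# Rough sketch of the reported validation steps: fit a logistic model,
# report sensitivity/specificity at a cutoff, bootstrap the AUROC.
# All data and parameter values are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 3))     # stand-ins for e.g. age, frailty, malignancy
y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + X @ np.array([0.8, 0.6, 0.4])))))

risk = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]

cut = 0.2                        # illustrative risk cutoff
sens = np.mean(risk[y == 1] >= cut)
spec = np.mean(risk[y == 0] < cut)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f} at cutoff {cut}")

aucs = []                        # nonparametric bootstrap of the AUROC
for _ in range(1000):
    idx = rng.integers(0, n, n)
    if 0 < y[idx].sum() < n:
        aucs.append(roc_auc_score(y[idx], risk[idx]))
print("AUROC", round(roc_auc_score(y, risk), 3),
      "95% CI", np.percentile(aucs, [2.5, 97.5]).round(3))
```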

    Minimum sample size for external validation of a clinical prediction model with a binary outcome

    In prediction model research, external validation is needed to examine an existing model's performance using data independent of that used for model development. Current external validation studies often suffer from small sample sizes and, consequently, imprecise estimates of predictive performance. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (observed/expected ratio and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and the variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to check whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.
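    Two of these criteria admit short closed-form sketches. The observed/expected (O/E) criterion below uses SE(ln(O/E)) ≈ sqrt((1 − φ)/(nφ)), where φ is the anticipated event proportion; for the C-statistic I substitute the Hanley-McNeil variance approximation, which may differ from the paper's exact expression. The SE targets are illustrative assumptions, so the printed numbers should not be expected to reproduce the paper's 9835.

```python
# Closed-form sketches of two sample-size criteria for external
# validation: (i) O/E, via SE(ln(O/E)) ~= sqrt((1 - phi) / (n * phi));
# (ii) the C-statistic, via the Hanley-McNeil variance approximation
# (an assumption here; the paper's variance expression may differ).
import math

def n_for_oe(phi, target_se_ln_oe):
    """Minimum n so that SE(ln(O/E)) <= target, assuming O/E ~= 1."""
    return math.ceil((1 - phi) / (phi * target_se_ln_oe ** 2))

def se_c_hanley_mcneil(c, n, phi):
    """Hanley-McNeil SE of the C-statistic with n*phi expected events."""
    n1, n2 = n * phi, n * (1 - phi)      # events, non-events
    q1 = c / (2 - c)
    q2 = 2 * c ** 2 / (1 + c)
    var = (c * (1 - c) + (n1 - 1) * (q1 - c ** 2)
           + (n2 - 1) * (q2 - c ** 2)) / (n1 * n2)
    return math.sqrt(var)

def n_for_c(c, phi, target_se):
    n = 100
    while se_c_hanley_mcneil(c, n, phi) > target_se:
        n += 1
    return n

phi = 0.018                                        # anticipated event proportion
print("O/E criterion:", n_for_oe(phi, 0.075))      # illustrative SE target
print("C criterion:  ", n_for_c(0.8, phi, 0.0255)) # illustrative C and target
```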

    Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers

    Background: Decision curve analysis is a novel method for evaluating diagnostic tests, prediction models and molecular markers. It combines the mathematical simplicity of accuracy measures, such as sensitivity and specificity, with the clinical applicability of decision-analytic approaches. Most critically, decision curve analysis can be applied directly to a data set, and does not require the sort of external data on costs, benefits and preferences typically required by traditional decision-analytic techniques. Methods: In this paper we present several extensions to decision curve analysis, including correction for overfit, confidence intervals, application to censored data (including competing risks) and calculation of decision curves directly from predicted probabilities. All of these extensions are based on straightforward methods that have previously been described in the literature for application to analogous statistical techniques. Results: Simulation studies showed that repeated 10-fold cross-validation provided the best method for correcting a decision curve for overfit. The method for applying decision curves to censored data had little bias, and coverage was excellent; for competing risks, decision curves were appropriately affected by the incidence of the competing risk and by the association between the competing risk and the predictor of interest. Calculation of decision curves directly from predicted probabilities led to a smoothing of the decision curve. Conclusions: Decision curve analysis can be easily extended to many of the applications common to performance measures for prediction models. Software to implement decision curve analysis is provided.
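    The quantity plotted by a decision curve is the net benefit at each risk threshold pt, defined as NB = TP/n − (FP/n) · pt/(1 − pt), compared against the treat-all and treat-none strategies. A minimal sketch on simulated data:

```python
# Minimal decision-curve sketch: net benefit of a model across risk
# thresholds, against treat-all and treat-none. Uses the standard
# definition NB = TP/n - FP/n * pt/(1 - pt); data are simulated.
import numpy as np

rng = np.random.default_rng(2)
n = 2000
p_hat = rng.beta(2, 8, size=n)       # simulated predicted probabilities
y = rng.binomial(1, p_hat)           # outcomes consistent with them

def net_benefit(y, p, pt):
    treat = p >= pt
    tp = np.sum(treat & (y == 1)) / len(y)
    fp = np.sum(treat & (y == 0)) / len(y)
    return tp - fp * pt / (1 - pt)

prev = y.mean()
for pt in (0.05, 0.10, 0.20, 0.30):
    nb_model = net_benefit(y, p_hat, pt)
    nb_all = prev - (1 - prev) * pt / (1 - pt)   # treat everyone
    print(f"pt={pt:.2f}  model={nb_model:.4f}  "
          f"treat-all={nb_all:.4f}  treat-none=0")
```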

    Validation and Recalibration of Two Multivariable Prognostic Models for Survival and Independence in Acute Stroke

    Introduction: Various prognostic models have been developed for acute stroke, including one based on age and five binary variables (the 'six simple variables' model; SSVMod) and one based on age plus scores on the National Institutes of Health Stroke Scale (NIHSSMod). The aims of this study were to externally validate and recalibrate these models, and to compare their predictive ability in relation to both survival and independence. Methods: Data from a large clinical trial of oxygen therapy (n = 8003) were used to determine the discrimination and calibration of the models, using C-statistics, calibration plots, and Hosmer-Lemeshow statistics. Recalibration in the large and logistic recalibration were used to update the models. Results: For discrimination, both models functioned better for survival (C-statistics between 0.802 and 0.837) than for independence (C-statistics between 0.725 and 0.735). Both models showed slight shortcomings with regard to calibration, over-predicting survival and under-predicting independence; the NIHSSMod performed slightly better than the SSVMod. For the most part, there were only minor differences between ischaemic and haemorrhagic strokes. Logistic recalibration successfully updated the models for the clinical trial population. Conclusions: Both prognostic models performed well overall in a clinical trial population. The choice between them is probably better based on clinical and practical considerations than on statistical considerations.
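    Both updating methods named here have standard logistic-regression forms: recalibration in the large re-estimates only the intercept (slope fixed at 1, via an offset), while logistic recalibration re-estimates intercept and slope on the existing linear predictor. A minimal sketch on simulated data, not the trial's:

```python
# Sketch of the two model-updating methods, applied to an existing
# model's linear predictor lp (simulated, deliberately miscalibrated).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5000
lp = rng.normal(-1.0, 1.2, size=n)                 # old model's linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-(0.4 + 0.8 * lp))))

# Recalibration in the large: intercept update only, slope fixed at 1.
fit_itl = sm.GLM(y, np.ones((n, 1)), family=sm.families.Binomial(),
                 offset=lp).fit()
print("updated intercept:", fit_itl.params.round(3))

# Logistic recalibration: re-estimate intercept and slope.
fit_logrec = sm.GLM(y, sm.add_constant(lp),
                    family=sm.families.Binomial()).fit()
print("intercept, slope: ", fit_logrec.params.round(3))
```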

    Personalized Prediction of Lifetime Benefits with Statin Therapy for Asymptomatic Individuals: A Modeling Study

    Background: Physicians need to inform asymptomatic individuals about personalized outcomes of statin therapy for primary prevention of cardiovascular disease (CVD). However, current prediction models focus on short-term outcomes and ignore the competing risk of death due to other causes. We aimed to predict the potential lifetime benefits of statin therapy, taking competing risks into account. Methods and Findings: A microsimulation model based on 5-y follow-up data from the Rotterdam Study, a population-based cohort of individuals aged 55 y and older living in the Ommoord district of Rotterdam, the Netherlands, was used to estimate lifetime outcomes with and without statin therapy. The model was validated in-sample using 10-y follow-up data. We used baseline variables and model output to construct (1) a web-based calculator for gains in total and CVD-free life expectancy and (2) color charts for comparing these gains to the Systematic Coronary Risk Evaluation (SCORE) charts. In 2,428 participants (mean age 67.7 y, 35.5% men), statin therapy increased total life expectancy by 0.3 y (SD 0.2) and CVD-free life expectancy by 0.7 y (SD 0.4). Age, sex, smoking, blood pressure, hypertension, lipids, diabetes, glucose, body mass index, waist-to-hip ratio, and creatinine were included in the calculator. Gains in total and CVD-free life expectancy increased with blood pressure, unfavorable lipid levels, and body mass index after multivariable adjustment. Gains decreased considerably with advancing age, while SCORE 10-y CVD mortality risk increased with age. Twenty-five percent of participants with a low SCORE risk achieved equal or larger gains in CVD-free life expectancy than the median gain in participants with a high SCORE risk. Conclusions: We developed tools to predict personalized increases in total and CVD-free life expectancy with statin therapy. The predicted gains we found are small. If the underlying model is validated in an independent cohort, the tools may be useful in discussing individual outcomes of statin therapy with patients.
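    To make the idea of a lifetime microsimulation with competing risks concrete, here is a deliberately toy annual-cycle sketch: each simulated person faces a CVD death probability (reduced by statin therapy) and a competing non-CVD death probability each year. Every hazard and the hazard ratio below are invented for illustration; this is not the Rotterdam Study model.

```python
# Toy annual-cycle microsimulation of life-expectancy gain with statin
# therapy under a competing risk of non-CVD death. All hazards and the
# hazard ratio are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)

def life_years(age0, statin, n_people=50000, max_age=105, hr_cvd=0.75):
    total = 0.0
    for _ in range(n_people):
        age = age0
        while age < max_age:
            # Crude age-dependent annual death probabilities (invented).
            p_cvd = min(1.0, 0.002 * np.exp(0.08 * (age - 55)))
            p_oth = min(1.0, 0.004 * np.exp(0.09 * (age - 55)))
            if statin:
                p_cvd *= hr_cvd            # treatment lowers CVD hazard only
            if rng.random() < p_cvd + p_oth:
                break                      # died this year (either cause)
            age += 1
        total += age - age0
    return total / n_people               # mean remaining life years

le_off = life_years(67, statin=False)
le_on = life_years(67, statin=True)
print(f"LE without statin: {le_off:.2f} y, with statin: {le_on:.2f} y, "
      f"gain: {le_on - le_off:.2f} y")
```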

    Prediction of intracranial findings on CT-scans by alternative modelling techniques

    Background: Prediction rules for intracranial traumatic findings in patients with minor head injury are designed to reduce the use of computed tomography (CT) without missing patients at risk for complications. This study investigates whether alternative modelling techniques might improve the applicability and simplicity of such prediction rules. Methods: We included 3181 patients with minor head injury who had received CT scans be

    Predictive Value of Updating Framingham Risk Scores with Novel Risk Markers in the U.S. General Population

    Background: According to population-based cohort studies, CT coronary calcium score (CTCS), carotid intima-media thickness (cIMT), high-sensitivity C-reactive protein (CRP), and ankle-brachial index (ABI) are promising novel risk markers for improving cardiovascular risk assessment. Their impact in the U.S. general population is, however, uncertain. Our aim was to estimate the predictive value of these four novel cardiovascular risk markers for the U.S. general population. Methods and Findings: Risk profiles, CRP and ABI data of 3,736 asymptomatic subjects aged 40 or older from the National Health and Nutrition Examination Survey (NHANES) 2003–2004 exam were used, along with predicted CTCS and cIMT values. For each subject, we calculated 10-year cardiovascular risks with and without each risk marker. Event rates adjusted for competing risks were obtained by microsimulation. We assessed the impact of updated 10-year risk scores by reclassification and C-statistics. In the study population (mean age 56±11 years, 48% male), 70% (80%) were at low (<10%), 19% (14%) at intermediate (≥10–<20%), and 11% (6%) at high (≥20%) 10-year CVD (CHD) risk. Net reclassification improvement was highest after updating 10-year CVD risk with CTCS: 0.10 (95% CI 0.02–0.19). The C-statistic for 10-year CVD risk, 0.82, increased by 0.02 (95% CI 0.01–0.03) with CTCS. Reclassification occurred most often in those at intermediate risk: with CTCS, 36% (38%) moved to low and 22% (30%) to high CVD (CHD) risk. Improvements with the other novel risk markers were limited. Conclusions: Only CTCS appeared to have significant incremental predictive value in the U.S. general population, especially in those at intermediate risk. Cost-effectiveness analyses should be considered in future research evaluating novel cardiovascular risk assessment strategies.
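    The categorical net reclassification improvement (NRI) used here sums the net proportion of events moving up a risk category and the net proportion of non-events moving down: NRI = [P(up|event) − P(down|event)] + [P(down|non-event) − P(up|non-event)]. A minimal sketch with the <10%, 10–20%, ≥20% bands quoted above, on simulated placeholder data:

```python
# Categorical NRI for a baseline risk score vs. one updated with a
# novel marker; risk bands <10%, 10-20%, >=20%. Data are simulated.
import numpy as np

rng = np.random.default_rng(5)
n = 3000
risk_old = rng.beta(2, 12, size=n)                           # baseline risks
risk_new = np.clip(risk_old + rng.normal(0, 0.04, n), 0, 1)  # + novel marker
y = rng.binomial(1, risk_new)                                # outcomes

def category(r):
    return np.digitize(r, [0.10, 0.20])   # 0=low, 1=intermediate, 2=high

up = category(risk_new) > category(risk_old)
down = category(risk_new) < category(risk_old)

ev, ne = y == 1, y == 0
nri_events = up[ev].mean() - down[ev].mean()
nri_nonevents = down[ne].mean() - up[ne].mean()
print(f"NRI = {nri_events:.3f} + {nri_nonevents:.3f} = "
      f"{nri_events + nri_nonevents:.3f}")
```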