95 research outputs found

    Incorporating Novel Risk Markers into Established Risk Prediction Models

    Introduction: Risk prediction models are used as part of formal risk assessment for disease and health events in UK primary care. To improve the accuracy of risk prediction, new risk factors are being added to established risk prediction models. However, the methods currently used to evaluate the added value of these new risk factors have been shown to be limited. These limitations can be addressed using health economic methodology, which has not yet been used to evaluate and compare risk prediction models in terms of their effectiveness and cost. Methods: A cost-effectiveness analysis was performed using a decision tree framework. The decision tree was populated with risk model effect and cost measures. The cost-effectiveness analysis derived the incremental cost-effectiveness ratio (ICER), using the Youden Index and Harrell's C-Index as performance measures, and the incremental net monetary benefit (INB). A probabilistic sensitivity analysis was performed, based on 10,000 iterations. A range of £0-£100,000 was used for the willingness to pay (WTP), which, when combined with the INB, provided the probability that the new risk factor was cost effective. This method was applied in two exemplar prospective cohort studies: adding family history (FH) to cardiovascular disease (CVD) risk prediction, and adding bone mineral density (BMD) to fracture risk prediction. Results: A cost-effectiveness analysis using a decision tree framework was shown to be an effective way of evaluating the added value of a new risk factor. Adding FH to standard CVD risk factors produced an ICER of £799.91 (-£5,962.15 to £5,968.22) and £7,788.76 (-£42,760.16 to £48,962.39) per percentage unit increase in the Youden Index and Harrell's C-Index, respectively. The maximum probability of FH being cost effective was 0.7, at a minimum WTP of £15,000 (Youden Index). Further, incorrectly treating low-risk patients with statin therapy was less costly (£788.40) than not treating them (£916.16). Adding continuous BMD measurement to standard fracture risk factors produced an ICER of £367.25 (-£4,241.88 to £4,828.50) and £4,480.54 (-£22,816.84 to £22,970.55) per percentage unit increase in the Youden Index and Harrell's C-Index, respectively. The maximum probability of BMD being cost effective was 0.8, at a minimum WTP of £32,500 (Youden Index). Further, using BMD in a binary format to indicate osteoporotic patients did not improve Harrell's C-Index of standard fracture risk prediction (∆C-Index = -0.62%). Conclusion: A cost-effectiveness analysis was a novel method to compare two risk prediction models and to evaluate the added value of a new risk factor. It identifies the added value of a new risk factor, encompassing the statistical and clinical improvement and the cost consequences of using the new risk factor in an established risk prediction model. Based on the added value of FH and BMD, there is a good evidence base for adding these risk factors into routine risk assessment of the respective conditions. Increased use of this method could help standardise risk prediction and increase comparability of risk prediction models within diseases, producing a league table approach to evaluate, appraise and identify beneficial new risk factors and better risk prediction models.
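
    The decision-analytic quantities reported above (the ICER, the INB, and the probability of cost-effectiveness across a range of WTP values) follow standard health-economic definitions. As a minimal sketch, the Python code below shows how they can be computed from probabilistic sensitivity analysis draws; the input distributions, variable names and numbers are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_iter = 10_000  # PSA iterations, as in the abstract

# Hypothetical PSA draws: incremental cost (pounds) and incremental effect
# (percentage-unit change in the Youden Index) of adding the new risk factor.
delta_cost = rng.normal(loc=40, scale=150, size=n_iter)
delta_effect = rng.normal(loc=0.05, scale=0.02, size=n_iter)

# Incremental cost-effectiveness ratio from the mean increments.
icer = delta_cost.mean() / delta_effect.mean()

# Incremental net monetary benefit at each willingness-to-pay (WTP) value,
# and the probability the new risk factor is cost effective (INB > 0).
wtp_grid = np.arange(0, 100_001, 2_500)
prob_ce = [(wtp * delta_effect - delta_cost > 0).mean() for wtp in wtp_grid]

print(f"ICER: £{icer:,.2f} per percentage-unit gain")
for wtp, p in zip(wtp_grid[::8], prob_ce[::8]):
    print(f"WTP £{wtp:>7,}: P(cost effective) = {p:.2f}")
```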

    Does bone mineral density improve the predictive accuracy of fracture risk assessment?: a prospective cohort study in Northern Denmark

    Objective: To evaluate the added predictive accuracy of bone mineral density (BMD) in fracture risk assessment. Design: Prospective cohort study using data collected between 01 January 2010 and 31 December 2012. Setting: North Denmark Osteoporosis Clinic, receiving patients referred with at least one fracture risk factor identified by the referring doctor. Participants: Patients aged 40–90 years who had a BMD T-score recorded at the hip and had not been taking osteoporosis-preventing drugs for more than 1 year prior to baseline. Main outcome measures: Incident diagnoses of osteoporotic fractures (hip, spine, forearm, humerus and pelvis) were identified using the National Patient Registry of Denmark during 01 January 2012–01 January 2014. Cox regression was used to develop a fracture model based on the predictors in the Fracture Risk Assessment Tool (FRAX®), with and without binary and continuous BMD. The change in Harrell's C-Index and reclassification tables were used to describe the added statistical value of BMD. Results: Adjusting for the predictors included in FRAX®, patients with osteoporosis (T-score ≤ -2.5) had a 75% higher hazard of fracture compared with patients with higher BMD (HR: 1.75 (95% CI 1.28 to 2.38)). A 40% lower hazard was found per unit increase in continuous BMD T-score (HR: 0.60 (95% CI 0.52 to 0.69)). Accuracy improved marginally: Harrell's C-Index increased by 1.2% when adding continuous BMD (0.76 to 0.77). Reclassification tables showed that continuous BMD shifted 529 patients into different risk categories; 292 of these were reclassified correctly (57%; 95% CI 55% to 64%). Adding binary BMD, however, showed no improvement: Harrell's C-Index decreased by 0.6%. Conclusions: Continuous BMD marginally improves fracture risk assessment. Importantly, this was only found when using the continuous BMD measurement rather than a binary osteoporosis indicator. It is suggested that future work should focus on evaluating this risk factor using routinely collected data and on developing more clinically relevant methodology to assess the added value of a new risk factor.
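
    As a rough illustration of the comparison described above (a Cox model with FRAX-style predictors, refitted with continuous BMD added, compared via Harrell's C-index), the Python sketch below uses the lifelines package. The file name, column names and predictor list are hypothetical placeholders, not the study's actual data or variables.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical analysis dataset: follow-up time (years), fracture indicator,
# FRAX-style predictors, and the continuous BMD T-score. Column names are
# illustrative, and all predictors are assumed to be numeric / pre-coded.
df = pd.read_csv("fracture_cohort.csv")
frax_cols = ["age", "sex", "prior_fracture", "glucocorticoid_use", "bmi"]

def fit_cox(data, predictors):
    """Fit a Cox proportional hazards model on the given predictors."""
    cph = CoxPHFitter()
    cph.fit(data[predictors + ["time", "fracture"]],
            duration_col="time", event_col="fracture")
    return cph

base = fit_cox(df, frax_cols)                        # FRAX predictors only
with_bmd = fit_cox(df, frax_cols + ["bmd_t_score"])  # plus continuous BMD

# Apparent Harrell's C-index of each model and the absolute change,
# mirroring the delta-C-index comparison reported in the abstract.
delta_c = with_bmd.concordance_index_ - base.concordance_index_
print(f"C (FRAX only): {base.concordance_index_:.3f}")
print(f"C (+ BMD):     {with_bmd.concordance_index_:.3f}")
print(f"Change in C:   {delta_c:+.3f}")
```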

    Antidepressant use and risk of adverse outcomes in older people: population based cohort study

    Objectives: To investigate the association between antidepressant treatment and the risk of several potential adverse outcomes in older people with depression, and to examine risks by class of antidepressant, duration of use, and dose.

    Calculating the power of a planned individual participant data meta‐analysis to examine prognostic factor effects for a binary outcome

    Collecting data for an individual participant data meta‐analysis (IPDMA) project can be time-consuming and resource-intensive, and the project could still have insufficient power to answer the question of interest. Therefore, researchers should consider the power of their planned IPDMA before collecting IPD. Here we propose a method to estimate the power of a planned IPDMA project aiming to synthesise multiple cohort studies to investigate the (unadjusted or adjusted) effects of potential prognostic factors for a binary outcome. We consider both binary and continuous factors and provide a three‐step approach to estimating the power in advance of collecting IPD, under an assumed true prognostic effect for each factor of interest. The first step uses routinely available (published) aggregate data for each study to approximate Fisher's information matrix and thereby estimate the anticipated variance of the unadjusted prognostic factor effect in each study. These variances are then used in step 2 to estimate the anticipated variance of the summary prognostic effect from the IPDMA. Finally, step 3 uses this variance to estimate the corresponding IPDMA power, based on a two‐sided Wald test and the assumed true effect. Extensions are provided to adjust the power calculation for the presence of additional covariates correlated with the prognostic factor of interest (by using a variance inflation factor) and to allow for between‐study heterogeneity in prognostic effects. An example is provided for illustration, and Stata code is supplied to enable researchers to implement the method.
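
    For the simplest case of a binary prognostic factor, a binary outcome and an unadjusted effect, the three steps can be sketched in Python as below: the per-study variance of the log odds ratio is approximated from aggregate 2x2 counts (the Fisher-information/Woolf approximation), the summary variance comes from inverse-variance pooling under a common-effect assumption, and power follows from a two-sided Wald test. The study counts and assumed true effect are illustrative, and the published method's extensions (continuous factors, adjusted effects, between-study heterogeneity) are omitted.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical aggregate data for three cohort studies: 2x2 counts of
# (factor present/absent) x (outcome yes/no). Values are illustrative.
studies = [
    # (events_exposed, nonevents_exposed, events_unexposed, nonevents_unexposed)
    (30, 170, 40, 460),
    (22, 128, 35, 315),
    (15,  85, 25, 275),
]

true_log_or = np.log(1.5)  # assumed true (unadjusted) prognostic effect
alpha = 0.05

# Step 1: per-study variance of the log odds ratio via the Fisher-information
# (Woolf) approximation 1/a + 1/b + 1/c + 1/d.
study_vars = [1/a + 1/b + 1/c + 1/d for a, b, c, d in studies]

# Step 2: anticipated variance of the common-effect (inverse-variance) summary.
summary_var = 1.0 / sum(1.0 / v for v in study_vars)

# Step 3: power of a two-sided Wald test at the assumed true effect.
z_crit = norm.ppf(1 - alpha / 2)
z = true_log_or / np.sqrt(summary_var)
power = norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)

print(f"Summary SE: {np.sqrt(summary_var):.3f}, anticipated power: {power:.2f}")
```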

    Artificial intelligence in lung cancer diagnostic imaging: a review of the reporting and conduct of research published 2018–2019

    Objective: This study aimed to describe the methodologies used to develop and evaluate models that use artificial intelligence (AI) to analyse lung images in order to detect, segment (outline the borders of), or classify pulmonary nodules as benign or malignant. Methods: In October 2019, we systematically searched the literature for original studies published between 2018 and 2019 that described prediction models using AI to evaluate human pulmonary nodules on diagnostic chest images. Two evaluators independently extracted information from studies, such as study aims, sample size, AI type, patient characteristics, and performance. We summarised data descriptively. Results: The review included 153 studies: 136 (89%) development-only studies, 12 (8%) development and validation studies, and 5 (3%) validation-only studies. CT scans were the most common image type used (83%), often acquired from public databases (58%). Eight studies (5%) compared model outputs with biopsy results, and 41 studies (26.8%) reported patient characteristics. The models were based on different units of analysis, such as patients, images, nodules, or image slices or patches. Conclusion: The methods used to develop and evaluate prediction models using AI to detect, segment, or classify pulmonary nodules in medical imaging vary, are poorly reported, and are therefore difficult to evaluate. Transparent and complete reporting of methods, results and code would fill the gaps in information we observed in the study publications. Advances in knowledge: We reviewed the methodology of AI models detecting nodules on lung images and found that the models were poorly reported, frequently lacked a description of patient characteristics, and only rarely compared model outputs with biopsy results. When lung biopsy is not available, Lung-RADS could help standardise comparisons between the human radiologist and the machine. The field of radiology should not abandon the principles of diagnostic accuracy studies, such as choosing the correct ground truth, just because AI is used. Clear and complete reporting of the reference standard used would help radiologists trust the performance that AI models claim to have. This review presents clear recommendations about the essential methodological aspects of diagnostic models that should be incorporated into studies using AI to help detect or segment lung nodules. The manuscript also reinforces the need for more complete and transparent reporting, which can be aided by the use of the recommended reporting guidelines.

    Clinical prediction models and the multiverse of madness

    Background: Each year, thousands of clinical prediction models are developed to make predictions (e.g. estimated risk) to inform individual diagnosis and prognosis in healthcare. However, most are not reliable for use in clinical practice. Main body: We discuss how the creation of a prediction model (e.g. using regression or machine learning methods) is dependent on the sample and size of the data used to develop it: had a different sample of the same size been drawn from the same overarching population, the developed model could be very different, even when the same model development methods are used. In other words, for each model created, there exists a multiverse of other potential models for that sample size and, crucially, an individual's predicted value (e.g. estimated risk) may vary greatly across this multiverse. The more an individual's prediction varies across the multiverse, the greater the instability. We show how small development datasets lead to greater variation among the models in the multiverse, often with highly unstable individual predictions, and explain how this can be exposed by using bootstrapping and presenting instability plots. We recommend that healthcare researchers seek to use large model development datasets to reduce instability concerns. This is especially important to ensure reliability across subgroups and improve model fairness in practice. Conclusions: Instability is concerning because an individual's predicted value is used to guide their counselling, resource prioritisation, and clinical decision making. If different samples lead to different models with very different predictions for the same individual, then this should cast doubt on using a particular model for that individual. Therefore, visualising, quantifying and reporting the instability in individual-level predictions is essential when proposing a new model.
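
    The bootstrap check described above can be sketched as follows: refit the same model specification on bootstrap resamples of the development data and examine how much each individual's predicted risk varies across the refits. The simulated data, the logistic regression model (via scikit-learn) and the interval-width summary below are illustrative assumptions; the article itself presents instability plots rather than this simple numeric summary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical development dataset: X (predictors) and y (binary outcome).
n, p = 300, 5
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1]))))

# Model developed on the original sample.
original = LogisticRegression(max_iter=1000).fit(X, y)
original_risk = original.predict_proba(X)[:, 1]

# Refit the same model specification on B bootstrap resamples and store each
# bootstrap model's predicted risk for every individual in the original data.
B = 200
boot_risks = np.empty((B, n))
for b in range(B):
    idx = rng.integers(0, n, size=n)  # sample rows with replacement
    m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    boot_risks[b] = m.predict_proba(X)[:, 1]

# Instability summary: spread of bootstrap predictions for each individual
# (wide intervals around original_risk indicate an unstable prediction).
lower, upper = np.percentile(boot_risks, [2.5, 97.5], axis=0)
width = upper - lower
print(f"Median 95% instability interval width: {np.median(width):.3f}")
print(f"Widest interval for any individual:    {width.max():.3f}")
```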

    Poor handling of continuous predictors in clinical prediction models using logistic regression: a systematic review

    Background and Objectives: When developing a clinical prediction model, assuming a linear relationship between the continuous predictors and the outcome is not recommended. Incorrect specification of the functional form of continuous predictors can reduce predictive accuracy. We examined how continuous predictors are handled in studies developing a clinical prediction model. Methods: We searched PubMed for clinical prediction model studies developing a logistic regression model for a binary outcome, published between July 01, 2020, and July 30, 2020. Results: In total, 118 studies were included in the review; 18 studies (15%) assessed the linearity assumption or used methods to handle nonlinearity, and 100 studies (85%) did not. Transformations and splines were the methods most commonly used to handle nonlinearity, used in 7 (n = 7/18, 39%) and 6 (n = 6/18, 33%) studies, respectively. Categorization was the method most often used to handle continuous predictors (n = 67/118, 56.8%), and most of these studies used dichotomization (n = 40/67, 60%). Only ten of the models whose developers assessed nonlinearity included nonlinear terms in the final model (n = 10/18, 56%). Conclusion: Although it is widely recommended not to categorize continuous predictors or assume a linear relationship between the outcome and continuous predictors, most studies categorized continuous predictors, few assessed the linearity assumption, and even fewer used methodology to account for nonlinearity. Methodological guidance is provided to help researchers handle continuous predictors when developing a clinical prediction model.
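
    To illustrate the contrast the review draws between categorizing a continuous predictor and modelling it flexibly, the sketch below fits a logistic regression to a dichotomized predictor and to a cubic spline basis of the same predictor (scikit-learn's SplineTransformer, one of several possible spline approaches). The simulated U-shaped relationship and all values are illustrative, not taken from any reviewed study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(2)

# Hypothetical data with a genuinely nonlinear (U-shaped) predictor-outcome
# relationship; the functional form and all values are illustrative only.
x = rng.uniform(20, 80, size=2000).reshape(-1, 1)
risk = 1 / (1 + np.exp(-(0.004 * (x[:, 0] - 50) ** 2 - 1.5)))
y = rng.binomial(1, risk)

# Option discouraged in the review: dichotomize the predictor at a threshold.
x_binary = (x >= 50).astype(float)
dichotomized = LogisticRegression().fit(x_binary, y)

# Option recommended: keep the predictor continuous and model nonlinearity,
# here with a cubic spline basis.
splined = make_pipeline(SplineTransformer(degree=3, n_knots=5),
                        LogisticRegression(max_iter=1000)).fit(x, y)

# Apparent discrimination of each specification on the same data.
print("AUC, dichotomized:", round(roc_auc_score(y, dichotomized.predict_proba(x_binary)[:, 1]), 3))
print("AUC, spline:      ", round(roc_auc_score(y, splined.predict_proba(x)[:, 1]), 3))
```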

    Sample size requirements are not being considered in studies developing prediction models for binary outcomes: a systematic review

    Background: Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome. Methods: We searched PubMed for studies published between 01/07/2020 and 30/07/2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size that would be needed to estimate the overall risk and minimise overfitting in each study, and summarised the difference between the calculated and used sample size. Results: A total of 119 studies were included, of which nine (8%) provided a sample size justification. The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63–82%) used a sample size lower than that required to estimate the overall risk and minimise overfitting, including 26% of studies that used a sample size lower than that required to estimate the overall risk alone. A similar proportion of studies did not meet the ≥10 EPV criterion (75%, 95% CI: 66–84%). The median deficit in the number of events used to develop a model was 75 [IQR: 234 fewer to 7 more], which reduced to 63 if the total available data (before any data splitting) were used [IQR: 225 fewer to 7 more]. Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.90), and studies where the minimum sample size was not met had a median c-statistic of 0.83 (IQR: 0.75 to 0.90). Studies that met the ≥10 EPV criterion had a median c-statistic of 0.80 (IQR: 0.73 to 0.84). Conclusions: Prediction models are often developed with no sample size calculation, and as a consequence many are too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.
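
    The minimum sample size calculations referred to above are of the kind proposed by Riley and colleagues for binary-outcome models. The sketch below illustrates two such checks, precise estimation of the overall outcome proportion and a target expected shrinkage factor, together with the implied events per variable; all inputs are assumed values for illustration, and the exact criteria applied in the review may differ.

```python
import numpy as np

# Illustrative inputs (assumptions, not values from any reviewed study):
prevalence = 0.10   # anticipated outcome proportion
n_params = 12       # number of candidate predictor parameters
r2_cs = 0.12        # anticipated Cox-Snell R-squared of the model
shrinkage = 0.90    # target expected shrinkage factor
margin = 0.05       # acceptable absolute error in the overall risk estimate

# Check 1: sample size to estimate the overall outcome proportion to within
# +/- `margin` (normal-approximation confidence interval).
n_overall = (1.96 / margin) ** 2 * prevalence * (1 - prevalence)

# Check 2: sample size so that the expected uniform shrinkage is >= `shrinkage`
# (one of the Riley et al. minimum sample size criteria for binary outcomes).
n_shrinkage = n_params / ((shrinkage - 1) * np.log(1 - r2_cs / shrinkage))

n_min = int(np.ceil(max(n_overall, n_shrinkage)))
events = n_min * prevalence
print(f"Minimum n (overall risk):        {np.ceil(n_overall):.0f}")
print(f"Minimum n (shrinkage >= {shrinkage}):  {np.ceil(n_shrinkage):.0f}")
print(f"Suggested minimum n: {n_min}  (~{events:.0f} events, "
      f"EPV ~ {events / n_params:.1f})")
```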

    Evaluation of clinical prediction models (part 2): how to undertake an external validation study

    External validation studies are an important but often neglected part of prediction model research. In this article, the second in a series on model evaluation, Riley and colleagues explain what an external validation study entails and describe the key steps involved, from establishing a high-quality dataset to evaluating a model's predictive performance and clinical usefulness.
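
    A typical quantitative core of such an external validation, assessing the discrimination and calibration of an existing model's predicted risks in new data, can be sketched as follows. The simulated outcomes and predicted risks are placeholders, and assessment of clinical usefulness (e.g. decision curve analysis) is not shown.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

# Hypothetical external validation data: observed binary outcome `y` and the
# existing model's predicted risks `p` for each individual (names illustrative).
rng = np.random.default_rng(3)
p = np.clip(rng.beta(2, 8, size=1000), 0.01, 0.99)
y = rng.binomial(1, np.clip(p * 1.3, 0, 0.95))  # mild miscalibration, for illustration

# Discrimination: c-statistic (area under the ROC curve).
c_stat = roc_auc_score(y, p)

# Calibration slope: regress the outcome on the linear predictor (logit of risk).
lp = np.log(p / (1 - p))
slope = sm.GLM(y, sm.add_constant(lp), family=sm.families.Binomial()).fit()

# Calibration-in-the-large: intercept when the slope is fixed at 1 (offset).
citl = sm.GLM(y, np.ones_like(lp), family=sm.families.Binomial(),
              offset=lp).fit()

print(f"c-statistic:              {c_stat:.3f}")
print(f"Calibration slope:        {slope.params[1]:.3f}")
print(f"Calibration-in-the-large: {citl.params[0]:.3f}")
```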