Improvements in clinical prediction research

Abstract

This thesis aims to improve methods of clinical prediction research. In clinical prediction research, patient characteristics, test results and disease characteristics are often combined in so-called prediction models to estimate the risk that a disease or outcome is present (diagnosis) or will occur (prognosis). This thesis focuses on the derivation, validation, updating, and application of prediction models. Dealing with missing values is an underappreciated aspect of medical research. Three methods were compared that can handle missing predictor values when a prediction model is derived (complete case analysis, dropping the predictor with missing values, and multiple imputation). Multiple imputation outperformed both other methods in terms of bias, coverage of the 90% confidence interval, and discriminative ability. Similarly, six methods were compared that can handle missing predictor values when a physician applies a prediction model to an individual patient with missing predictor values. Multiple imputation proved best able to improve the predictive performance of the prediction model, compared to imputation of the value zero, mean imputation, subgroup mean imputation, and applying a submodel consisting of only the observed predictors. Many prediction models are derived with dichotomous logistic regression analysis. Alternative methods are logistic regression with inherent shrinkage by penalised maximum likelihood estimation (PMLE) and genetic programming (a novel and promising search method that may improve the selection of predictors). Four derivation methods were compared, namely logistic regression, logistic regression with a single shrinkage factor, logistic regression with inherent shrinkage by PMLE, and genetic programming. The performance measures of the four models differed only slightly, and the 95% confidence intervals of the areas under the ROC curve mostly overlapped.
The choice between these derivation methods should be based on the characteristics of the data and situation at hand. The predictive performance of most derived prediction models decreases when they are tested in new patients. Therefore, before a prediction model can be applied in daily clinical practice, it needs to be tested (i.e. externally validated) in new patients. However, when the predictive performance is disappointing in the validation data set, the original prediction model is frequently rejected and the researchers simply proceed to build their own (new) prediction model on the data of their patients, thereby neglecting the prior information that is captured in previous studies. The alternative is to update existing prediction models. The updated models combine the information that is captured in the original model with the information of the new patients. As a result, updated models are adjusted to the new patients and are thus based on data from both the original and the new patients, potentially increasing their generalisability. We show the effect of these updating methods with empirical data, and give recommendations for their application. This thesis ends with an overview of the promises and pitfalls of using electronic patient records (EPR) as a basis for prediction research to enhance patient care, and vice versa. An EPR is a medical record in digital format that facilitates the storage and retrieval of data on patient care. Though the primary aim of the EPR is to aid patient care, it creates highly attractive opportunities for prediction research.
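The idea of updating an existing model, rather than deriving a new one, can be illustrated with a minimal sketch. The abstract does not state which updating methods the thesis evaluates; the example below assumes one commonly used approach, logistic recalibration, in which only an intercept and a calibration slope are re-estimated in the new patients while the original predictor weights are kept. The function names (`recalibrate`, `updated_risk`) and the simulated data are hypothetical and serve illustration only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def recalibrate(lp, y, n_iter=50, tol=1e-10):
    """Logistic recalibration (assumed updating method, see lead-in).

    lp : linear predictors of the ORIGINAL model, computed in new patients
    y  : observed 0/1 outcomes in the same new patients
    Fits logit(P(y=1)) = a + b * lp by Newton-Raphson and returns (a, b).
    """
    a, b = 0.0, 1.0  # start at "original model unchanged"
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, t in zip(lp, y):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += t - p          # score for the intercept
            g1 += (t - p) * x    # score for the slope
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:          # degenerate data: stop updating
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def updated_risk(lp_value, a, b):
    """Predicted risk from the updated model for one patient."""
    return sigmoid(a + b * lp_value)


if __name__ == "__main__":
    # Simulated validation set (hypothetical): the original model is too
    # extreme (true slope 0.7) and overestimates risk (true intercept -0.5).
    rng = random.Random(42)
    lp = [rng.gauss(0.0, 1.5) for _ in range(5000)]
    y = [1 if rng.random() < sigmoid(-0.5 + 0.7 * x) else 0 for x in lp]
    a, b = recalibrate(lp, y)
    print("updated intercept:", round(a, 3), "calibration slope:", round(b, 3))
    print("original vs updated risk at lp = 1:",
          round(sigmoid(1.0), 3), round(updated_risk(1.0, a, b), 3))
```

A calibration slope below 1 in the new patients suggests that the original predictions were too extreme, and a non-zero intercept suggests a systematic over- or underestimation of risk. More extensive updating, such as re-estimating individual predictor weights, is not shown here.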

Utrecht University Repository

Last time updated on 14/06/2016

This paper was published in Utrecht University Repository.
