    Optimal sizing of a holdout set for safe predictive model updating

    Predictive risk scores are increasingly used to guide clinical or other interventions in complex settings, particularly healthcare. Directly updating a risk score used to guide interventions leads to biased risk estimates. To prevent this, we propose updating using a "holdout set": a subset of the population that does not receive risk-score-guided interventions. Since samples in the holdout set do not benefit from risk predictions, its size must trade off the performance of the updated risk score against the number of held-out samples. We prove that this approach outperforms simple alternatives and, by defining a general loss function, describe conditions under which an optimal holdout size (OHS) can be readily identified. We introduce parametric and semi-parametric algorithms for OHS estimation and demonstrate their use on a recent risk score for pre-eclampsia. Based on these results, we argue that a holdout set is a safe, viable and easily implemented means to update predictive risk scores.

    Comment: Manuscript includes supplementary materials and figure
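The trade-off described in the abstract can be illustrated with a small numerical sketch. The abstract does not state the form of the loss function, so everything below is an assumption made for illustration: the total cost is taken to be `k1 * n + k2(n) * (N - n)`, where `n` is the holdout size, `N` the population size, `k1` a hypothetical per-sample cost in the holdout set, and `k2(n)` a hypothetical power-law learning curve `a * n**(-b) + c` giving the per-sample cost for those served by a score trained on `n` samples. The function names and parameter values are likewise invented here, and the grid search is a stand-in, not the parametric or semi-parametric algorithms the abstract refers to.

```python
def total_cost(n, N, k1, a, b, c):
    """Assumed total cost for a holdout set of size n out of N samples.

    k1         : assumed average cost per held-out sample (no score guidance)
    a, b, c    : assumed power-law learning curve, k2(n) = a * n**(-b) + c,
                 the average cost per sample guided by a score trained on n samples
    """
    k2 = a * n ** (-b) + c
    return k1 * n + k2 * (N - n)


def optimal_holdout_size(N, k1, a, b, c):
    """Exhaustive search over integer holdout sizes 1..N-1 for the minimum cost."""
    return min(range(1, N), key=lambda n: total_cost(n, N, k1, a, b, c))


# Hypothetical parameter values, chosen only so that the minimum is interior.
N, k1, a, b, c = 10_000, 0.5, 2.0, 0.5, 0.3
ohs = optimal_holdout_size(N, k1, a, b, c)
print("estimated optimal holdout size:", ohs)
print("total cost at that size:", total_cost(ohs, N, k1, a, b, c))
```

Under these assumptions, a very small holdout set yields a poorly trained score for everyone else, while a very large one withholds score guidance from too many samples; the minimum of the assumed cost lies between the two extremes.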