Uncertainty in Lung Cancer Stage for Outcome Estimation via Set-Valued Classification

Abstract

Difficulty in identifying cancer stage in health care claims data has limited research on oncology quality of care and health outcomes. We fit prediction algorithms that classify lung cancer stage into three classes (stages I/II, stage III, and stage IV) using claims data, and then demonstrate a method for incorporating the classification uncertainty into outcomes estimation. Leveraging set-valued classification and split conformal inference, we show how a fixed algorithm developed in one cohort of data may be deployed in another, while rigorously accounting for uncertainty from the initial classification step. We demonstrate this process using SEER cancer registry data linked with Medicare claims data.

Comment: Code available at: https://github.com/sl-bergquist/cancer_classificatio
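To illustrate the general idea of set-valued classification with split conformal inference, the following is a minimal sketch and not the authors' implementation. It assumes a fixed classifier that outputs class probabilities for the three stage groups (coded here as 0 = stages I/II, 1 = stage III, 2 = stage IV), a labeled calibration cohort, and the common nonconformity score of one minus the predicted probability of the true class. The function names, the simulated data, and the miscoverage level `alpha = 0.1` are all hypothetical choices made for illustration.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split conformal threshold from a labeled calibration cohort.

    Score for each calibration case: 1 - predicted probability of its true class.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return 1.0  # too few calibration cases: every class is kept
    return float(np.sort(scores)[k - 1])


def prediction_sets(probs, threshold):
    """Boolean matrix: entry (i, c) is True when class c is in the set for case i."""
    return (1.0 - probs) <= threshold


def simulate(n, rng):
    """Hypothetical stand-in for classifier output: probabilities and true labels."""
    probs = rng.dirichlet([0.3, 0.3, 0.3], size=n)
    # Draw each label from its own probability row (a well-calibrated classifier).
    u = rng.random(n)
    labels = (u[:, None] > np.cumsum(probs, axis=1)).sum(axis=1)
    return probs, np.minimum(labels, 2)


rng = np.random.default_rng(0)
cal_probs, cal_labels = simulate(2000, rng)   # calibration cohort (labels known)
new_probs, new_labels = simulate(2000, rng)   # deployment cohort

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
sets = prediction_sets(new_probs, threshold)

# Labels for the deployment cohort are used here only to check the sketch.
coverage = sets[np.arange(len(new_labels)), new_labels].mean()
set_sizes = sets.sum(axis=1)
print(f"threshold={threshold:.3f} coverage={coverage:.3f} "
      f"mean set size={set_sizes.mean():.2f}")
```

Under exchangeability of the calibration and deployment cases, sets built this way contain the true class with probability at least `1 - alpha` on average. A case whose set holds more than one stage group is one whose stage the classifier cannot resolve at that level, and that ambiguity is what a downstream outcome analysis would need to carry forward.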