Using Health Economics Tools to Enhance the Clinical Utility of Artificial Intelligence-Based Diagnostics: A Case Study in Breast Cancer Screening

Abstract

Thesis (Ph.D.)--University of Washington, 2020

Researchers in artificial intelligence (AI) have recently produced several products for medical diagnosis that perform at the level of human clinicians. For these products to be adopted, however, they must also be trusted by clinicians and shown to produce positive effects in patients. One important area where AI may be applied is breast cancer screening, which, despite its benefits, currently harms many women through false positives and overdiagnosis. This dissertation used two tools from health economics – discrete choice experiments and outcomes modeling – to address translational issues affecting AI, all in the setting of breast cancer screening.

In the first aim, we assessed primary care providers’ (PCPs’) preferences for a hypothetical AI system for mammogram interpretation. We used qualitative interviewing to develop a discrete choice instrument, which we administered online to ninety-one PCPs from around the United States. While advances in AI’s diagnostic accuracy were important to respondents, they also reported valuing the diversity of training data and the understandability of AI decision-making. The surveyed PCPs were broadly accepting of using AI to “triage” likely negative screens so that radiologists need not interpret every image.

In the second aim, we used outcomes modeling to compare the performance of 28 AI algorithms that had been developed for breast cancer screening. We first performed receiver operating characteristic (ROC) curve analysis to obtain a conventional metric (area under the curve) for model comparison. We then used a model of breast cancer screening and outcomes to estimate the quality-adjusted life years (QALYs) associated with using each algorithm at its optimal operating point. These outcomes were compared with those associated with two other methods of operating point selection – Youden’s index and decision curve analysis. Outcomes modeling ranked algorithms in the same order as area under the ROC curve and did not produce substantially different outcomes at the QALY-optimizing operating point compared with decision curve analysis. This suggests that outcomes modeling may be most useful for model comparison and operating point selection when detailed data, including case heterogeneity, are available.
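For reference, the two conventional operating-point criteria mentioned above have standard closed forms (these are the textbook definitions, not formulas taken from the dissertation itself). Youden’s index selects the threshold maximizing

\[ J = \text{sensitivity} + \text{specificity} - 1, \]

while decision curve analysis, in one common formulation, evaluates a classifier at a threshold probability \(p_t\) (reflecting the assumed harm-to-benefit ratio of intervention) via the net benefit

\[ \mathrm{NB}(p_t) = \frac{TP}{N} - \frac{FP}{N} \cdot \frac{p_t}{1 - p_t}, \]

where \(TP\) and \(FP\) are the true- and false-positive counts among \(N\) screened patients.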
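As an illustration of how AUC-based model comparison and Youden-based operating point selection might be computed in practice, here is a minimal Python sketch using scikit-learn. The synthetic labels, scores, and seed are placeholders for illustration only; they are not the dissertation’s 28 algorithms or its screening data.

```python
# Minimal sketch: compare a classifier by AUC and pick a Youden-optimal
# operating point. All data here are synthetic placeholders.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                      # 0 = no cancer, 1 = cancer
y_score = np.clip(y_true * 0.3 + rng.normal(0.4, 0.25, size=1000), 0, 1)

auc = roc_auc_score(y_true, y_score)                        # conventional comparison metric
fpr, tpr, thresholds = roc_curve(y_true, y_score)

j = tpr - fpr                                               # Youden's J at each candidate threshold
best = np.argmax(j)
print(f"AUC = {auc:.3f}; Youden-optimal threshold = {thresholds[best]:.3f} "
      f"(sensitivity {tpr[best]:.2f}, specificity {1 - fpr[best]:.2f})")
```

The dissertation’s QALY-based selection would replace the `j = tpr - fpr` criterion with the output of a breast cancer screening outcomes model evaluated at each operating point.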
