Label Uncertainty and Learning Using Partially Available Privileged Information for Clinical Decision Support: Applications in Detection of Acute Respiratory Distress Syndrome

Abstract

Artificial intelligence and machine learning have the potential to transform healthcare by deriving new and important insights from the vast amount of data generated during routine care delivery. The digitization of health data provides an important opportunity for new knowledge discovery and improved care delivery through the development of clinical decision support that can leverage this data to support various aspects of healthcare, from early diagnosis to epidemiology, drug development, and robotic-assisted surgery. These diverse efforts share the ultimate goal of improving the quality of care and outcomes for patients. This thesis aims to tackle long-standing problems in machine learning and healthcare, such as modeling label uncertainty (e.g., from ambiguity in diagnosis or poorly labeled examples) and representation of data that may not be reliably accessible in a live environment. Label uncertainty arises because even clinical experts may have low confidence when assigning a medical diagnosis to some patients, due to ambiguity in the case or imperfect reliability of the diagnostic criteria. As a result, some data used for model training may be mislabeled, hindering the model’s ability to learn the complexity of the underlying task and adversely affecting the algorithm’s overall performance. In this work, I describe a heuristic approach for physicians to quantify their diagnostic uncertainty. I also propose an implementation of instance-weighted support vector machines to incorporate this information during model training. To address the issue of unreliable data, this thesis examines the idea of learning using “partially available” privileged information. This paradigm, based on knowledge transfer, allows models to use additional data that are available during training but may not be accessible during testing or deployment.
This type of data is abundant in healthcare, where much more information about a patient’s health status is available in retrospective analysis (e.g., in the training data) than in real-time environments (e.g., in the test set). In this thesis, the “privileged information” consists of features extracted from chest x-rays (CXRs) using novel feature engineering algorithms and transfer learning with deep residual networks. This example works well for numerous clinical applications, since CXRs are retrospectively accessible during model training but may not be available in a live environment due to delays in ordering, developing, and processing the request. This thesis is motivated by improving diagnosis of acute respiratory distress syndrome (ARDS), a life-threatening lung injury associated with high mortality. The diagnosis of ARDS serves as a model for many medical conditions where standard tests are not routinely available and diagnostic uncertainty is common. While this thesis focuses on improving diagnosis of ARDS, the proposed learning methods will generalize across various healthcare settings, allowing for better characterization of patient health status and improving the overall quality of patient care. This thesis also includes the development of methods for time-series analysis of longitudinal health data, signal processing techniques for quality assessment, lung segmentation from complex CXRs, and a novel feature extraction algorithm for quantification of pulmonary opacification. These algorithms were tested and validated on data obtained from patients at Michigan Medicine and additional external sources. These studies demonstrate that careful, principled use of methodologies in machine learning and artificial intelligence can potentially assist healthcare providers with early detection of ARDS and help make a timely, accurate medical diagnosis to improve outcomes for patients.

PhD, Bioinformatics
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/167930/1/nreamaro_1.pd
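As a rough illustration of the instance-weighting idea mentioned in the abstract, the sketch below trains a linear SVM in which each example's hinge loss is scaled by a confidence weight, so that a low-confidence (possibly mislabeled) case contributes less to the decision boundary. This is a minimal sketch under stated assumptions, not the thesis implementation: the abstract does not specify a solver, so plain subgradient descent is used here, and the function names, hyperparameters, and toy data are all hypothetical.

```python
# Minimal sketch of an instance-weighted linear SVM (pure Python).
# Assumption: weighted hinge loss minimized by full-batch subgradient descent;
# the thesis does not state its solver, and all names here are hypothetical.

def train_weighted_svm(X, y, weights, lam=0.01, lr=0.1, epochs=200):
    """Minimize (lam/2)*||w||^2 + (1/n) * sum_i c_i * max(0, 1 - y_i*(w.x_i + b))."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 regularizer
        grad_b = 0.0
        for xi, yi, ci in zip(X, y, weights):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # example violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= ci * yi * xi[j] / n
                grad_b -= ci * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the learned hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Synthetic one-feature toy data (invented for illustration only).
# The last example sits among the positives but is labeled -1 with low confidence.
X = [[2.0], [3.0], [4.0], [-2.0], [-3.0], [-4.0], [3.0]]
y = [1, 1, 1, -1, -1, -1, -1]
confidence = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.1]

w, b = train_weighted_svm(X, y, confidence)
print(predict(w, b, [3.0]), predict(w, b, [-3.0]))
```

In this sketch the confidence values stand in for clinician-reported diagnostic certainty; how such values are elicited and scaled is described in the thesis itself, not here.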
