Developing and Applying CAD-generated Image Markers to Assist Disease Diagnosis and Prognosis Prediction

Abstract

Over the last two decades, developing computer-aided detection and/or diagnosis (CAD) schemes has been an active research topic in medical imaging informatics (MII), with promising results in helping clinicians make better diagnostic and/or clinical decisions. To build robust CAD schemes, we need to develop state-of-the-art image processing and machine learning (ML) algorithms to optimize each step in the CAD pipeline, including detection and segmentation of the region of interest, optimal feature generation, and integration into ML classifiers. In my dissertation, I conducted multiple studies investigating the feasibility of developing several novel CAD schemes for different medical purposes. The first study aims to investigate how to optimally develop a CAD scheme of contrast-enhanced digital mammography (CEDM) images to classify breast masses. CEDM includes both low-energy (LE) and dual-energy subtracted (DES) images. A CAD scheme was applied to segment mass regions depicted on LE and DES images separately. Optimal segmentation results generated from DES images were also mapped to LE images, and vice versa. After image features were computed, multilayer perceptron-based ML classifiers integrated with a correlation-based feature subset evaluator and a leave-one-case-out cross-validation method were built to classify mass regions. The study demonstrated that DES images eliminated the overlapping effect of dense breast tissue, which helps improve mass segmentation accuracy. By mapping mass regions segmented from DES images to LE images, the CAD scheme yields significantly improved performance. The second study aims to develop a new quantitative image marker computed from pre-intervention computed tomography perfusion (CTP) images and evaluate its feasibility in predicting clinical outcome among acute ischemic stroke (AIS) patients undergoing endovascular mechanical thrombectomy after diagnosis of large vessel occlusion.
A CAD scheme is first developed to pre-process CTP images of different scanning series for each study case, perform image segmentation, quantify contrast-enhanced blood volumes in the bilateral cerebral hemispheres, and compute image features related to asymmetrical cerebral blood flow patterns based on the cumulative cerebral blood flow curves of the two hemispheres. Next, image markers based on a single optimal feature and ML models fusing multiple features are developed and tested to classify AIS cases into two classes of good and poor prognosis based on the Modified Rankin Scale. The study results show that an ML model trained using multiple features yields significantly higher classification performance than the image marker using the best single feature (p < 0.01). This study demonstrates the feasibility of developing a new CAD scheme to predict the prognosis of AIS patients in the hyperacute stage, which has the potential to assist clinicians in optimally treating and managing AIS patients. The third study aims to develop and test a new CAD scheme to predict prognosis in aneurysmal subarachnoid hemorrhage (aSAH) patients using brain CT images. Each patient had two sets of CT images, acquired at admission and prior to discharge. The CAD scheme was applied to segment intracranial brain regions into four subregions, namely cerebrospinal fluid (CSF), white matter (WM), gray matter (GM), and extraparenchymal blood (EPB). The CAD scheme then computed nine image features related to five volumes (of the segmented sulci, EPB, CSF, WM, and GM) and four volumetric ratios to sulci. Subsequently, 16 ML models were built using multiple features computed from CT images acquired either at admission or prior to discharge to predict eight prognosis-related parameters.
The results show that ML models trained using CT images acquired at admission yielded higher accuracy in predicting short-term clinical outcomes, while ML models trained using CT images acquired prior to discharge had higher accuracy in predicting long-term clinical outcomes. Thus, this study demonstrated the feasibility of predicting the prognosis of aSAH patients using new ML model-generated quantitative image markers. The fourth study aims to develop and test a new interactive computer-aided detection (ICAD) tool to quantitatively assess hemorrhage volumes. After loading each case, the ICAD tool first segments the intracranial brain volume and performs CT labeling of each voxel. Next, contour-guided image-thresholding techniques based on CT Hounsfield units are used to estimate and segment hemorrhage-associated voxels (ICH). Two experienced neurology residents then examine and correct the markings of ICH, categorized as either intraparenchymal hemorrhage (IPH) or intraventricular hemorrhage (IVH), to obtain the true markings. Additionally, the volume and maximum two-dimensional diameter of each hemorrhage sub-type are computed for understanding ICH prognosis. Agreement in hemorrhage-region segmentation between the semi-automated ICAD tool and the residents' verified true markings is evaluated using the Dice similarity coefficient (DSC). The data analysis results demonstrate that the new ICAD tool can segment and quantify ICH and other hemorrhage volumes with high DSC. Finally, the fifth study aims to bridge the gap between traditional radiomics and deep learning systems by comparing and assessing these two technologies in classifying breast lesions. First, one CAD scheme is applied to segment lesions and compute radiomics features. In contrast, another scheme applies a pre-trained residual network architecture (ResNet50) as a transfer learning model to extract automated features.
Next, a principal component analysis (PCA) algorithm processes both the initially computed radiomics features and the automated features to create optimal feature vectors. Then, several support vector machine (SVM) classifiers are built using the optimized radiomics or automated features. This study indicates that (1) a CAD scheme built using only deep transfer learning yields higher classification performance than the traditional radiomics-based model, (2) an SVM trained using the fused radiomics and automated features does not yield significantly higher AUC, and (3) radiomics and automated features contain highly correlated information in lesion classification. In summary, across all these studies, I developed and investigated several key components of the CAD pipeline, including (i) pre-processing algorithms, (ii) automatic detection and segmentation schemes, (iii) feature extraction and optimization methods, and (iv) ML and data analysis models. All developed CAD models are embedded in interactive, visually aided graphical user interfaces (GUIs) to provide user functionality. These techniques present innovative approaches for generating quantitative image markers used to build optimal ML models. The study results indicate the potential of the underlying CAD schemes to assist radiologists in clinical settings in diagnosing disease and to improve their overall performance.
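As an illustration of the Dice similarity coefficient (DSC) used in the fourth study to compare ICAD segmentations with the residents' markings, a minimal Python sketch follows. It assumes each segmentation is available as a flattened binary mask with one value per voxel. The function name `dice_similarity`, the toy masks, and the convention that two empty masks score 1.0 are illustrative assumptions, not details taken from the dissertation.

```python
from typing import Sequence


def dice_similarity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient between two binary masks of equal length.

    DSC = 2 * |A intersect B| / (|A| + |B|), where A and B are the sets of
    voxels marked as hemorrhage in each mask.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total marked voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy flattened masks (hypothetical values, not data from the study).
icad_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reader_mask = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_similarity(icad_mask, reader_mask))  # 0.75
```

Here three voxels are marked in both masks and each mask marks four voxels, so the DSC is 2 * 3 / (4 + 4) = 0.75. A value of 1.0 would indicate identical markings and 0.0 no overlap.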
