182 research outputs found

    An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification

    While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered a "black box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hierarchical semantic convolutional neural network (HSCNN) to predict whether a given pulmonary nodule observed on a computed tomography (CT) scan is malignant. Our network provides two levels of output: 1) low-level radiologist semantic features, and 2) a high-level malignancy prediction score. The low-level semantic outputs quantify the diagnostic features used by radiologists and serve to explain how the model interprets the images in an expert-driven manner. The information from these low-level tasks, along with the representations learned by the convolutional layers, is then combined and used to infer the high-level task of predicting nodule malignancy. This unified architecture is trained by optimizing a global loss function that includes both low- and high-level tasks, thereby learning all the parameters within a joint framework. Our experimental results using the Lung Image Database Consortium (LIDC) show that the proposed method not only produces interpretable lung cancer predictions but also achieves significantly better results compared to common 3D CNN approaches.
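A global loss over low- and high-level tasks can be sketched as a weighted sum of per-task losses. The sketch below is an illustration only: the abstract does not state the form of the loss, so the use of binary cross-entropy for every task, the names `binary_cross_entropy` and `global_loss`, and the `task_weight` parameter are all assumptions rather than the authors' formulation.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross-entropy for one binary label and one predicted probability."""
    # Clamp the probability so that log() never receives 0.
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def global_loss(semantic_true, semantic_pred,
                malignancy_true, malignancy_pred, task_weight=1.0):
    """Hypothetical joint loss: malignancy loss plus weighted semantic losses.

    semantic_true / semantic_pred: one binary label / probability per
    low-level semantic feature (assumed binary here for simplicity).
    task_weight: assumed scalar balancing low-level against high-level tasks.
    """
    low_level = sum(
        binary_cross_entropy(t, p)
        for t, p in zip(semantic_true, semantic_pred)
    )
    high_level = binary_cross_entropy(malignancy_true, malignancy_pred)
    return high_level + task_weight * low_level


# Two semantic features and one malignancy label, all predicted at 0.5.
loss = global_loss([1, 0], [0.5, 0.5], 1, 0.5)
```

Because every term depends on shared parameters in a real network, minimizing a sum of this kind updates the low-level and high-level outputs together, which is the sense of "joint framework" in the abstract.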

    Predicting Panel Ratings for Semantic Characteristics of Lung Nodules

    In reading CT scans with potentially malignant lung nodules, radiologists make use of high-level information (semantic characteristics) in their analysis. CAD systems can assist radiologists by offering a “second opinion”: predicting these semantic characteristics for lung nodules. In our previous work, we developed such a CAD system, training and testing it on the publicly available Lung Image Database Consortium (LIDC) dataset, which includes semantic annotations by up to four human radiologists for every nodule. However, due to the lack of ground truth and the uncertainty in the dataset, each nodule was viewed as four distinct instances when training the classifier. In this work, we propose a way of predicting the distribution of opinions of the four radiologists using a multiple-label classification algorithm based on belief decision trees. We evaluate our results using a distance-threshold curve and, measuring the area under this curve, obtain 69% accuracy on the testing subset. We conclude that multiple-label classification algorithms are an appropriate method of representing the diagnoses of multiple radiologists on lung CT scans when a single ground truth is not available.

    Segmentation and classification of lung nodules from Thoracic CT scans : methods based on dictionary learning and deep convolutional neural networks.

    Lung cancer is a leading cause of cancer death in the world. Early diagnosis is key to patient survival. Studies have demonstrated that screening high-risk patients with low-dose Computed Tomography (CT) is invaluable for reducing morbidity and mortality. Computer-Aided Diagnosis (CADx) systems can assist radiologists and care providers in reading and analyzing lung CT images to segment, classify, and keep track of nodules for signs of cancer. In this thesis, we propose a CADx system for this purpose. To predict lung nodule malignancy, we propose a new deep learning framework that combines Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) to learn the best in-plane and inter-slice visual features for diagnostic nodule classification. Separately, since a nodule's volumetric growth and shape variation over time may reveal information regarding its malignancy, a dictionary learning based approach is proposed to segment the nodule's shape at two time points from two scans, one year apart. The output of a CNN classifier trained to learn the visual appearance of malignant nodules is then combined with the derived measures of shape change and volumetric growth in assigning a probability of malignancy to the nodule. Due to the limited number of available CT scans of benign and malignant nodules in the image database from the National Lung Screening Trial (NLST), we chose to initially train a deep neural network on the larger LUNA16 Challenge database, which was built for the purpose of eliminating false positives from detected nodules in thoracic CT scans. Discriminative features that were learned in this application were transferred to predict malignancy. The algorithm for segmenting nodule shapes in serial CT scans utilizes a sparse combination of training shapes (SCoTS). This algorithm captures a sparse representation of a shape in input data through a linear span of previously delineated shapes in a training repository.
The model updates the shape prior over level set iterations and captures variabilities in shapes by a sparse combination of the training data. The level set evolution is therefore driven by a data term as well as a term capturing valid prior shapes. During evolution, the shape prior influence is adjusted based on shape reconstruction, with the assigned weight determined from the degree of sparsity of the representation. The discriminative nature of sparse representation affords us the opportunity to compare nodules' variations at consecutive time points and to predict malignancy. Experimental validation of the proposed segmentation algorithm has been demonstrated on 542 3-D lung nodules from the LIDC-IDRI database, which includes radiologist-delineated nodule boundaries. The effectiveness of the proposed deep learning and dictionary learning architectures for malignancy prediction has been demonstrated on CT data from 370 biopsied subjects collected from the NLST database. Each subject in this database had at least two serial CT scans at two separate time points one year apart. The proposed RNN CAD system achieved an ROC area under the curve (AUC) of 0.87 when validated on CT data from nodules at the second sequential time point, and the dictionary learning method achieved 0.83; when nodule shape change and appearance were combined, classifier performance improved to AUC = 0.89.

    Thoracic Disease Identification and Localization with Limited Supervision

    Accurate identification and localization of abnormalities from radiology images play an integral part in clinical diagnosis and treatment planning. Building a highly accurate prediction model for these tasks usually requires a large number of images manually annotated with labels and finding sites of abnormalities. In reality, however, such annotated data are expensive to acquire, especially the ones with location annotations. We need methods that can work well with only a small amount of location annotations. To address this challenge, we present a unified approach that simultaneously performs disease identification and localization through the same underlying model for all images. We demonstrate that our approach can effectively leverage both class information and limited location annotation, and significantly outperforms the comparative reference baseline in both classification and localization tasks. Comment: Conference on Computer Vision and Pattern Recognition 2018 (CVPR 2018). V1: CVPR submission; V2: +supplementary; V3: CVPR camera-ready; V4: correction, update reference baseline results according to their latest post; V5: minor correction; V6: Identification results using NIH data splits and various image models.

    A Comparative Study of HARR Feature Extraction and Machine Learning Algorithms for Covid-19 X-Ray Image Classification

    In this study, we investigated how effectively COVID-19 images can be categorized using Harr feature extraction and machine learning algorithms. A dataset of 500 X-ray scans, equally split between 250 COVID-19-positive cases and 250 healthy controls, served as the basis for our study. K-nearest neighbors, decision tree, linear regression, support vector machine, regression, classification, naive Bayes, random forest, and linear discriminant analysis were among the seven machine-learning approaches used to categorize the images. Image features were extracted using Harr feature extraction. In the present investigation, we studied the efficacy of classifying COVID-19 X-ray images by combining machine learning with Harr feature extraction. We used a database of 500 X-rays for this investigation, divided equally between 250 COVID-19-positive patients and 250 healthy people. The images were then examined using seven machine learning approaches for recognition. These methods included naive Bayes, linear discriminant analysis, random forests, classification, k-nearest neighbors, and regression trees. The information from the images was gathered using the Harr feature extraction method. The effectiveness of the algorithms was evaluated with a variety of metrics, such as F1 score, precision, accuracy, recall, the area under the ROC curve, and the region of interest curve. According to our research, the Support Vector Machine algorithm had the highest accuracy, at 77%, while the Naive Bayes approach had the lowest accuracy, at 58%. Based on our research, the Random Forest method yields the best results when machine learning is combined with Harr feature extraction.
The development of future COVID-19 X-ray image-based automated diagnostic systems may be influenced by these findings. Results from the suggested model were comparable to those of cutting-edge models trained using transfer learning techniques. The proposed model's main advantage is that it has ten times fewer parameters than the most advanced models. The F1 score, the area under the receiver operating characteristic (ROC) curve, and the algorithms' accuracy, precision, and recall were all used as metrics. According to our findings, the Naive Bayes method had the lowest accuracy (58%) and the Support Vector Machine method produced the highest accuracy (77%). Our results reveal that, when Harr feature extraction is combined with machine learning techniques, the Random Forest strategy is the most successful way to recognize COVID-19 X-ray images. These findings may be pertinent to the development of automated COVID-19 diagnosis tools relying on X-ray images. The recommended model produced results that were competitive with cutting-edge models trained using transfer learning techniques. The suggested model employs 10 times fewer parameters than the most advanced models, which is its key selling point.

    Adaptive Feature Engineering Modeling for Ultrasound Image Classification for Decision Support

    Ultrasonography is considered a relatively safe option for the diagnosis of benign and malignant cancer lesions due to the low-energy sound waves used. However, the visual interpretation of ultrasound images is time-consuming and usually produces many false alerts due to speckle noise. Improved methods of collecting image-based data have been proposed to reduce noise in the images; however, this has not solved the problem, due to the complex nature of the images and the exponential growth of biomedical datasets. Secondly, the target class in real-world biomedical datasets, that is, the focus of interest of a biopsy, is usually significantly underrepresented compared to the non-target class. This makes it difficult to train standard classification models like Support Vector Machine (SVM), Decision Trees, and Nearest Neighbor techniques on biomedical datasets because they assume an equal class distribution or an equal misclassification cost. Resampling techniques that either oversample the minority class or under-sample the majority class have been proposed to mitigate the class imbalance problem, but with minimal success. We propose a method of resolving the class imbalance problem with the design of a novel data-adaptive feature engineering model for extracting, selecting, and transforming textural features into a feature space that is inherently relevant to the application domain. We hypothesize that maximizing the variance and preserving as much variability as possible in well-engineered features prior to applying a classifier model will boost the differentiation of thyroid nodules (benign or malignant) through effective model building. We propose a hybrid approach that applies regression and rule-based techniques to build our feature engineering model and a Bayesian classifier, respectively.
In the feature engineering model, we transformed image pixel intensity values into a high-dimensional structured dataset and fitted a regression analysis model to estimate relevant kernel parameters to be applied to the proposed filter method. We adopted an Elastic Net Regularization path to control the maximum log-likelihood estimation of the regression model. Finally, we applied Bayesian network inference to estimate a subset of the textural features with a significant conditional dependency in the classification of the thyroid lesion. This is performed to establish the conditional influence of the textural features on the random factors generated through our feature engineering model and to evaluate the success criterion of our approach. The proposed approach was tested and evaluated on a public dataset obtained from thyroid cancer ultrasound diagnostic data. The analyses of the results showed that classification performance improved significantly overall in accuracy and area under the curve when the proposed feature engineering model was applied to the data. We show that a high performance of 96.00% accuracy, with a sensitivity and specificity of 99.64% and 90.23% respectively, was achieved for a filter size of 13 × 13.
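The accuracy, sensitivity, and specificity figures quoted above follow the standard confusion-matrix definitions. A minimal sketch of those definitions is given below; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary metrics from confusion-matrix counts.

    tp / fn: malignant cases predicted malignant / benign.
    tn / fp: benign cases predicted benign / malignant.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        # Fraction of malignant cases that were detected.
        "sensitivity": tp / (tp + fn),
        # Fraction of benign cases that were correctly cleared.
        "specificity": tn / (tn + fp),
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Illustrative counts only: 10 malignant and 10 benign cases.
metrics = classification_metrics(tp=8, fn=2, tn=9, fp=1)
```

With imbalanced classes, accuracy is dominated by the majority class, which is why sensitivity and specificity are reported alongside it.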

    Computational methods for the analysis of functional 4D-CT chest images.

    Medical imaging is an important emerging technology that has been used intensively in the last few decades for disease diagnosis and monitoring as well as for the assessment of treatment effectiveness. Medical images provide a very large amount of valuable information, too large to be fully exploited by radiologists and physicians. Therefore, the design of computer-aided diagnostic (CAD) systems, which can be used as an assistive tool for the medical community, is of great importance. This dissertation deals with the development of a complete CAD system for patients with lung cancer, which remains the leading cause of cancer-related death in the USA. In 2014, there were approximately 224,210 new cases of lung cancer and 159,260 related deaths. The process begins with the detection of lung cancer through the diagnosis of lung nodules (a manifestation of lung cancer). These nodules are approximately spherical regions of primarily high-density tissue that are visible in computed tomography (CT) images of the lung. The treatment of these lung cancer nodules is complex; nearly 70% of lung cancer patients require radiation therapy as part of their treatment. Radiation-induced lung injury is a limiting toxicity that may decrease cure rates and increase morbidity and mortality. Finding ways to accurately detect lung injury at an early stage, and hence prevent it, will have significant positive consequences for lung cancer patients. The ultimate goal of this dissertation is to develop a clinically usable CAD system that can improve the sensitivity and specificity of early detection of radiation-induced lung injury, based on the hypothesis that irradiated lung tissues may be affected and suffer a decrease in functionality as a side effect of radiation therapy.
These hypotheses have been validated by demonstrating that automatic segmentation of the lung regions and registration of consecutive respiratory phases, used to estimate elasticity, ventilation, and texture features, provide discriminatory descriptors that can be used for early detection of radiation-induced lung injury. The proposed methodologies will lead to novel indexes for distinguishing normal/healthy and injured lung tissues in clinical decision-making. To achieve this goal, a CAD system for accurate detection of radiation-induced lung injury has been developed; it requires three basic components: lung field segmentation, lung registration, and feature extraction and tissue classification. This dissertation starts with an exploration of the available medical imaging modalities to present the importance of medical imaging in today’s clinical applications. Secondly, the methodologies, challenges, and limitations of recent CAD systems for lung cancer detection are covered. This is followed by an accurate segmentation methodology for the lung parenchyma, with a focus on pathological lungs, to extract the volume of interest (VOI) to be analyzed for the potential existence of lung injuries stemming from radiation therapy. After the segmentation of the VOI, a lung registration framework is introduced to perform a crucial step that ensures the co-alignment of the intra-patient scans. This step eliminates the effects of orientation differences, motion, breathing, heartbeats, and differences in scanning parameters so that the functionality features for the lung fields can be extracted accurately. The developed registration framework also helps in the evaluation and gated control of the radiotherapy through motion estimation analysis before and after the therapy dose.
Finally, the radiation-induced lung injury detection framework is introduced, which combines the previous two medical image processing and analysis steps with the feature estimation and classification step. This framework estimates and combines both texture and functional features. The texture features are modeled using the novel 7th-order Markov Gibbs random field (MGRF) model, which can accurately model the texture of healthy and injured lung tissues by simultaneously accounting for both vertical and horizontal relative dependencies between voxel-wise signals. The functionality features are calculated from the deformation fields, obtained from the 4D-CT lung registration, that map lung voxels between successive CT scans in the respiratory cycle. These functionality features describe the ventilation (the air flow rate) of the lung tissues using the Jacobian of the deformation field, and the tissues’ elasticity using the strain components calculated from the gradient of the deformation field. Finally, these features are combined in the classification model to detect the injured parts of the lung at an early stage and enable earlier intervention.
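The relation between a deformation field, its Jacobian, and strain can be illustrated with a simplified 2D sketch. It assumes a displacement field sampled on a regular grid, central finite differences, and the small-strain approximation; the dissertation works with 3D volumes from 4D-CT registration and its actual numerical scheme is not given in the abstract, so the function names and the toy field below are hypothetical.

```python
def displacement_gradient(ux, uy, i, j, spacing=1.0):
    """Central-difference gradient of a 2D displacement field at (i, j).

    ux[i][j], uy[i][j]: x and y displacement at grid point (i, j),
    where i indexes x and j indexes y. (i, j) must be an interior point.
    Returns (dux/dx, dux/dy, duy/dx, duy/dy).
    """
    h = 2.0 * spacing
    dux_dx = (ux[i + 1][j] - ux[i - 1][j]) / h
    dux_dy = (ux[i][j + 1] - ux[i][j - 1]) / h
    duy_dx = (uy[i + 1][j] - uy[i - 1][j]) / h
    duy_dy = (uy[i][j + 1] - uy[i][j - 1]) / h
    return dux_dx, dux_dy, duy_dx, duy_dy


def jacobian_determinant(grad):
    """Determinant of the Jacobian of the mapping x -> x + u(x).

    Values above 1 indicate local expansion, below 1 local contraction.
    """
    a, b, c, d = grad
    return (1.0 + a) * (1.0 + d) - b * c


def small_strain(grad):
    """Small-strain components (e_xx, e_yy, e_xy) from the gradient."""
    a, b, c, d = grad
    return a, d, 0.5 * (b + c)


# Toy field: uniform 10% expansion, u = 0.1 * x, v = 0.1 * y, on a 3x3 grid.
ux = [[0.1 * i for j in range(3)] for i in range(3)]
uy = [[0.1 * j for j in range(3)] for i in range(3)]
grad = displacement_gradient(ux, uy, 1, 1)
det = jacobian_determinant(grad)  # close to 1.1 * 1.1 for this toy field
strain = small_strain(grad)
```

In this toy case the determinant exceeds 1 everywhere, the kind of local volume change that a Jacobian-based ventilation measure is meant to capture between respiratory phases.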