7 research outputs found

    Transferable Multi-model Ensemble for Benign-Malignant Lung Nodule Classification on Chest CT

    The classification of benign versus malignant lung nodules on chest CT plays a pivotal role in the early detection of lung cancer, and early detection offers the best chance of cure. Although deep learning is now the most successful solution for image classification problems, it requires a large amount of training data, which is not usually readily available for most routine medical imaging applications. In this paper, we propose the transferable multi-model ensemble (TMME) algorithm to separate malignant from benign lung nodules using limited chest CT data. This algorithm transfers the image representation abilities of three ResNet-50 models, which were pre-trained on the ImageNet database, to characterize the overall appearance, heterogeneity of voxel values and heterogeneity of shape of lung nodules, respectively, and jointly utilizes them to classify lung nodules with an adaptive weighting scheme learned during error back-propagation. Experimental results on the benchmark LIDC-IDRI dataset show that our proposed TMME algorithm achieves a lung nodule classification accuracy of 93.40%, which is markedly higher than the accuracy of seven state-of-the-art approaches.
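As a rough illustration of the adaptive weighting idea described in the abstract above, the following sketch combines the malignancy probabilities of three models with learnable weights and fits those weights by gradient descent on a binary cross-entropy loss. The function names, the softmax parameterization of the weights, and the learning rate are assumptions made for illustration; the paper's actual networks and training procedure are not reproduced here.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def ensemble_predict(logit_weights, model_probs):
    """Weighted average of per-model malignancy probabilities.

    model_probs has shape (n_models, n_samples); the weights are
    softmax-normalized so they stay positive and sum to one
    (an assumption of this sketch, not a detail from the paper).
    """
    return softmax(logit_weights) @ model_probs

def train_weights(model_probs, labels, lr=0.5, steps=200):
    """Fit the ensemble weights by gradient descent on cross-entropy."""
    logit_w = np.zeros(model_probs.shape[0])
    eps = 1e-7
    for _ in range(steps):
        w = softmax(logit_w)
        p = np.clip(w @ model_probs, eps, 1 - eps)
        # Gradient of the mean binary cross-entropy w.r.t. the ensemble output.
        dL_dp = (p - labels) / (p * (1 - p)) / labels.size
        # Chain rule through the weighted sum, then through the softmax.
        dL_dw = model_probs @ dL_dp
        dL_dz = w * (dL_dw - np.dot(w, dL_dw))
        logit_w -= lr * dL_dz
    return logit_w
```

Calling `softmax(train_weights(probs, labels))` on held-out per-model probabilities returns the learned weights, so a model whose outputs track the labels more closely ends up with a larger share of the ensemble.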

    Parse and Recall: Towards Accurate Lung Nodule Malignancy Prediction like Radiologists

    Lung cancer is a leading cause of death worldwide and early screening is critical for improving survival outcomes. In clinical practice, the contextual structure of nodules and the accumulated experience of radiologists are the two core elements that determine how accurately benign and malignant nodules are identified. Contextual information provides comprehensive information about nodules such as location, shape, and peripheral vessels, and experienced radiologists can search for clues from previous cases as a reference to enrich the basis of decision-making. In this paper, we propose a radiologist-inspired method to simulate the diagnostic process of radiologists, which is composed of context parsing and prototype recalling modules. The context parsing module first segments the context structure of nodules and then aggregates contextual information for a more comprehensive understanding of the nodule. The prototype recalling module utilizes prototype-based learning to condense previously learned cases into prototypes for comparative analysis; the prototypes are updated online with a momentum scheme during training. Building on the two modules, our method leverages both the intrinsic characteristics of the nodules and the external knowledge accumulated from other nodules to achieve a sound diagnosis. To meet the needs of both low-dose and noncontrast screening, we collect a large-scale dataset of 12,852 and 4,029 nodules from low-dose and noncontrast CTs respectively, each with pathology- or follow-up-confirmed labels. Experiments on several datasets demonstrate that our method achieves advanced screening performance on both low-dose and noncontrast scenarios. Comment: MICCAI 202
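The momentum-style prototype update mentioned in the abstract above could look roughly like the sketch below. It assumes one prototype vector per class, an exponential-moving-average update toward the mean feature of the current batch, and Euclidean nearest-prototype lookup; these choices and all function names are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np

def update_prototypes(prototypes, features, labels, momentum=0.9):
    """Move each class prototype toward the batch mean of its class.

    prototypes: (n_classes, d) array, features: (n, d) array,
    labels: (n,) integer class indices. Classes absent from the
    batch keep their previous prototype.
    """
    updated = prototypes.copy()
    for c in np.unique(labels):
        batch_mean = features[labels == c].mean(axis=0)
        updated[c] = momentum * prototypes[c] + (1.0 - momentum) * batch_mean
    return updated

def nearest_prototype(prototypes, feature):
    """Return the index of the prototype closest to a feature vector."""
    dists = np.linalg.norm(prototypes - feature, axis=1)
    return int(np.argmin(dists))
```

A momentum close to 1 keeps the prototypes stable across noisy batches, while a smaller value lets them follow the feature extractor more quickly as it changes during training.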

    Hierarchical Classification of Pulmonary Lesions: A Large-Scale Radio-Pathomics Study

    Diagnosis of pulmonary lesions from computed tomography (CT) is important but challenging for clinical decision making in lung cancer related diseases. Deep learning has achieved great success in computer-aided diagnosis (CADx) for lung cancer, but it suffers from label ambiguity due to the difficulty of radiological diagnosis. Considering that invasive pathological analysis serves as the clinical gold standard for lung cancer diagnosis, in this study we address the label ambiguity issue via a large-scale radio-pathomics dataset containing 5,134 radiological CT images with pathologically confirmed labels, including cancers (e.g., invasive/non-invasive adenocarcinoma, squamous carcinoma) and non-cancer diseases (e.g., tuberculosis, hamartoma). This retrospective dataset, named Pulmonary-RadPath, enables development and validation of accurate deep learning systems to predict invasive pathological labels with a non-invasive procedure, i.e., radiological CT scans. A three-level hierarchical classification system for pulmonary lesions is developed, which covers most diseases in cancer-related diagnosis. We explore several techniques for hierarchical classification on this dataset, and propose a Leaky Dense Hierarchy approach whose effectiveness is demonstrated in experiments. Our study significantly exceeds prior art in terms of data scale (6x larger), disease comprehensiveness and hierarchies. The promising results suggest the potential to facilitate precision medicine. Comment: MICCAI 2020 (Early Accepted)

    Lung nodules identification in CT scans using multiple instance learning.

    Computer Aided Diagnosis (CAD) systems for lung nodule diagnosis aim to classify nodules as benign or malignant based on images obtained from diverse imaging modalities such as Computed Tomography (CT). Automated CAD systems are important in medical domain applications as they assist radiologists in the time-consuming and labor-intensive diagnosis process. However, most available methods require a large collection of nodules that are segmented and annotated by radiologists. This process is labor-intensive and hard to scale to very large datasets. More recently, some CAD systems that are based on deep learning have emerged. These algorithms do not require the nodules to be segmented, and radiologists need only provide the center of mass of each nodule. The training image patches are then extracted from fixed-size volumes centered at the provided nodule's center. However, since the size of nodules can vary significantly, one fixed-size volume may not represent all nodules effectively. This thesis proposes a Multiple Instance Learning (MIL) approach to address the above limitations. In MIL, each nodule is represented by a nested sequence of volumes centered at the identified center of the nodule. We extract one feature vector from each volume. The set of feature vectors for each nodule is combined and represented as a bag. Next, we investigate and adapt some existing algorithms and develop new ones for this application. We start by applying benchmark MIL algorithms to traditional Gray Level Co-occurrence Matrix (GLCM) engineered features. Then, we design and train simple Convolutional Neural Networks (CNNs) to learn and extract features that characterize lung nodules. These extracted features are then fed to a benchmark MIL algorithm to learn a classification model. Finally, we develop new algorithms (MIL-CNN) that combine feature learning and multiple instance classification in a single network.
These algorithms generalize the CNN architecture to multiple instance data. We design and report the results of three experiments applied to both generative (GLCM) and learned (CNN) features using two datasets (the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) \cite{armato2011lung} and the National Lung Screening Trial (NLST) \cite{national2011reduced}). Two of these experiments perform five-fold cross-validation on a single dataset (NLST or LIDC). The third experiment trains the algorithms on one collection (the NLST dataset) and tests them on the other (the LIDC dataset). We designed our experiments to compare the different features, to compare MIL with Single Instance Learning (SIL), where a single feature vector represents a nodule, and to compare our proposed end-to-end MIL approaches with existing benchmark MIL methods. We demonstrate that our proposed MIL-CNN frameworks are more accurate for the lung nodule diagnosis task. We also show that the MIL representation achieves better results than SIL applied to the ground truth region of each nodule.
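The bag construction described in the abstract above (a nested sequence of volumes around one nodule center, one feature vector per volume) can be sketched as follows. The half-sizes, the function names, and the placeholder features (mean and standard deviation of intensity) are assumptions for illustration only; the thesis uses GLCM and CNN features, which are not reproduced here.

```python
import numpy as np

def nested_cubes(volume, center, half_sizes=(4, 8, 16)):
    """Extract nested cubes centered on a nodule, clipped at the scan border.

    volume: 3-D array indexed (z, y, x); center: (z, y, x) voxel indices.
    The half-sizes are hypothetical values chosen for this sketch.
    """
    cubes = []
    for h in half_sizes:
        slices = tuple(
            slice(max(c - h, 0), min(c + h, dim))
            for c, dim in zip(center, volume.shape)
        )
        cubes.append(volume[slices])
    return cubes

def bag_of_features(cubes):
    """Build a bag with one feature vector (instance) per cube.

    Mean and standard deviation of intensity stand in for the
    GLCM or CNN features described in the abstract.
    """
    return np.array([[cube.mean(), cube.std()] for cube in cubes])
```

Each nodule then yields a bag of shape `(len(half_sizes), n_features)`, which a multiple instance classifier consumes as a whole instead of relying on a single fixed-size patch.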

    Machine learning approaches for lung cancer diagnosis.

    The scale of change and development in medical imaging technology is hard to fathom. Medical imaging is not only the technique and process of constructing visual representations of the inside of the body for medical analysis and revealing the internal structure of organs beneath the skin; it also provides a noninvasive way to diagnose various diseases and suggests efficient ways to treat them. While data surrounding all of our lives are collected and stored for analysis by data scientists, medical images are a rich source of data that physicians and radiologists cannot easily read in full, and they carry valuable information that could be used to discover new knowledge. Therefore, the design of a computer-aided diagnostic (CAD) system that can be approved for use in clinical practice and that aids radiologists in diagnosing and detecting potential abnormalities is of great importance. This dissertation deals with the development of a CAD system for the diagnosis of lung cancer, which is the second most common cancer in men after prostate cancer and in women after breast cancer. Moreover, lung cancer is considered the leading cause of cancer death among both genders in the USA. Recently, the number of lung cancer patients has increased dramatically worldwide, and early detection doubles a patient’s chance of survival. Histological examination through biopsies is considered the gold standard for the final diagnosis of pulmonary nodules. Even though resection of pulmonary nodules is the ideal and most reliable route to diagnosis, many other methods are still used to avoid the risks associated with the surgical procedure. Lung nodules are approximately spherical regions of primarily high-density tissue that are visible in computed tomography (CT) images of the lung.
A pulmonary nodule is the first indication for starting a lung cancer diagnosis. Lung nodules can be benign (normal subjects) or malignant (cancerous subjects). Large malignant nodules (generally defined as greater than 2 cm in diameter) can be easily detected with traditional CT scanning techniques. However, the diagnostic options for small indeterminate nodules are limited due to problems associated with accessing small tumors. Therefore, additional diagnostic and imaging techniques that depend on the nodules’ shape and appearance are needed. The ultimate goal of this dissertation is to develop a fast noninvasive diagnostic system that can improve the accuracy of early lung cancer diagnosis, based on the well-known hypothesis that malignant nodules differ in shape and appearance from benign nodules because of their high growth rate. The proposed methodologies introduce new shape and appearance features that can distinguish between benign and malignant nodules. To achieve this goal, a CAD system is implemented and validated using different datasets. This CAD system integrates two types of features, appearance features and shape features, to give a full description of the pulmonary nodule. For the appearance features, different texture appearance descriptors are developed, namely the 3D histogram of oriented gradient, 3D spherical sector isosurface histogram of oriented gradient, 3D adjusted local binary pattern, 3D resolved ambiguity local binary pattern, multi-view analytical local binary pattern, and Markov Gibbs random field. Each of these descriptors describes the nodule texture and the level of its signal homogeneity, which is a distinguishing feature between benign and malignant nodules.
For the shape features, multi-view peripheral sum curvature scale space, spherical harmonics expansions, and different groups of fundamental geometric features are utilized to describe the nodule shape complexity. Finally, a two-stage fusion of different combinations of these features is introduced. The first stage generates a primary estimate for every descriptor. The second stage consists of a single-layer autoencoder augmented with a softmax classifier that provides the final classification of the nodule. These different combinations of descriptors are combined into different frameworks that are evaluated using different datasets. The first dataset is the Lung Image Database Consortium, which is a publicly available benchmark dataset for lung nodule detection and diagnosis. The second dataset is locally acquired computed tomography imaging data collected from the University of Louisville hospital; the research protocol was approved by the Institutional Review Board at the University of Louisville (IRB number 10.0642). The accuracy of these frameworks was about 94%, which suggests that the proposed frameworks are promising as a valuable tool for the detection of lung cancer.