
    Machine learning approaches for lung cancer diagnosis.

    Medical imaging technology has developed enormously. It is not only the technique and process of constructing visual representations of the body's interior for medical analysis, revealing the internal structure of organs beneath the skin, but it also provides a noninvasive way to diagnose various diseases and suggests efficient ways to treat them. While data from all aspects of our lives are collected and stored, ready for analysis by data scientists, medical images are a particularly rich source: they contain vast amounts of information that physicians and radiologists cannot easily read, yet that can be mined in smart ways to discover new knowledge. Therefore, the design of a computer-aided diagnostic (CAD) system that can be approved for clinical practice, aiding radiologists in diagnosing and detecting potential abnormalities, is of great importance. This dissertation deals with the development of a CAD system for lung cancer diagnosis. Lung cancer is the second most common cancer, after prostate cancer in men and breast cancer in women, and the leading cause of cancer death among both sexes in the USA. The number of lung cancer patients has increased dramatically worldwide, and early detection doubles a patient's chance of survival. Histological examination through biopsy is the gold standard for the final diagnosis of pulmonary nodules. Although resection of pulmonary nodules is the most reliable route to diagnosis, many other methods are used to avoid the risks associated with the surgical procedure. Lung nodules are approximately spherical regions of primarily high-density tissue that are visible in computed tomography (CT) images of the lung.
A pulmonary nodule is the first indication to start diagnosing lung cancer. Lung nodules can be benign (normal subjects) or malignant (cancerous subjects). Large malignant nodules (generally defined as greater than 2 cm in diameter) can be detected easily with traditional CT scanning techniques. However, the diagnostic options for small indeterminate nodules are limited by the difficulty of accessing small tumors, so additional diagnostic and imaging techniques that depend on the nodules' shape and appearance are needed. The ultimate goal of this dissertation is to develop a fast noninvasive diagnostic system that improves the accuracy of early lung cancer diagnosis, based on the well-known hypothesis that malignant nodules differ in shape and appearance from benign nodules because of their high growth rate. The proposed methodologies introduce new shape and appearance features that can distinguish between benign and malignant nodules. To achieve this goal, a CAD system is implemented and validated using different datasets. The CAD system integrates two types of features, appearance features and shape features, to give a full description of the pulmonary nodule. For the appearance features, several texture descriptors are developed: the 3D histogram of oriented gradients, 3D spherical sector isosurface histogram of oriented gradients, 3D adjusted local binary pattern, 3D resolved ambiguity local binary pattern, multi-view analytical local binary pattern, and Markov-Gibbs random field. Each of these descriptors captures the nodule texture and the homogeneity of its signal, which distinguishes benign from malignant nodules.
For the shape features, multi-view peripheral sum curvature scale space, spherical harmonics expansions, and a group of fundamental geometric features are used to describe the complexity of the nodule shape. Finally, a two-stage fusion of different combinations of these features is introduced. The first stage generates a preliminary estimate for every descriptor; the second stage consists of a single-layer autoencoder augmented with a softmax classifier that provides the final classification of the nodule. These descriptor combinations are assembled into different frameworks that are evaluated on two datasets. The first is the Lung Image Database Consortium, a publicly available benchmark dataset for lung nodule detection and diagnosis. The second is locally acquired CT imaging data collected at the University of Louisville hospital, under a research protocol approved by the Institutional Review Board at the University of Louisville (IRB number 10.0642). The accuracy of these frameworks was about 94%, which demonstrates their promise as a valuable tool for the detection of lung cancer.
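The two-stage fusion described above can be sketched as follows. This is a hypothetical illustration, not the dissertation's implementation: the weights are random and untrained, the six descriptor probabilities are dummy values, and the layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()               # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Stage 1: preliminary malignancy estimates, one per descriptor
# (e.g. 3D HOG, 3D LBP variants, MGRF, shape features) -- dummy values.
stage1 = np.array([0.81, 0.74, 0.90, 0.68, 0.77, 0.85])

# Stage 2a: single-layer autoencoder (encoder pass only at inference);
# weights here are random stand-ins for trained parameters.
W_enc = rng.normal(size=(3, 6)) * 0.1   # 6 descriptor inputs -> 3 hidden units
hidden = np.tanh(W_enc @ stage1)

# Stage 2b: softmax classifier on the hidden code gives the final decision.
W_cls = rng.normal(size=(2, 3)) * 0.1   # 2 classes: benign, malignant
probs = softmax(W_cls @ hidden)

label = ["benign", "malignant"][int(np.argmax(probs))]
print(probs, label)
```

In a trained system, the encoder and softmax weights would be learned jointly on the stacked stage-1 estimates, so the fusion can weight each descriptor by how informative it is.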

    Convolutional neural networks for myocardial perfusion SPECT imaging classification: a full and low-dose study

    Thesis to obtain the Master of Science Degree in Biomedical Engineering. Cardiovascular diseases are the leading cause of morbidity and mortality worldwide. Coronary artery disease, one of the most prevalent cardiovascular diseases, is characterized by the deposition of atheromatous plaques in the coronary arteries, consequently reducing blood flow through the arteries, a phenomenon called ischemia. Ischemia can be reversible, when there is a temporary reduction of blood flow caused by physical exercise or pharmacological stress testing, or irreversible, when the narrowing and occlusion of the vessel is not reverted at rest, leading to tissue necrosis. Myocardial perfusion imaging (MPI) by single-photon emission computed tomography (SPECT) is a widely used noninvasive examination designed to detect perfusion changes in the heart muscle. In practice, SPECT uses radioactive tracers that, once administered to the patient, allow their distribution through the body or the organ of interest to be visualized. In MPI, the administered radiopharmaceutical is partially fixed in the myocardium in proportion to the muscle's blood flow, so areas of low uptake may indicate reversible or irreversible ischemia (infarction). If the affected region is less perfused in the stress study than in the rest study, the ischemia is reversible; if the region shows no perfusion reversibility in the rest study, the ischemia is irreversible, i.e., an infarction. In MPI, it is therefore the comparison between the stress and rest studies that indicates the presence or absence of stenosis. Quantification of SPECT-MPI studies typically involves extracting quantitative parameters from the stress and rest studies.
Commercially available software packages such as QPS™/QGS™ and 4D-MSPECT are used for this purpose. However, the acquisition systems suffer from limitations such as spatial blurring, scatter attenuation, and low counts, which can contribute to high noise levels in the images. This can bias the quantification and the resulting classification; in addition, these processes are time-consuming and highly user-dependent, leading to significant intra- and inter-operator variability. Furthermore, SPECT-MPI, as an examination that exposes the patient to ionizing radiation, has been the largest contributor to the collective population dose in Portugal since 2010. To overcome these limitations, new equipment specialized in nuclear cardiac imaging has been developed, such as cadmium-zinc-telluride (CZT) detectors, allowing better image quality and dose reduction. Another approach has been to implement iterative image-reconstruction algorithms, which also improve the signal-to-noise ratio. However, both methods carry high costs and are difficult to implement in many departments. Both the classification-variability issues and the dose-reduction concerns can therefore affect the true assessment of SPECT-MPI studies. Recently, with the rise of big data and new deep learning algorithms such as neural networks, these methods have also begun to be used in medical imaging, and in MPI in particular. Many neural network architectures have been applied to a wide variety of imaging tasks, with convolutional neural networks standing out the most.
One reason for this is the ease of applying transfer learning, i.e., using convolutional neural networks pre-trained on a large database such as ImageNet, freezing the weights of the earlier layers (or part of them) so that what the network originally learned can be reused in another classification problem. In addition, as architectures have become increasingly complex, the interpretability of these models has become more difficult, and several methods have been developed to better understand the decision-making of these algorithms, among which Grad-CAM stands out. Convolutional neural networks have been used in nuclear cardiology for a wide variety of tasks, such as image-quality optimization, data reconstruction, and study classification for diagnostic support. Additionally, several researchers have specifically studied classification on low-dose images after upsampling to conventional dose. However, the literature still lacks studies on the classification of conventional- and reduced-dose images with subsequent comparison between the two. In this project, we aim to develop a classification model for real full-time and low-time myocardial perfusion images, evaluating the use of synthetic images generated by the Poisson resampling method in classification tasks.
    Myocardial perfusion imaging (MPI) by single-photon emission computed tomography (SPECT) plays a crucial role in the diagnosis of coronary artery disease. Quantification of these images typically involves the extraction of quantitative parameters obtained from the rest and stress perfusion studies. However, the acquisition systems have some limitations, such as spatial blurring and low-count data, which may introduce bias into the classification.
Additionally, these processes are time-consuming and user-dependent, leading to significant intra- and inter-operator variability. Furthermore, over the years there has been a constant effort to reduce the dose of MPI. Both the classification-variability issues and the dose-reduction concerns can thus affect the true assessment of SPECT-MPI. In recent years, with the rise of artificial intelligence, several studies have proposed automatic deep learning techniques for the classification of MPI, particularly for low-count data. In this project, we ran five convolutional neural network models with pre-trained weights: one trained on real full-time stress data (100%, denoted 100R), three individual models trained on synthetic 75%, 50%, and 25% count settings, and one trained on all datasets combined (ALL). We then compared their performance when tested on full-time and low-time studies and assessed the use of synthetic subsampled data from the Poisson resampling technique in SPECT-MPI classification tasks. In conclusion, both the 100R and ALL models achieved good and similar results when tested on real full-time data (accuracies of 0.70 and 0.65, respectively) and on real low-time data at 75% (both models achieved an accuracy of 0.71). Below this percentage, the models' accuracy began to drop, possibly due to the limited information these images contain. Thus, subsampled data from a Poisson resampling method may be a viable way to conduct further studies on the classification of low-time SPECT-MPI.
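The Poisson resampling idea used to synthesize low-count data can be sketched as follows. This is a minimal illustration under the common assumption that each measured count n is replaced by a draw from Poisson(f·n), where f is the fraction of counts to keep; the array sizes and count levels are invented, not the thesis's actual acquisition parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

def poisson_resample(counts, fraction):
    """Simulate an acquisition retaining `fraction` of the original counts:
    each pixel is redrawn from Poisson(fraction * counts)."""
    return rng.poisson(fraction * counts.astype(float))

# Dummy 64x64 full-time projection with ~100 counts per pixel.
full_time = rng.poisson(100.0, size=(64, 64))

# Synthetic "low-time" versions at 75% and 25% of the counts.
low_75 = poisson_resample(full_time, 0.75)
low_25 = poisson_resample(full_time, 0.25)

# Mean counts scale with the fraction, while relative Poisson noise
# (std/mean ~ 1/sqrt(counts)) grows as the counts drop.
print(full_time.mean(), low_75.mean(), low_25.mean())
```

This is why classification becomes harder below 75%: the 25% images carry roughly double the relative noise of the full-time images.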

    ssROC: Semi-Supervised ROC Analysis for Reliable and Streamlined Evaluation of Phenotyping Algorithms

    Objective: High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed to estimate PAs. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimation of the receiver operating characteristic (ROC) parameters of PAs (e.g., sensitivity, specificity).
    Materials and Methods: ssROC uses a small labeled dataset to nonparametrically impute missing labels. The imputations are then used for ROC parameter estimation to yield more precise estimates of PA performance relative to classical supervised ROC analysis (supROC) using only labeled data. We evaluated ssROC through in-depth simulation studies and an extensive evaluation of eight PAs from Mass General Brigham.
    Results: In both simulated and real data, ssROC produced ROC parameter estimates with significantly lower variance than supROC for a given amount of labeled data. For the eight PAs, our results illustrate that ssROC achieves similar precision to supROC, but with approximately 60% of the amount of labeled data on average.
    Discussion: ssROC enables precise evaluation of PA performance to increase trust in observational health research without demanding large volumes of labeled data. ssROC is also easily implementable in open-source R software.
    Conclusion: When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research.
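The semi-supervised idea above can be sketched as follows. This is not the authors' ssROC implementation: it only illustrates the general scheme of using a small labeled subset to nonparametrically impute P(Y=1 | score) for all records (here via a Nadaraya-Watson kernel smoother), then estimating sensitivity and specificity from the imputed probabilities. The simulated data, bandwidth, and cutoff are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated cohort: true phenotype Y and a continuous PA score in [0, 1];
# only a small random subset of Y is "labeled" by chart review.
n, n_lab = 5000, 200
y = rng.binomial(1, 0.3, size=n)
score = np.clip(0.5 * y + 0.25 + 0.2 * rng.normal(size=n), 0.0, 1.0)
labeled = np.zeros(n, dtype=bool)
labeled[rng.choice(n, n_lab, replace=False)] = True

def impute(s, s_lab, y_lab, h=0.05):
    """Nadaraya-Watson kernel estimate of P(Y=1 | score=s)."""
    w = np.exp(-0.5 * ((s[:, None] - s_lab[None, :]) / h) ** 2)
    return (w * y_lab).sum(axis=1) / (w.sum(axis=1) + 1e-12)

p_hat = impute(score, score[labeled], y[labeled])

def sens_spec(p, s, cut):
    """ROC parameters at threshold `cut`, weighting by imputed P(Y=1)."""
    pos = s >= cut
    sens = p[pos].sum() / p.sum()                    # P(score >= cut | Y=1)
    spec = (1 - p)[~pos].sum() / (1 - p).sum()       # P(score <  cut | Y=0)
    return sens, spec

sens, spec = sens_spec(p_hat, score, 0.5)
print(round(sens, 2), round(spec, 2))
```

Because every record contributes (via its imputed probability) rather than only the labeled subset, the resulting estimates have lower variance than a supervised analysis on the labels alone, which is the paper's central point.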

    Modular Machine Learning Methods for Computer-Aided Diagnosis of Breast Cancer

    The purpose of this study was to improve breast cancer diagnosis by reducing the number of benign biopsies performed. To this end, we investigated modular and ensemble systems of machine learning methods for computer-aided diagnosis (CAD) of breast cancer. A modular system partitions the input space into smaller domains, each of which is handled by a local model. An ensemble system uses multiple models for the same cases and combines the models' predictions. Five supervised machine learning techniques (LDA, SVM, BP-ANN, CBR, CART) were trained to predict the biopsy outcome from mammographic findings (BIRADS™) and patient age, based on a database of 2258 cases mixed from multiple institutions. The generalization of the models was tested on a second set of 2177 cases. Clusters were identified in the database using a priori knowledge and unsupervised learning methods (agglomerative hierarchical clustering followed by K-Means, SOM, and AutoClass). The performance of the global models over the clusters was examined, and local models were trained for the clusters. While some local models were superior to some global models, we were unable to build a modular CAD system that was better than the global BP-ANN model. The ensemble systems based on simplistic combination schemes did not yield significant improvements, and more complicated combination schemes were found to be unduly optimistic. One of the most striking results of this dissertation was that CAD systems trained on a mixture of lesion types performed much better on masses than on calcifications. Our study of institutional effects suggests that models built on cases mixed between institutions may overcome some of the weaknesses of models built on cases from a single institution. Notably, each of the unsupervised methods identified a cluster of younger women with well-circumscribed or obscured, oval-shaped masses that accounted for the majority of the BP-ANN's recommendations for follow-up.
From the cluster analysis and the CART models, we derived a simple diagnostic rule that performed comparably to the global BP-ANN: approximately 98% sensitivity could be maintained while providing approximately 26% specificity. This compares favorably with the clinical status quo of 100% sensitivity and 0% specificity on this database of indeterminate cases already referred for biopsy.
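How such a rule trades sensitivity for specificity can be sketched as follows. The rule, feature names, and toy cases here are invented for illustration (loosely mirroring the younger-women/oval-mass cluster described above), not the dissertation's actual rule or data.

```python
def sensitivity_specificity(cases, rule):
    """cases: list of (features, malignant) pairs;
    rule: features -> True if biopsy is recommended."""
    tp = sum(1 for f, m in cases if m and rule(f))          # caught cancers
    fn = sum(1 for f, m in cases if m and not rule(f))      # missed cancers
    tn = sum(1 for f, m in cases if not m and not rule(f))  # biopsies avoided
    fp = sum(1 for f, m in cases if not m and rule(f))      # benign biopsies
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical rule: recommend biopsy unless the patient is young
# with an oval, well-circumscribed mass.
rule = lambda f: not (f["age"] < 45 and f["shape"] == "oval")

cases = [
    ({"age": 62, "shape": "irregular"}, True),
    ({"age": 55, "shape": "oval"}, True),
    ({"age": 40, "shape": "oval"}, False),
    ({"age": 38, "shape": "oval"}, False),
    ({"age": 70, "shape": "irregular"}, False),
]

sens, spec = sensitivity_specificity(cases, rule)
print(sens, spec)  # -> 1.0 0.6666666666666666 on this toy data
```

The status quo of biopsying every indeterminate case corresponds to `rule = lambda f: True`, which by construction gives 100% sensitivity and 0% specificity; any specificity gained is benign biopsies avoided.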