
    Lung nodules identification in CT scans using multiple instance learning.

    Computer Aided Diagnosis (CAD) systems for lung nodule diagnosis aim to classify nodules as benign or malignant based on images obtained from diverse imaging modalities such as Computed Tomography (CT). Automated CAD systems are important in medical domain applications as they assist radiologists in the time-consuming and labor-intensive diagnosis process. However, most available methods require a large collection of nodules that are segmented and annotated by radiologists. This process is labor-intensive and hard to scale to very large datasets. More recently, some CAD systems based on deep learning have emerged. These algorithms do not require the nodules to be segmented; radiologists need only provide the center of mass of each nodule. The training image patches are then extracted from fixed-size volumes centered at the provided nodule's center. However, since the size of nodules can vary significantly, one fixed-size volume may not represent all nodules effectively. This thesis proposes a Multiple Instance Learning (MIL) approach to address the above limitations. In MIL, each nodule is represented by a nested sequence of volumes centered at the identified center of the nodule. We extract one feature vector from each volume, and the set of features for each nodule is combined into a bag. Next, we investigate and adapt some existing algorithms and develop new ones for this application. We start by applying benchmark MIL algorithms to traditional Gray Level Co-occurrence Matrix (GLCM) engineered features. Then, we design and train simple Convolutional Neural Networks (CNNs) to learn and extract features that characterize lung nodules. These extracted features are then fed to a benchmark MIL algorithm to learn a classification model. Finally, we develop new algorithms (MIL-CNN) that combine feature learning and multiple instance classification in a single network; these algorithms generalize the CNN architecture to multiple instance data. We design and report the results of three experiments applied to both engineered (GLCM) and learned (CNN) features using two datasets: the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) \cite{armato2011lung} and the National Lung Screening Trial (NLST) \cite{national2011reduced}. Two of these experiments perform five-fold cross-validation on the same dataset (NLST or LIDC); the third trains the algorithms on one collection (the NLST dataset) and tests them on the other (the LIDC dataset). We designed our experiments to compare the different features, to compare MIL against Single Instance Learning (SIL), where a single feature vector represents a nodule, and to compare our proposed end-to-end MIL approaches to existing benchmark MIL methods. We demonstrate that our proposed MIL-CNN frameworks are more accurate for the lung nodule diagnosis task. We also show that the MIL representation achieves better results than SIL applied on the ground-truth region of each nodule.
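    As a rough illustration of the bag representation described above, the sketch below builds a bag of nested volumes around a nodule centre and applies the standard MIL max-pooling assumption. The sizes, helper names and boundary handling are illustrative assumptions, not the thesis' actual implementation.

    ```python
    # Minimal sketch: each nodule becomes a "bag" of feature vectors,
    # one per nested volume centered on the nodule.
    import numpy as np

    def nested_volumes(volume, center, sizes=(16, 24, 32, 40)):
        """Extract cubic sub-volumes of increasing size around one center.
        Boundary handling is omitted for brevity."""
        cz, cy, cx = center
        bag = []
        for s in sizes:
            h = s // 2
            bag.append(volume[cz - h:cz + h, cy - h:cy + h, cx - h:cx + h])
        return bag

    def bag_features(bag, extract):
        """Apply any feature extractor (GLCM statistics or a CNN) per instance."""
        return np.stack([extract(p) for p in bag])

    def mil_predict(instance_scores):
        """Standard MIL assumption: a bag is positive (malignant) if at least
        one instance is positive, i.e. max-pooling over instance scores."""
        return instance_scores.max()
    ```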

    Bayesian image restoration and bacteria detection in optical endomicroscopy

    Optical microscopy systems can be used to obtain high-resolution microscopic images of tissue cultures and ex vivo tissue samples. This imaging technique can be translated to in vivo, in situ applications by using optical fibres and miniature optics. Fibred optical endomicroscopy (OEM) can enable optical biopsy in organs inaccessible to any other imaging system, and hence can provide rapid and accurate diagnosis. The raw data the system produces are difficult to interpret, as they are modulated by the fibre bundle pattern, producing what is called the “honeycomb effect”. Moreover, the data are further degraded by the fibre core cross-coupling problem. In addition, there is an unmet clinical need for automatic tools that can help clinicians detect fluorescently labelled bacteria in distal lung images. The aim of this thesis is to develop advanced image processing algorithms that can address the above-mentioned problems. First, we provide a statistical model for the fibre core cross-coupling problem and the sparse sampling by imaging fibre bundles (the honeycomb artefact), which are formulated here as a restoration problem for the first time in the literature. We then introduce a non-linear interpolation method, based on Gaussian process regression, to recover an interpretable scene from the deconvolved data. Second, we develop two bacteria detection algorithms, each with different characteristics. The first approach considers a joint formulation of the sparse coding and anomaly detection problems; the anomalies here are considered candidate bacteria, which are annotated with the help of a trained clinician. Although this approach provides good detection performance and outperforms existing methods in the literature, the user has to carefully tune some crucial model parameters. Hence, we propose a more adaptive approach, for which a Bayesian framework is adopted. This approach not only outperforms the proposed supervised approach and existing methods in the literature but also provides computation times that compete with optimization-based methods.
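    For the interpolation step, a minimal sketch of Gaussian process regression from sparse fibre-core samples to a dense pixel grid is given below, assuming an RBF-plus-noise kernel and synthetic data; the thesis' actual kernel, data and pre-processing may differ.

    ```python
    # Sketch: fibre-core locations carry the measured intensities, and GP
    # regression predicts intensity on the full grid, removing the honeycomb.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    cores = rng.uniform(0, 64, size=(500, 2))        # sparse fibre-core positions
    intensities = np.sin(cores[:, 0] / 8) + 0.05 * rng.standard_normal(500)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=3.0) + WhiteKernel(0.01),
                                  normalize_y=True)
    gp.fit(cores, intensities)

    yy, xx = np.mgrid[0:64, 0:64]                    # dense reconstruction grid
    grid = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    restored = gp.predict(grid).reshape(64, 64)      # interpolated image
    ```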

    The role of deep learning in structural and functional lung imaging

    Background: Structural and functional lung imaging are critical components of pulmonary patient care. Image analysis methods, such as image segmentation, applied to structural and functional lung images, have significant benefits for patients with lung pathologies, including the computation of clinical biomarkers. Traditionally, machine learning (ML) approaches, such as clustering, and computational modelling techniques, such as CT-ventilation imaging, have been used for segmentation and synthesis, respectively. Deep learning (DL) has shown promise in medical image analysis tasks, often outperforming alternative methods. Purpose: To address the hypothesis that DL can outperform conventional ML and classical image analysis methods for the segmentation and synthesis of structural and functional lung imaging via: i. development and comparison of 3D convolutional neural networks (CNNs) for the segmentation of ventilated lung using hyperpolarised (HP) gas MRI. ii. development of a generalisable, multi-centre CNN for segmentation of the lung cavity using 1H-MRI. iii. the proposal of a framework for estimating the lung cavity in the spatial domain of HP gas MRI. iv. development of a workflow to synthesise HP gas MRI from multi-inflation, non-contrast CT. v. the proposal of a framework for the synthesis of fully-volumetric HP gas MRI ventilation from a large, diverse dataset of non-contrast, multi-inflation 1H-MRI scans. Methods: i. A 3D CNN-based method for the segmentation of ventilated lung using HP gas MRI was developed, and CNN parameters such as architecture, loss function and pre-processing were optimised. ii. A 3D CNN trained on a multi-acquisition dataset and validated on data from external centres was compared with a 2D alternative for the segmentation of the lung cavity using 1H-MRI. iii. A dual-channel, multi-modal segmentation framework was compared to single-channel approaches for estimation of the lung cavity in the domain of HP gas MRI. iv. A hybrid data-driven and model-based approach for the synthesis of HP gas MRI ventilation from CT was compared to approaches utilising DL or computational modelling alone. v. A physics-constrained, multi-channel framework for the synthesis of fully-volumetric ventilation surrogates from 1H-MRI was validated using five-fold cross-validation and an external test dataset. Results: i. The 3D CNN, developed via parameterisation experiments, accurately segmented ventilation scans and outperformed conventional ML methods. ii. The 3D CNN produced more accurate segmentations than its 2D analogues for the segmentation of the lung cavity, exhibiting minimal variation in performance between centres, vendors and acquisitions. iii. Dual-channel, multi-modal approaches generated significant improvements compared to methods using a single imaging modality for the estimation of the lung cavity. iv. The hybrid approach produced synthetic ventilation scans which correlate with HP gas MRI. v. The physics-constrained, 3D multi-channel synthesis framework outperformed approaches which did not integrate computational modelling, demonstrating generalisability to external data. Conclusion: DL approaches demonstrate the ability to segment and synthesise lung MRI across a range of modalities and pulmonary pathologies. These methods outperform computational modelling and classical ML approaches, reducing the time required to adequately edit segmentations and improving the modelling of synthetic ventilation, which may facilitate the clinical translation of DL in structural and functional lung imaging.
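    The methods section mentions optimising the CNN's loss function for ventilated-lung segmentation. A soft Dice loss is a common choice in this area, and a generic numpy sketch follows, offered as an assumption rather than the thesis' exact objective.

    ```python
    # Generic soft Dice loss for a predicted probability map vs. a binary mask.
    import numpy as np

    def soft_dice_loss(pred, target, eps=1e-6):
        """1 - Dice overlap; works on volumes of any shape (both are flattened)."""
        pred, target = pred.ravel(), target.ravel()
        intersection = (pred * target).sum()
        dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
        return 1.0 - dice
    ```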

    Texture Analysis and Machine Learning to Predict Pulmonary Ventilation from Thoracic Computed Tomography

    Chronic obstructive pulmonary disease (COPD) leads to persistent airflow limitation, placing a large burden on patients and the health care system. Thoracic CT provides an opportunity to observe the structural pathophysiology of COPD, whereas hyperpolarized gas MRI provides images of the consequent ventilation heterogeneity. However, hyperpolarized gas MRI is currently limited to research centres due to the high cost of gas and polarization equipment. Therefore, I developed a pipeline using texture analysis and machine learning methods to create predicted ventilation maps based on non-contrast-enhanced, single-volume thoracic CT. In a COPD cohort, predicted ventilation maps were qualitatively and quantitatively related to ground-truth MRI ventilation, and both maps were related to important patient lung function and quality-of-life measures. This study is the first to demonstrate the feasibility of predicting hyperpolarized-MRI-based ventilation from single-volume, breath-hold thoracic CT, which has the potential to translate pulmonary ventilation information to widely available thoracic CT imaging.
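    A minimal sketch of the texture-analysis idea follows, assuming GLCM statistics computed per CT patch and a random-forest regressor as the learner; the patch size, GLCM settings and model choice are illustrative, not the study's exact pipeline.

    ```python
    # Sketch: GLCM texture features per CT patch -> regressor predicting a
    # ventilation value for the corresponding region.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.ensemble import RandomForestRegressor

    def glcm_features(patch_u8):
        """Texture statistics from a grey-level co-occurrence matrix."""
        glcm = graycomatrix(patch_u8, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        return np.array([graycoprops(glcm, p).mean()
                         for p in ("contrast", "homogeneity", "energy", "correlation")])

    # Toy data: X is one GLCM feature row per patch; y is an MRI-derived
    # ventilation target per patch (placeholders here).
    rng = np.random.default_rng(0)
    patches = rng.integers(0, 256, size=(20, 32, 32), dtype=np.uint8)
    X = np.stack([glcm_features(p) for p in patches])
    y = rng.random(20)
    model = RandomForestRegressor(n_estimators=50).fit(X, y)
    ```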

    Generating semantically enriched diagnostics for radiological images using machine learning

    Development of Computer Aided Diagnostic (CAD) tools to aid radiologists in pathology detection and decision making relies considerably on manually annotated images. With the advancement of deep learning techniques for CAD development, these expert annotations no longer need to be hand-crafted; however, deep learning algorithms require large amounts of data in order to generalise well. One way to access large volumes of expert-annotated data is through radiological exams consisting of images and reports. Using past radiological exams obtained from hospital archiving systems has many advantages: they are expert annotations available in large quantities, covering a population-representative variety of pathologies, and they provide additional context to pathology diagnoses, such as anatomical location and severity. Learning to auto-generate such reports from images presents many challenges, such as the difficulty of representing and generating long, unstructured textual information, accounting for spelling errors and repetition or redundancy, and the inconsistency across different annotators. In this thesis, the problem of learning to automate disease detection from radiological exams is approached from three directions. Firstly, a report generation model is developed such that it is conditioned on radiological image features. Secondly, a number of approaches aimed at extracting diagnostic information from free-text reports are explored. Finally, an alternative to current state-of-the-art approaches for image latent-space learning is developed that can be applied to accelerated image acquisition.
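    For the first direction, a minimal sketch of conditioning a text decoder on image features is given below; the GRU-based architecture and all dimensions are generic assumptions for illustration, not the thesis' model.

    ```python
    # Sketch: a report decoder whose initial hidden state is projected from an
    # image-feature vector, so generated text is conditioned on the image.
    import torch
    import torch.nn as nn

    class ReportDecoder(nn.Module):
        def __init__(self, vocab_size, feat_dim=512, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.init_h = nn.Linear(feat_dim, hidden)   # image features -> h0
            self.gru = nn.GRU(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab_size)

        def forward(self, image_feats, tokens):
            h0 = torch.tanh(self.init_h(image_feats)).unsqueeze(0)
            seq, _ = self.gru(self.embed(tokens), h0)
            return self.out(seq)                        # next-token logits

    decoder = ReportDecoder(vocab_size=10_000)
    logits = decoder(torch.randn(2, 512), torch.randint(0, 10_000, (2, 40)))
    ```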

    Advanced Computational Methods for Oncological Image Analysis

    Cancer is the second most common cause of death worldwide and encompasses highly variable clinical and biological scenarios. Some of the current clinical challenges are (i) early diagnosis of the disease and (ii) precision medicine, which allows for treatments targeted to specific clinical cases. The ultimate goal is to optimize the clinical workflow by combining accurate diagnosis with the most suitable therapies. Toward this, large-scale machine learning research can define associations among clinical, imaging, and multi-omics studies, making it possible to provide reliable diagnostic and prognostic biomarkers for precision oncology. Such reliable computer-assisted methods (i.e., artificial intelligence) together with clinicians’ unique knowledge can be used to properly handle typical issues in evaluation/quantification procedures (i.e., operator dependence and time-consuming tasks). These technical advances can significantly improve result repeatability in disease diagnosis and guide toward appropriate cancer care. Indeed, the need to apply machine learning and computational intelligence techniques has steadily increased to effectively perform image processing operations—such as segmentation, co-registration, classification, and dimensionality reduction—and multi-omics data integration.

    Automatic Segmentation of Intramedullary Multiple Sclerosis Lesions

    Context: The spinal cord is a key component of the central nervous system. It contains neurons responsible for complex functions and ensures the conduction of motor and sensory information between the brain and the peripheral nervous system. Damage to the spinal cord, through trauma or neurodegenerative disease, can lead to severe impairment, including functional disability, paralysis and/or pain. In multiple sclerosis (MS) patients, the spinal cord is frequently affected by atrophy and/or lesions. Conventional magnetic resonance imaging (MRI) is widely used by researchers and clinicians to non-invasively assess and characterize spinal cord microstructural changes. Quantitative assessment of the structural damage to the spinal cord (e.g. atrophy severity, lesion extent) is essential for the diagnosis, prognosis and longitudinal monitoring of diseases such as MS. Furthermore, the development of objective biomarkers is essential to evaluate the effect of new therapeutic treatments. Segmentation of the spinal cord and of intramedullary MS lesions is consequently clinically relevant, as well as a necessary step towards the interpretation of multi-parametric MR images. However, manual segmentation is highly time-consuming, tedious and prone to intra- and inter-rater variability, so there is a need for automated segmentation methods to improve the efficiency of analysis pipelines. Automatic lesion segmentation is challenging for various reasons: (i) lesions vary in shape, size and location, (ii) lesion boundaries are most often poorly defined, (iii) lesion intensities on MR data are easily confounded with those of normal-appearing structures. Moreover, achieving robust segmentation across multi-center MRI data is challenging because of the broad variability of data features (e.g. resolution, orientation, field of view). Despite recent substantial developments in spinal cord MRI processing, there is still no method available that can yield robust and reliable spinal cord segmentation across the very diverse spinal pathologies and data features; regarding intramedullary lesions, a thorough search of the relevant literature did not yield any available method for automatic segmentation. Goal: To develop a fully automatic framework for segmenting the spinal cord and intramedullary MS lesions from conventional human MRI data. Method: The presented approach is based on a cascade of two Convolutional Neural Networks (CNNs) and has been designed to face the main challenges of ‘real world’ spinal cord MRI data. It was trained and validated on a private dataset made up of 1943 MR volumes, acquired at 30 different sites with heterogeneous acquisition protocols. The scanned subjects comprise 459 healthy controls, 471 MS patients and 112 patients with other spinal pathologies. The proposed spinal cord segmentation method was compared to a state-of-the-art method, PropSeg. Results: The CNN-based approach achieved better results than PropSeg, yielding a median (interquartile range) Dice of 94.6 (4.6) vs. 87.9 (18.3) % when compared to the manual segmentation. For the lesion segmentation task, our method provided a median Dice overlap with the manual segmentation of 60.0 (21.4) %, a lesion-based true positive rate of 83 (34) % and a lesion-based precision of 77 (44) %. Conclusion: An original fully automatic method for segmenting the spinal cord and intramedullary MS lesions on MRI data was devised during this Master’s project and validated extensively against a clinical dataset. The robustness of the spinal cord segmentation has been demonstrated, even on challenging pathological cases. Regarding lesion segmentation, the results are encouraging despite the fairly high false positive rate. I believe in the potential value of these tools for the research community; in this vein, the methods are integrated and documented in an open-source software package, the Spinal Cord Toolbox. Some of the tools developed during this Master’s project are already integrated into automated analysis pipelines of clinical studies, including MS and Amyotrophic Lateral Sclerosis patients.
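    A short sketch of the lesion-wise evaluation quoted in the results, assuming connected-component labelling and an any-voxel-overlap matching criterion; the thesis' exact matching rule may differ.

    ```python
    # Lesion-wise true positive rate and precision from binary 3D masks.
    import numpy as np
    from scipy import ndimage

    def lesion_wise(pred, truth):
        """TPR: fraction of true lesions touched by any prediction.
        Precision: fraction of predicted lesions touching any true lesion."""
        t_lab, t_n = ndimage.label(truth)
        p_lab, p_n = ndimage.label(pred)
        tp_truth = sum(pred[t_lab == i].any() for i in range(1, t_n + 1))
        tp_pred = sum(truth[p_lab == i].any() for i in range(1, p_n + 1))
        return tp_truth / max(t_n, 1), tp_pred / max(p_n, 1)
    ```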

    Computer Aided Dysplasia Grading for Barrett’s Oesophagus Virtual Slides

    Dysplasia grading in Barrett’s Oesophagus has been an issue among pathologists worldwide. Despite the increasing number of sufferers every year, especially in Western countries, dysplasia in Barrett’s Oesophagus can only be graded by a trained pathologist through visual examination. We therefore present our work on extracting textural and spatial features from tissue regions. Our first approach extracts only the epithelial layer of the tissue, following the grading rules used by pathologists; this is carried out by extracting sub-images of a fixed window size along the tissue’s epithelial layer. The textural features of these sub-images were used to grade regions as dysplasia or not-dysplasia, achieving 82.5% AP with 0.82 precision and 0.86 recall. We thus managed to overcome the ‘boundary-effect’ issues that have usually been avoided by selecting or cropping tissue images without the boundary. Secondly, the textural and spatial features of the whole tissue region were investigated. Experiments were carried out using Grey Level Co-occurrence Matrices at the pixel level, in a brute-force experiment, to cluster patches based on their texture similarities. We then developed a texture-mapping technique that translates the spatial arrangement of tissue texture within a tissue region at the patch level. As a result, three binary decision tree models were developed from the texture-mapping image to grade each annotated region as dysplasia Grade 1, Grade 3 or Grade 5, with accuracies of 87.5%, 75.0% and 81.3% and kappa scores of 0.75, 0.5 and 0.63, respectively. A binary decision tree was then used on the spatial arrangement of the tissue texture types with respect to the epithelial layer to help grade the regions; accuracies of 75.0%, 68.8% and 68.8%, with kappa values of 0.5, 0.37 and 0.37, were achieved for dysplasia Grades 1, 3 and 5, respectively. Based on these results, we conclude that the spatial information of tissue texture types with respect to the epithelial layer is not as strong a predictor as the texture of the whole region. The binary decision tree grading models were then applied to the broader tissue area: the whole virtual pathology slide itself. The consensus grade for each tissue is calculated with a positivity table and a scoring method. Finally, we present our own thresholded frequency method to grade virtual slides based on the frequency of grading occurrences, and the results were compared to the pathologist’s grading. A high agreement score (kappa value of 0.80) was achieved, a substantial improvement over simple frequency scoring, which reached only 0.47.
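    A rough sketch of the pipeline described above, assuming GLCM features per patch, k-means clustering into texture types, a per-region histogram of texture types as the texture-map summary, and a binary decision tree scored with Cohen’s kappa; all data and settings here are toy placeholders, not the thesis’ configuration.

    ```python
    # Sketch: cluster patch textures, summarise regions, grade with a tree.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import cohen_kappa_score

    rng = np.random.default_rng(0)
    patch_feats = rng.random((600, 4))                 # GLCM features per patch
    region_of_patch = rng.integers(0, 40, 600)         # owning region per patch

    texture_type = KMeans(n_clusters=6, n_init=10).fit_predict(patch_feats)

    # Texture-map summary: per-region histogram of texture types.
    X = np.zeros((40, 6))
    for r, t in zip(region_of_patch, texture_type):
        X[r, t] += 1

    y = rng.integers(0, 2, 40)                         # toy dysplasia labels
    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(cohen_kappa_score(y, tree.predict(X)))       # agreement score
    ```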