A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames is a fundamental medical imaging problem.
Existing state-of-the-art methods deal only with the detection and restoration
of selected artifacts. However, endoscopy videos typically contain numerous
artifacts, which motivates the development of a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts our framework
exploits a fast multi-scale, single-stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
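The per-frame triage this framework implies can be sketched as follows. This is an illustrative sketch only: the artifact class names, severity weights, and thresholds are assumptions for demonstration, not the paper's actual parameters.

```python
# Illustrative sketch of per-frame quality scoring and triage.
# Weights and thresholds are hypothetical, not the paper's values.

# Hypothetical severity weights for the primary artifact classes.
ARTIFACT_WEIGHTS = {
    "motion_blur": 0.30, "specularity": 0.20, "saturation": 0.20,
    "bubbles": 0.10, "contrast": 0.10, "misc_artifact": 0.10,
}

def quality_score(detections):
    """detections: list of (artifact_class, confidence, area_fraction).
    Returns a score in [0, 1]; 1.0 means no detected artifacts."""
    penalty = 0.0
    for cls, conf, area in detections:
        penalty += ARTIFACT_WEIGHTS.get(cls, 0.1) * conf * area
    return max(0.0, 1.0 - penalty)

def triage(detections, keep_thresh=0.8, restore_thresh=0.5):
    """Keep clean frames, restore mildly corrupted ones, drop the rest."""
    q = quality_score(detections)
    if q >= keep_thresh:
        return "keep"
    if q >= restore_thresh:
        return "restore"
    return "discard"
```

A frame with no detections scores 1.0 and is kept; a frame dominated by a single mild artifact lands in the "restore" band, which is where the GAN-based restoration models would be applied.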
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos we show that our approach preserves an
average of 68.7% of frames, 25% more than are retained from the raw videos.
Novel Deep Learning Models for Medical Imaging Analysis
Deep learning is a sub-field of machine learning in which models are developed to imitate the workings of the human brain in processing data and creating patterns for decision making. This dissertation is focused on developing deep learning models for medical imaging analysis across different modalities and tasks, including detection, segmentation and classification. Imaging modalities including digital mammography (DM), magnetic resonance imaging (MRI), positron emission tomography (PET) and computed tomography (CT) are studied for various medical applications. The first phase of the research develops a novel shallow-deep convolutional neural network (SD-CNN) model for improved breast cancer diagnosis. This model takes one type of medical image as input and synthesizes different modalities as additional feature sources; both the original and the synthetic images are used for feature generation. The proposed architecture is validated in the application of breast cancer diagnosis and shown to outperform competing models. Motivated by the success of the first phase, the second phase focuses on improving medical image synthesis with an advanced deep learning architecture. A new architecture named the deep residual inception encoder-decoder network (RIED-Net) is proposed. RIED-Net has the advantages of preserving pixel-level information and transferring features across modalities. The applicability of RIED-Net is validated in breast cancer diagnosis and Alzheimer's disease (AD) staging. Recognizing that medical imaging research often involves multiple inter-related tasks, namely detection, segmentation and classification, the third phase develops a multi-task deep learning model. Specifically, a feature transfer enabled multi-task deep learning model (FT-MTL-Net) is proposed to transfer high-resolution features from the segmentation task to the low-resolution feature-based classification task.
The application of FT-MTL-Net to breast cancer detection, segmentation and classification using DM images is studied. As a continuing effort to explore transfer learning in deep models for medical applications, the last phase develops a deep learning model that transfers both features and knowledge from a pre-training age-prediction task to the new domain of predicting conversion from mild cognitive impairment (MCI) to AD. It is validated in the application of predicting MCI patients' conversion to AD with 3D MRI images.
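As a rough illustration of the feature-sharing idea behind multi-task models of this kind (the shapes and stand-in "layers" below are assumptions, not the dissertation's FT-MTL-Net architecture), a single shared feature map can feed both a per-pixel segmentation head and a globally pooled classification head:

```python
import numpy as np

# Minimal sketch of multi-task feature sharing: a shared feature map
# feeds a per-pixel segmentation head and a pooled classification head.
# Shapes and "layers" are illustrative assumptions, not FT-MTL-Net.

rng = np.random.default_rng(0)

def shared_encoder(image):
    """Stand-in for a CNN backbone: returns a C x H x W feature map."""
    channels = 8
    feats = np.stack([image * w
                      for w in rng.uniform(0.5, 1.5, size=channels)])
    return np.maximum(feats, 0.0)  # ReLU

def segmentation_head(feats):
    """High-resolution task: per-pixel foreground probability."""
    logits = feats.mean(axis=0)           # collapse channels, keep H x W
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid

def classification_head(feats):
    """Low-resolution task: reuse the same features, globally pooled."""
    pooled = feats.mean(axis=(1, 2))      # C-dim descriptor
    return 1.0 / (1.0 + np.exp(-pooled.sum()))

image = rng.uniform(-1, 1, size=(32, 32))
feats = shared_encoder(image)
seg = segmentation_head(feats)   # 32 x 32 probability map
prob = classification_head(feats)  # single class probability
```

The point of the sketch is only the wiring: both heads consume the same encoder output, so gradients from the high-resolution segmentation task would also shape the features used for classification.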
Computer-aided diagnosis system for bone fracture detection using machine learning algorithms
Diagnostic imaging technology has revolutionized the healthcare industry by allowing more accurate and earlier diagnosis of diseases. This technology reduces the need for invasive procedures such as surgery and enhances the quality of patient care. Several machine learning algorithms, such as SVM, k-means clustering and U-Net, have been demonstrated to be capable of solving classification, detection, and segmentation problems in medical imaging, and have also been used for super-resolution. The purpose of this thesis is to examine machine learning and image processing methods for four key challenges in medical image analysis.
The first is the segmentation of medical images. The second involves implementing super-resolution techniques for medical images. The third is the use of image processing methods to diagnose abnormalities. The fourth is to enrich image information by mapping medical images between different modalities using deep neural models. All contributions of this research aim at developing an end-to-end model that can detect fractures automatically or be used as a clinical assistant to reduce errors. As the first contribution, the thesis presents a novel multi-stage approach for bone segmentation in X-ray images using faster region-based convolutional neural network (R-CNN) and distance regularized level set evolution (DRLSE) algorithms. A hybrid model utilizing deep neural network (DNN) and image processing techniques is proposed to segment the bones in two stages. Our model is more robust to changes in X-ray images and is also applicable to misplaced bones. Additionally, we have used transfer learning to reduce the amount of time and effort required to collect and label the data. As the second contribution, DNN models are used to enhance the resolution of medical images. CNNs and generative adversarial networks have been used as super-resolution techniques to achieve high-resolution medical images. The analysis includes subjective and objective evaluations of different models on regions with and without fractures, compared against our model. The third contribution involves applying different image analysis methods to X-ray images in order to detect fractures with a minimum of human intervention. Using entropy and intensity, we have also attempted to identify regions of interest that have a higher probability of containing fractures. We also evaluate the effect of the super-resolution technique on saliency maps with and without fractures.
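The entropy-driven region-of-interest idea mentioned above might look something like the following sketch (patch size, histogram bin count, and the threshold are illustrative assumptions, not the thesis's parameters): score each patch by the Shannon entropy of its intensity histogram and flag high-entropy patches as candidates.

```python
import numpy as np

# Sketch of entropy-based region-of-interest scoring: high local
# intensity entropy serves as a cheap proxy for structurally busy
# regions. Patch size, bins, and threshold are illustrative.

def patch_entropy(patch, bins=16):
    """Shannon entropy (in bits) of a patch's intensity histogram.
    Intensities are assumed normalized to [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def candidate_rois(image, patch=8, thresh=2.0):
    """Return (row, col) top-left corners of high-entropy patches."""
    h, w = image.shape
    rois = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            if patch_entropy(image[r:r + patch, c:c + patch]) > thresh:
                rois.append((r, c))
    return rois
```

A perfectly uniform region has zero entropy and is skipped, while a textured region (edges, trabecular bone, a fracture line) concentrates histogram mass across many bins and is flagged for closer inspection.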
Lastly, we present image-to-image mapping using variational autoencoders and generative adversarial networks to reduce the cost of diagnosis and medical image retrieval. In this part we have attempted to map X-ray images to MRIs in order to fuse the rich diagnostic information in MRIs into the matched X-ray images.
Deep learning in diabetic foot ulcers detection: A comprehensive evaluation
There has been a substantial amount of research involving computer methods and technology for the detection and recognition of diabetic foot ulcers (DFUs), but there is a lack of systematic comparisons of state-of-the-art deep learning object detection frameworks applied to this problem. DFUC2020 provided participants with a comprehensive dataset consisting of 2,000 images for training and 2,000 images for testing. This paper summarizes the results of DFUC2020 by comparing the deep learning-based algorithms proposed by the winning teams: Faster R-CNN, three variants of Faster R-CNN and an ensemble method; YOLOv3; YOLOv5; EfficientDet; and a new Cascade Attention Network. For each deep learning method, we provide a detailed description of model architecture, parameter settings for training and additional stages including pre-processing, data augmentation and post-processing. We provide a comprehensive evaluation for each method. All the methods required a data augmentation stage to increase the number of images available for training and a post-processing stage to remove false positives. The best performance was obtained from Deformable Convolution, a variant of Faster R-CNN, with a mean average precision (mAP) of 0.6940 and an F1-Score of 0.7434. Finally, we demonstrate that an ensemble based on different deep learning methods can enhance the F1-Score but not the mAP.
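For reference, the two headline metrics can be computed from matched detections roughly as follows. This is a simplified single-IoU-threshold sketch with greedy matching, not the official DFUC2020 evaluation code:

```python
# Simplified sketch of detection metrics at a single IoU threshold;
# the official DFUC2020 evaluation protocol is more involved.

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter)

def precision_recall_f1(preds, gts, thresh=0.5):
    """Greedy one-to-one matching of predictions to ground truths;
    a prediction is a true positive if it matches an unmatched
    ground truth with IoU >= thresh."""
    matched = set()
    tp = 0
    for p in preds:
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= thresh:
                matched.add(i)
                tp += 1
                break
    prec = tp / len(preds) if preds else 0.0
    rec = tp / len(gts) if gts else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

The sketch makes the paper's closing observation concrete: an ensemble that adds boxes can raise recall (and hence F1) while the extra false positives at low confidence pull the precision-recall curve, and so the mAP, in the other direction.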
Toward robust deep neural networks
In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of anomalous examples such as adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty through encouraging diversity in an ensemble of CNNs.
To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism to predict adversarial examples with low confidence while keeping the predictive confidence of clean samples high. In the presence of disagreement (high entropy) in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold on the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can the many available OOD sets be differentiated w.r.t. a given in-distribution task, so as to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the "protection" level of in-distribution sub-manifolds by each OOD set is a good property for differentiating OOD sets. To measure the protection level, we design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented-CNN (A-CNN) and an explicitly-calibrated CNN to detect a significantly larger portion of unseen OOD samples when they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set can also detect black-box Fast Gradient Sign (FGS) adversarial examples.
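The voting idea can be caricatured as follows. This is a toy sketch with made-up specialist outputs and a fixed 0.5 threshold chosen for illustration; the thesis's actual mechanism and confidence bound are more precise. Averaging the specialists' class probabilities means disagreement caps the ensemble confidence, so a fixed threshold separates flagged inputs from clean ones.

```python
# Toy sketch of confidence-based adversarial flagging in an ensemble:
# specialists that agree yield a high averaged confidence, while
# disagreement caps it, so a fixed threshold can flag suspect inputs.
# The probabilities and the 0.5 threshold are illustrative.

def ensemble_confidence(specialist_probs):
    """Average per-class probabilities over specialists; return the
    winning class index and its averaged confidence."""
    n = len(specialist_probs)
    k = len(specialist_probs[0])
    avg = [sum(p[c] for p in specialist_probs) / n for c in range(k)]
    best = max(range(k), key=lambda c: avg[c])
    return best, avg[best]

def flag_adversarial(specialist_probs, tau=0.5):
    """Reject (flag) the input when ensemble confidence <= tau."""
    _, conf = ensemble_confidence(specialist_probs)
    return conf <= tau

# Agreement: all three specialists vote class 0 with high confidence.
clean = [[0.9, 0.1], [0.85, 0.15], [0.95, 0.05]]
# Disagreement: the specialists split, so no class exceeds ~0.5.
attacked = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
```

The design point is that the threshold never needs per-dataset tuning: as long as disagreement provably caps the averaged confidence, the same global cutoff works everywhere.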
As the third contribution, we investigate more closely the capacity of the A-CNN to detect wider types of black-box adversaries (not only FGS ones). To increase the capability of the A-CNN to detect a larger number of adversaries, we augment its OOD training set with inter-class interpolated samples. We then demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples, whereas training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate across all types of adversaries, particularly unseen ones. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversarial and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples. The last contribution is to show a use-case of the A-CNN for training a robust object detector on a partially-labeled dataset, in particular a merged dataset. Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to craft a large-scale dataset that covers a larger spectrum of OoIs. Moreover, merging datasets yields a single unified object detector instead of several separate ones, reducing computational and time costs. However, merging datasets, especially from similar contexts, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially-labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate.
The empirical results confirm the viability of our generated pseudo-labels in enhancing the performance of YOLO, a state-of-the-art object detector.
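The missing-label problem and the pseudo-labeling remedy can be sketched as follows. The dataset names, class lists, and confidence threshold below are illustrative assumptions, not the thesis's setup: each source dataset annotates only its own classes, so confident detections of a class the source never labeled are promoted to pseudo-labels.

```python
# Sketch of pseudo-labeling for a merged, partially-labeled detection
# dataset: each source dataset annotated only its own classes, so
# confident detections of a *foreign* class become pseudo-labels.
# Class lists and the confidence threshold are illustrative.

# Which classes each hypothetical source dataset actually annotated.
ANNOTATED = {"dataset_a": {"car"}, "dataset_b": {"person"}}

def pseudo_labels(detections, source, conf_thresh=0.8):
    """detections: list of (class_name, confidence, box).
    Keep confident detections of classes the source never labeled."""
    missing = set().union(*ANNOTATED.values()) - ANNOTATED[source]
    return [d for d in detections
            if d[0] in missing and d[1] >= conf_thresh]

# An image from dataset_a: its "person" instances were never
# annotated, so a confident "person" detection is promoted while
# an uncertain one is dropped.
dets = [("car", 0.95, (0, 0, 10, 10)),
        ("person", 0.90, (20, 20, 30, 40)),
        ("person", 0.30, (50, 50, 55, 60))]  # too uncertain, dropped
```

Promoted boxes are then treated as ground truth in the next training round; the confidence threshold is the knob that trades pseudo-label coverage against the risk of reinforcing the detector's own mistakes.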