74 research outputs found

    A deep learning framework for quality assessment and restoration in video endoscopy

    Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. Artifacts such as motion blur, bubbles, specular reflections, floating objects and pixel saturation impede the visual interpretation and the automated analysis of endoscopy videos. Given the widespread use of endoscopy in different clinical applications, we contend that the robust and reliable identification of such artifacts and the automated restoration of corrupted video frames is a fundamental medical imaging problem. Existing state-of-the-art methods only deal with the detection and restoration of selected artifacts. However, endoscopy videos typically contain numerous artifacts, which motivates a comprehensive solution. We propose a fully automatic framework that can: 1) detect and classify six different primary artifacts, 2) provide a quality score for each frame and 3) restore mildly corrupted frames. To detect different artifacts, our framework exploits a fast, multi-scale, single-stage convolutional neural network detector. We introduce a quality metric to assess frame quality and predict image restoration success. Generative adversarial networks with carefully chosen regularization are finally used to restore corrupted frames. Our detector yields the highest mean average precision (mAP at 5% threshold) of 49.0 and the lowest computational time of 88 ms, allowing for accurate real-time processing. Our restoration models for blind deblurring, saturation correction and inpainting demonstrate significant improvements over previous methods. On a set of 10 test videos we show that our approach preserves an average of 68.7% of frames, which is 25% more frames than are retained from the raw videos. Comment: 14 pages

    Novel Deep Learning Models for Medical Imaging Analysis

    Deep learning is a sub-field of machine learning in which models are developed to imitate the workings of the human brain in processing data and creating patterns for decision making. This dissertation focuses on developing deep learning models for medical imaging analysis across different modalities and tasks, including detection, segmentation and classification. The imaging modalities studied for various medical applications are digital mammography (DM), magnetic resonance imaging (MRI), positron emission tomography (PET) and computed tomography (CT). The first phase of the research develops a novel shallow-deep convolutional neural network (SD-CNN) model for improved breast cancer diagnosis. This model takes one type of medical image as input and synthesizes different modalities as additional feature sources; both the original image and the synthetic image are used for feature generation. The proposed architecture is validated in the application of breast cancer diagnosis and shown to outperform the competing models. Motivated by the success of the first phase, the second phase focuses on improving medical image synthesis with a more advanced deep learning architecture. A new architecture named deep residual inception encoder-decoder network (RIED-Net) is proposed. RIED-Net has the advantages of preserving pixel-level information and transferring features across modalities. The applicability of RIED-Net is validated in breast cancer diagnosis and Alzheimer’s disease (AD) staging. Recognizing that medical imaging research often involves multiple inter-related tasks, namely detection, segmentation and classification, the third phase of the research develops a multi-task deep learning model. Specifically, a feature-transfer-enabled multi-task deep learning model (FT-MTL-Net) is proposed to transfer high-resolution features from the segmentation task to the low-resolution, feature-based classification task. The application of FT-MTL-Net to breast cancer detection, segmentation and classification using DM images is studied. As a continuing effort to explore transfer learning in deep models for medical applications, the last phase develops a deep learning model that transfers both features and knowledge from a pre-training age prediction task to the new domain of predicting conversion from mild cognitive impairment (MCI) to AD. It is validated in the application of predicting MCI patients’ conversion to AD with 3D MRI images. Dissertation/Thesis; Doctoral Dissertation, Industrial Engineering 201

    Deep learning in diabetic foot ulcers detection: A comprehensive evaluation

    There has been a substantial amount of research involving computer methods and technology for the detection and recognition of diabetic foot ulcers (DFUs), but there is a lack of systematic comparisons of state-of-the-art deep learning object detection frameworks applied to this problem. DFUC2020 provided participants with a comprehensive dataset consisting of 2,000 images for training and 2,000 images for testing. This paper summarizes the results of DFUC2020 by comparing the deep learning-based algorithms proposed by the winning teams: Faster R–CNN, three variants of Faster R–CNN and an ensemble method; YOLOv3; YOLOv5; EfficientDet; and a new Cascade Attention Network. For each deep learning method, we provide a detailed description of the model architecture, the parameter settings for training and additional stages including pre-processing, data augmentation and post-processing. We provide a comprehensive evaluation for each method. All the methods required a data augmentation stage to increase the number of images available for training and a post-processing stage to remove false positives. The best performance was obtained from Deformable Convolution, a variant of Faster R–CNN, with a mean average precision (mAP) of 0.6940 and an F1-Score of 0.7434. Finally, we demonstrate that the ensemble method based on different deep learning methods can enhance the F1-Score but not the mAP.
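As a rough illustration of how an F1-Score for object detection can be computed, the sketch below matches predicted boxes to ground-truth boxes by intersection over union (IoU) and derives precision, recall and F1. The box format `(x1, y1, x2, y2)`, the greedy one-to-one matching and the IoU threshold of 0.5 are assumptions made for this example; the abstract does not state the exact evaluation protocol used in DFUC2020.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(predictions, ground_truth, iou_threshold=0.5):
    """F1 from greedy one-to-one matching of predictions to ground truth.

    The threshold value and the matching rule are assumptions of this sketch.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for pred in predictions:
        best_index, best_iou = None, iou_threshold
        for index, truth in enumerate(ground_truth):
            if index in matched:
                continue
            overlap = iou(pred, truth)
            if overlap >= best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truth) - true_positives
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```

For instance, one correct detection plus one spurious detection against a single ground-truth box gives a precision of 0.5 and a recall of 1.0, which shows why a post-processing stage that removes false positives can raise the F1-Score.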

    Toward robust deep neural networks

    In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty by encouraging diversity in an ensemble of CNNs. 
To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism to predict the adversarial examples with low confidence while keeping the predictive confidence of the clean samples high. In the presence of high entropy in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold (τ = 0.5) over the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can the many available OOD sets be differentiated with respect to a given in-distribution task in order to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the “protection” level of in-distribution sub-manifolds by each OOD set can be a good property for differentiating OOD sets. To measure the protection level, we then design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented-CNN (A-CNN) and an explicitly-calibrated CNN to detect a significantly larger portion of unseen OOD samples if they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set (called A-CNN) can also detect the black-box Fast Gradient Sign (FGS) adversarial examples. 
As the third contribution, we investigate more closely the capacity of the A-CNN to detect wider types of black-box adversaries. To increase the capability of the A-CNN to detect a larger number of adversaries, we augment its OOD training set with some inter-class interpolated samples. Then, we demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples. By contrast, training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate on all types of adversaries, particularly the unseen types. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversaries and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples. The last contribution is to show a use-case of the A-CNN for training a robust object detector on a partially-labeled dataset, particularly a merged dataset. Merging various datasets from similar contexts but with different sets of Objects of Interest (OoIs) is an inexpensive way to craft a large-scale dataset which covers a larger spectrum of OoIs. Moreover, merging datasets allows achieving a unified object detector, instead of having several separate ones, resulting in the reduction of computational and time costs. However, merging datasets, especially from a similar context, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially-labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in the merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate. 
The empirical results confirm the viability of our generated pseudo-labels to enhance the performance of YOLO, as the current (to date) state-of-the-art object detector.
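The idea of flagging inputs on which an ensemble disagrees can be sketched as follows. This is only an illustrative toy, not the thesis's method: the averaging of softmax outputs, the function names `softmax` and `ensemble_predict`, and the rejection rule are assumptions made for the example, and the default `tau=0.5` simply reuses the threshold value reported for the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def ensemble_predict(member_logits, tau=0.5):
    """Average member probabilities and reject low-confidence inputs.

    member_logits: one list of class logits per ensemble member.
    Returns (label, confidence); label is None when confidence < tau.
    Plain averaging is an assumption of this sketch.
    """
    probabilities = [softmax(logits) for logits in member_logits]
    class_count = len(probabilities[0])
    mean = [
        sum(member[c] for member in probabilities) / len(probabilities)
        for c in range(class_count)
    ]
    confidence = max(mean)
    label = mean.index(confidence)
    return (label if confidence >= tau else None), confidence
```

When two members agree on a class, the averaged confidence stays high and the label is returned; when they confidently pick different classes, the averaged confidence of either class falls below 0.5 and the input is rejected as suspicious.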