54 research outputs found
Towards Complex Backgrounds: A Unified Difference-Aware Decoder for Binary Segmentation
Binary segmentation is used to distinguish objects of interest from the
background, and is an active area of convolutional encoder-decoder network
research. Current decoders are designed for specific objects on top of common
backbone encoders, but cannot cope with complex backgrounds.
Inspired by the way human eyes detect objects of interest, a new unified
dual-branch decoder paradigm named the difference-aware decoder is proposed in
this paper to explore the difference between the foreground and the background
and separate the objects of interest in optical images. The difference-aware
decoder imitates the human eye in three stages using the multi-level features
output by the encoder. In Stage A, the first branch decoder of the
difference-aware decoder is used to obtain a guide map. The highest-level
features are enhanced with a novel field expansion module and a dual residual
attention module, and are combined with the lowest-level features to obtain the
guide map. In Stage B, the other branch decoder adopts a middle feature fusion
module to make trade-offs between textural details and semantic information and
generate background-aware features. In Stage C, the proposed difference-aware
extractor, consisting of a difference guidance module and a difference
enhancement module, fuses the guide map from Stage A and the background-aware
features from Stage B, to enlarge the differences between the foreground and
the background and output the final detection result. The results demonstrate
that the difference-aware decoder achieves higher accuracy than other
state-of-the-art binary segmentation methods on these tasks.
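The three-stage flow described above can be sketched as follows. This is a toy NumPy stand-in, not the paper's architecture: the averaging, gain factor, and feature shapes are illustrative assumptions standing in for the field expansion, dual residual attention, middle feature fusion, and difference guidance/enhancement modules.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stage_a_guide_map(f_high, f_low):
    """Stage A: fuse the (enhanced) highest-level features with the
    lowest-level features into a coarse guide map."""
    return sigmoid(f_high.mean(axis=0) + f_low.mean(axis=0))

def stage_b_background_features(f_mids):
    """Stage B: middle feature fusion -- average mid-level feature maps
    to balance textural detail against semantic information."""
    return np.stack(f_mids).mean(axis=0)

def stage_c_detect(guide_map, bg_features, gain=4.0):
    """Stage C: guide the background-aware features with the guide map,
    then enlarge the foreground/background difference."""
    guided = guide_map * bg_features.mean(axis=0)     # difference guidance
    return sigmoid(gain * (guided - guided.mean()))   # difference enhancement

# Toy multi-level encoder features (channels x H x W), all resized to 16x16.
rng = np.random.default_rng(0)
f_low, f_mid1, f_mid2, f_high = (rng.standard_normal((8, 16, 16)) for _ in range(4))

guide = stage_a_guide_map(f_high, f_low)
result = stage_c_detect(guide, stage_b_background_features([f_mid1, f_mid2]))
print(result.shape)  # (16, 16): a per-pixel foreground probability map
```

The point of the sketch is the data flow: the two decoder branches run on different subsets of the encoder's multi-level features, and only Stage C combines them.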
Self-supervised out-of-distribution detection in wireless capsule endoscopy images.
While deep learning has displayed excellent performance in a broad spectrum of application areas, neural networks still struggle to recognize what they have not seen, i.e., out-of-distribution (OOD) inputs. In the medical field, building robust models that are able to detect OOD images is highly critical, as these rare images could show diseases or anomalies that should be detected. In this study, we use wireless capsule endoscopy (WCE) images to present a novel patch-based self-supervised approach comprising three stages. First, we train a triplet network to learn vector representations of WCE image patches. Second, we cluster the patch embeddings to group patches by visual similarity. Third, we use the cluster assignments as pseudolabels to train a patch classifier, and apply the Out-of-Distribution Detector for Neural Networks (ODIN) for OOD detection. The system has been tested on Kvasir-Capsule, a publicly released WCE dataset. Empirical results show an OOD detection improvement over baseline methods. Our method can detect unseen pathologies and anomalies such as lymphangiectasia, foreign bodies, and blood with > 0.6. This work presents an effective solution for building OOD detection models without needing labeled images.
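ODIN's scoring step is well documented: it thresholds a temperature-scaled max-softmax confidence (the full method also adds a small gradient-based input perturbation, omitted here since it needs backpropagation). A minimal sketch of that scoring rule, with the temperature and threshold values as placeholder assumptions:

```python
import numpy as np

def odin_score(logits, temperature=1000.0):
    """Temperature-scaled max-softmax confidence used by ODIN.
    Higher scores suggest in-distribution inputs."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def is_ood(logits, threshold):
    """Flag an input as out-of-distribution when its score falls below
    a threshold chosen on held-in validation data."""
    return odin_score(logits) < threshold

confident = np.array([10.0, 0.0, 0.0])  # peaked logits: in-distribution-like
uncertain = np.array([0.0, 0.0, 0.0])   # flat logits: OOD-like
print(odin_score(confident) > odin_score(uncertain))  # True
```

Temperature scaling compresses the logit gap, which has been observed to separate in- and out-of-distribution confidence scores better than plain softmax.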
Deep Networks Based Energy Models for Object Recognition from Multimodality Images
Object recognition has been extensively investigated in the computer vision area, since it is a fundamental and essential technique in many important applications, such as robotics, autonomous driving, automated manufacturing, and security surveillance. Depending on the selection criteria, object recognition mechanisms can be broadly categorized into object proposal and classification, eye fixation prediction, and salient object detection. Object proposal tends to capture all potential objects from natural images and then classify them into predefined groups for image description and interpretation. For a given natural image, human perception is normally attracted to the most visually important regions/objects. Therefore, eye fixation prediction attempts to localize interesting points or small regions according to the human visual system (HVS). Based on these interesting points and small regions, salient object detection algorithms propagate the extracted information to achieve a refined segmentation of the whole salient objects. In addition to natural images, object recognition also plays a critical role in clinical practice. The informative insights into the anatomy and function of the human body obtained from multimodality biomedical images such as magnetic resonance imaging (MRI), transrectal ultrasound (TRUS), computed tomography (CT), and positron emission tomography (PET) facilitate precision medicine. Automated object recognition from biomedical images empowers non-invasive diagnosis and treatment via automated tissue segmentation, tumor detection, and cancer staging. Conventional recognition methods normally utilize handcrafted features (such as oriented gradients, curvature, Haar features, Haralick texture features, Laws energy features, etc.) depending on the image modality and object characteristics. It is therefore challenging to build a general model for object recognition.
Superior to handcrafted features, deep neural networks (DNNs) can extract self-adaptive features for a specific task, and hence can serve as general object recognition models. These DNN features are adjusted semantically and cognitively by tens of millions of parameters, in a way analogous to the mechanism of the human brain, and therefore lead to more accurate and robust results. Motivated by this, in this thesis we propose DNN-based energy models to recognize objects in multimodality images. Towards the aim of object recognition, the major contributions of this thesis can be summarized as follows: 1. We first proposed a new comprehensive autoencoder model to recognize the position and shape of the prostate in magnetic resonance images. Different from most autoencoder-based methods, we focused on positive samples to train the model, so that the extracted features all come from the prostate. After that, an image energy minimization scheme was applied to further improve the recognition accuracy. The proposed model was compared with three classic classifiers (i.e., support vector machine with a radial basis function kernel, random forest, and naive Bayes) and demonstrated significant superiority for prostate recognition in magnetic resonance images. We further extended the proposed autoencoder model to salient object detection in natural images, and the experimental validation confirmed the accuracy and robustness of our model's salient object detection results. 2. A general multi-context combined deep neural network (MCDN) model was then proposed for object recognition from natural and biomedical images. Under one uniform framework, our model operates in a multi-scale manner. It was applied to salient object detection in natural images as well as prostate recognition in magnetic resonance images. Our experimental validation demonstrated that the proposed model is competitive with current state-of-the-art methods. 3.
We designed a novel saliency image energy to finely segment salient objects on the basis of our MCDN model. Region priors were taken into account in the energy function to avoid trivial errors. Our method outperformed state-of-the-art algorithms on five benchmark datasets. In the experiments, we also demonstrated that our proposed saliency image energy can boost the results of other conventional saliency detection methods.
AFP-Net: Realtime Anchor-Free Polyp Detection in Colonoscopy
Colorectal cancer (CRC) is a common and lethal disease. Globally, CRC is the
third most commonly diagnosed cancer in males and the second in females. For
colorectal cancer, the best screening test available is the colonoscopy. During
a colonoscopic procedure, a tiny camera at the tip of the endoscope generates a
video of the internal mucosa of the colon. The video data are displayed on a
monitor for the physician to examine the lining of the entire colon and check
for colorectal polyps. Detection and removal of colorectal polyps are
associated with a reduction in mortality from colorectal cancer. However, the
miss rate of polyp detection during a colonoscopy procedure is often high, even
for very experienced physicians. The reason lies in the high variation of polyps
in shape, size, texture, color, and illumination. Though challenging,
with the great advances in object detection techniques, automated polyp
detection still demonstrates great potential for reducing the false negative
rate while maintaining high precision. In this paper, we propose a novel
anchor-free polyp detector that can localize polyps without using predefined
anchor boxes. To further strengthen the model, we leverage a Context
Enhancement Module and Cosine Ground truth Projection. Our approach can respond
in real time while achieving state-of-the-art performance with 99.36% precision
and 96.44% recall.
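The abstract does not specify the detection head, but anchor-free detectors in this family typically predict a per-pixel objectness heatmap plus a regressed box size, and decode peaks directly into boxes. A hypothetical NumPy sketch of that decoding step, where the threshold and map layout are assumptions rather than the paper's design:

```python
import numpy as np

def decode_anchor_free(heatmap, wh_map, score_thresh=0.5):
    """Turn heatmap peaks into boxes without predefined anchors.
    heatmap: (H, W) objectness scores; wh_map: (2, H, W) regressed
    width/height per location. Returns (x1, y1, x2, y2, score) tuples."""
    boxes = []
    ys, xs = np.where(heatmap >= score_thresh)
    for y, x in zip(ys, xs):
        w, h = wh_map[:, y, x]
        boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2, heatmap[y, x]))
    return boxes

heatmap = np.zeros((8, 8))
heatmap[3, 4] = 0.9              # one detected polyp center at (x=4, y=3)
wh_map = np.zeros((2, 8, 8))
wh_map[:, 3, 4] = (4.0, 6.0)     # regressed width and height at that center
boxes = decode_anchor_free(heatmap, wh_map)
print(boxes)  # [(2.0, 0.0, 6.0, 6.0, 0.9)]
```

Skipping anchor enumeration is what makes such heads fast enough for the real-time response claimed above: decoding is a single thresholding pass over the feature map.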
Multi-pathology detection and lesion localization in WCE videos by using the instance segmentation approach
The majority of current systems for automatic diagnosis consider the detection of a single, previously known pathology. Considering specifically the diagnosis of lesions in the small bowel from endoscopic capsule images, very few systems consider the possible existence of more than one pathology, and when they do, they are mainly detection-based systems, and therefore unable to localize the suspected lesions. Such systems do not fully satisfy the medical community, which in fact needs a system that detects any pathology, and eventually more than one when they coexist. In addition, beyond the diagnostic capability of these systems, localizing lesions in the image has been of great interest to the medical community, mainly for the purpose of training medical personnel; so, nowadays, the inclusion of the lesion location in automatic diagnostic systems is practically mandatory. Multi-pathology detection can be seen as a multi-object detection task and, since each frame can contain different instances of the same lesion, instance segmentation seems appropriate for the purpose. Consequently, we argue that a multi-pathology system benefits from an instance segmentation approach, since classification and segmentation modules are both required, complementing each other in lesion detection and localization. To the best of our knowledge, such a system does not yet exist for the detection of WCE pathologies. This paper proposes a multi-pathology system that can be applied to WCE images, which uses the Mask Improved RCNN (MI-RCNN), a new mask subnet scheme which has been shown to significantly improve the mask predictions of the high-performing state-of-the-art Mask-RCNN and PANet systems. A novel training strategy based on the second momentum is also proposed, for the first time, for training Mask-RCNN and PANet based systems. These approaches were tested on the public KID database, and the included pathologies were bleeding, angioectasias, polyps, and inflammatory lesions.
Experimental results show significant improvements for the proposed approaches.
This work was supported by FCT national funds, under the national support to R&D
units grant, through the reference projects UIDB/04436/2020 and UIDP/04436/2020 and
through the PhD grants with references SFRH/BD/92143/2013 and
SFRH/BD/139061/201
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes an expanded discussion section and a reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
Assessing generalisability of deep learning-based polyp detection and segmentation methods through a computer vision challenge
Polyps are well-known cancer precursors identified by colonoscopy. However, variability in their size, appearance, and location makes the detection of polyps challenging. Moreover, colonoscopy surveillance and removal of polyps are highly operator-dependent procedures and occur in a highly complex organ topology. There is a high missed-detection rate and incomplete removal of colonic polyps. To assist in clinical procedures and reduce miss rates, automated methods for detecting and segmenting polyps using machine learning have been developed in recent years. However, the major drawback of most of these methods is their limited ability to generalise to out-of-sample unseen datasets from different centres, populations, modalities, and acquisition systems. To test this hypothesis rigorously, we, together with expert gastroenterologists, curated a multi-centre and multi-population dataset acquired from six different colonoscopy systems, and challenged computational expert teams to develop robust automated detection and segmentation methods in a crowd-sourced endoscopic computer vision challenge. This work puts forward rigorous generalisability tests and assesses the usability of the devised deep learning methods in dynamic and actual clinical colonoscopy procedures. We analyse the results of the four top-performing teams for the detection task and the five top-performing teams for the segmentation task. Our analyses demonstrate that the top-ranking teams concentrated mainly on accuracy over the real-time performance required for clinical applicability. We further dissect the devised methods and provide an experiment-based hypothesis that reveals the need for improved generalisability to tackle the diversity present in multi-centre datasets and routine clinical procedures.
Attention Mechanisms in Medical Image Segmentation: A Survey
Medical image segmentation plays an important role in computer-aided
diagnosis. Attention mechanisms that distinguish important parts from
irrelevant parts have been widely used in medical image segmentation tasks.
This paper systematically reviews the basic principles of attention mechanisms
and their applications in medical image segmentation. First, we review the
basic concepts and formulation of attention mechanisms. Second, we survey over
300 articles related to medical image segmentation and divide them into two
groups based on their attention mechanisms, non-Transformer attention and
Transformer attention. In each group, we deeply analyze the attention
mechanisms from three aspects based on the current literature work, i.e., the
principle of the mechanism (what to use), implementation methods (how to use),
and application tasks (where to use). We also thoroughly analyze the
advantages and limitations of their applications to different tasks. Finally,
we summarize the current state of research and shortcomings in the field, and
discuss the potential challenges in the future, including task specificity,
robustness, standard evaluation, etc. We hope that this review can showcase the
overall research context of traditional and Transformer attention methods,
provide a clear reference for subsequent research, and inspire more advanced
attention research, not only in medical image segmentation, but also in other
image analysis scenarios.
Comment: Submitted to Medical Image Analysis, survey paper, 34 pages, over 300
references
Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy
The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address prominent problems in developing
reliable computer-aided detection and diagnosis endoscopy systems, and to suggest a pathway for the clinical translation
of technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow organs, there are several core
challenges often faced by endoscopists, mainly: 1) presence of multi-class artefacts that hinder their visual interpretation, and
2) difficulty in identifying subtle precancerous precursors and cancer abnormalities. Artefacts often affect the robustness of
deep learning methods applied to the gastrointestinal tract organs as they can be confused with tissue of interest. EndoCV2020
challenges are designed to address research questions in these remits. In this paper, we present a summary of methods
developed by the top 17 teams and provide an objective comparison of state-of-the-art methods and methods designed by
the participants for two sub-challenges: i) artefact detection and segmentation (EAD2020), and ii) disease detection and
segmentation (EDD2020). Multi-center, multi-organ, multi-class, and multi-modal clinical endoscopy datasets were compiled
for both EAD2020 and EDD2020 sub-challenges. The out-of-sample generalization ability of detection algorithms was also
evaluated. Whilst most teams focused on accuracy improvements, only a few methods hold credibility for clinical usability. The
best performing teams provided solutions to tackle class imbalance, and variabilities in size, origin, modality and occurrences
by exploring data augmentation, data fusion, and optimal class thresholding techniques.