131 research outputs found
A Survey on Deep Learning for Polyp Segmentation: Techniques, Challenges and Future Trends
Early detection and assessment of polyps play a crucial role in the
prevention and treatment of colorectal cancer (CRC). Polyp segmentation
provides an effective solution to assist clinicians in accurately locating and
segmenting polyp regions. In the past, people often relied on manually
extracted lower-level features such as color, texture, and shape, which often
had issues capturing global context and lacked robustness to complex scenarios.
With the advent of deep learning, more and more outstanding medical image
segmentation algorithms based on deep learning networks have emerged, making
significant progress in this field. This paper provides a comprehensive review
of polyp segmentation algorithms. We first review some traditional algorithms
based on manually extracted features and deep segmentation algorithms, then
detail benchmark datasets related to the topic. Specifically, we carry out a
comprehensive evaluation of recent deep learning models and results based on
polyp sizes, considering the pain points of research topics and differences in
network structures. Finally, we discuss the challenges of polyp segmentation
and future trends in this field. The models, benchmark datasets, and source
code links we collected are all published at
https://github.com/taozh2017/Awesome-Polyp-Segmentation.
Comment: 17 pages, 7 figures
Edge-aware Feature Aggregation Network for Polyp Segmentation
Precise polyp segmentation is vital for the early diagnosis and prevention of
colorectal cancer (CRC) in clinical practice. However, due to scale variation
and blurry polyp boundaries, it is still a challenging task to achieve
satisfactory segmentation performance with different scales and shapes. In this
study, we present a novel Edge-aware Feature Aggregation Network (EFA-Net) for
polyp segmentation, which can fully make use of cross-level and multi-scale
features to enhance the performance of polyp segmentation. Specifically, we
first present an Edge-aware Guidance Module (EGM) to combine the low-level
features with the high-level features to learn an edge-enhanced feature, which
is incorporated into each decoder unit using a layer-by-layer strategy.
Besides, a Scale-aware Convolution Module (SCM) is proposed to learn
scale-aware features by using dilated convolutions with different ratios, in
order to effectively deal with scale variation. Further, a Cross-level Fusion
Module (CFM) is proposed to effectively integrate the cross-level features,
which can exploit the local and global contextual information. Finally, the
outputs of CFMs are adaptively weighted by using the learned edge-aware
feature, which are then used to produce multiple side-out segmentation maps.
Experimental results on five widely adopted colonoscopy datasets show that our
EFA-Net outperforms state-of-the-art polyp segmentation methods in terms of
generalization and effectiveness.
Comment: 20 pages, 8 figures
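The scale-aware idea in the EFA-Net abstract (dilated convolutions with different ratios) can be illustrated with a minimal one-dimensional sketch. This is not the authors' SCM: the function names, the all-ones kernel, and the rates (1, 2, 3) are assumptions chosen only to show how a larger dilation widens the receptive field without adding weights.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` samples between kernel taps."""
    span = (len(kernel) - 1) * dilation  # receptive field size minus one
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_rate_features(signal, kernel, rates=(1, 2, 3)):
    """Apply the same kernel at several dilation rates (hypothetical helper)."""
    return {rate: dilated_conv1d(signal, kernel, rate) for rate in rates}


# A toy "polyp" of width 4 inside a 9-sample scan line.
signal = [0, 0, 1, 1, 1, 1, 0, 0, 0]
kernel = [1, 1, 1]
features = multi_rate_features(signal, kernel)
for rate, response in features.items():
    print(rate, response)
```

Each rate sees the same structure at a different spatial extent, which is the property a scale-aware module would exploit before fusing the responses.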
PraNet: Parallel Reverse Attention Network for Polyp Segmentation
Colonoscopy is an effective technique for detecting colorectal polyps, which
are highly related to colorectal cancer. In clinical practice, segmenting
polyps from colonoscopy images is of great importance since it provides
valuable information for diagnosis and surgery. However, accurate polyp
segmentation is a challenging task, for two major reasons: (i) the same type of
polyps has a diversity of size, color and texture; and (ii) the boundary
between a polyp and its surrounding mucosa is not sharp. To address these
challenges, we propose a parallel reverse attention network (PraNet) for
accurate polyp segmentation in colonoscopy images. Specifically, we first
aggregate the features in high-level layers using a parallel partial decoder
(PPD). Based on the combined feature, we then generate a global map as the
initial guidance area for the following components. In addition, we mine the
boundary cues using a reverse attention (RA) module, which is able to establish
the relationship between areas and boundary cues. Thanks to the recurrent
cooperation mechanism between areas and boundaries, our PraNet is capable of
calibrating any misaligned predictions, improving the segmentation accuracy.
Quantitative and qualitative evaluations on five challenging datasets across
six metrics show that our PraNet improves the segmentation accuracy
significantly, and presents a number of advantages in terms of
generalizability and real-time segmentation efficiency.
Comment: Accepted to MICCAI 202
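The reverse attention step mentioned in the PraNet abstract can be sketched as follows. The abstract does not spell out the formula, so this assumes a common formulation in which each feature is weighted by one minus the sigmoid of a coarse prediction logit. Confidently predicted foreground is suppressed and the remaining regions, including uncertain boundaries, are emphasized. The function names and the toy values are illustrative only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reverse_attention(features, coarse_logits):
    """Scale each feature by 1 - sigmoid(logit) of the coarse prediction.

    Assumed formulation, not taken from the abstract: locations the coarse
    map already marks as confident foreground get a weight near 0.
    """
    return [
        [f * (1.0 - sigmoid(z)) for f, z in zip(f_row, z_row)]
        for f_row, z_row in zip(features, coarse_logits)
    ]


# One row of features with a confident-background, an uncertain,
# and a confident-foreground location.
features = [[2.0, 2.0, 2.0]]
coarse_logits = [[-20.0, 0.0, 20.0]]
refined = reverse_attention(features, coarse_logits)
print(refined)
```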
LAPFormer: A Light and Accurate Polyp Segmentation Transformer
Polyp segmentation remains a difficult problem due to the large variety of
polyp shapes and of scanning and labeling modalities. This prevents deep
learning models from generalizing well to unseen data. However,
Transformer-based approaches have recently achieved remarkable results, as they
extract global context better than CNN-based architectures, which in turn
leads to better generalization. To leverage this strength of Transformers, we
propose a new encoder-decoder model named LAPFormer, which uses a hierarchical
Transformer encoder to better extract global features and combines it with our
novel CNN (Convolutional Neural Network) decoder for capturing the local
appearance of polyps. Our proposed decoder contains a progressive feature
fusion module designed to fuse features from upper and lower scales and to
make multi-scale features more correlated. In addition, we use a feature
refinement module and a feature selection module for processing features. We
test our model on five popular benchmark datasets for polyp
segmentation, including Kvasir, CVC-Clinic DB, CVC-ColonDB, CVC-T, and
ETIS-Larib.
Comment: 7 pages, 7 figures, ACL 2023 under review
Segment Anything Model-guided Collaborative Learning Network for Scribble-supervised Polyp Segmentation
Polyp segmentation plays a vital role in accurately locating polyps at an
early stage, which holds significant clinical importance for the prevention of
colorectal cancer. Various polyp segmentation methods have been developed using
fully-supervised deep learning techniques. However, pixel-wise annotation for
polyp images by physicians during the diagnosis is both time-consuming and
expensive. Moreover, visual foundation models such as the Segment Anything
Model (SAM) have shown remarkable performance. Nevertheless, directly applying
SAM to medical segmentation may not produce satisfactory results due to the
inherent absence of medical knowledge. In this paper, we propose a novel
SAM-guided Collaborative Learning Network (SAM-CLNet) for scribble-supervised
polyp segmentation, enabling a collaborative learning process between our
segmentation network and SAM to boost the model performance. Specifically, we
first propose a Cross-level Enhancement and Aggregation Network (CEA-Net) for
weakly-supervised polyp segmentation. Within CEA-Net, we propose a Cross-level
Enhancement Module (CEM) that integrates the adjacent features to enhance the
representation capabilities of different resolution features. Additionally, a
Feature Aggregation Module (FAM) is employed to capture richer features across
multiple levels. Moreover, we present a box-augmentation strategy that combines
the segmentation maps generated by CEA-Net with scribble annotations to create
more precise prompts. These prompts are then fed into SAM, generating
SAM-guided segmentation masks, which can provide additional supervision to
train CEA-Net effectively. Furthermore, we present an Image-level Filtering
Mechanism to filter out unreliable SAM-guided masks. Extensive experimental
results show that our SAM-CLNet outperforms state-of-the-art weakly-supervised
segmentation methods.
Comment: 10 pages, 7 figures
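One plausible reading of the box-augmentation strategy in the SAM-CLNet abstract is sketched below. The abstract does not give the exact rule, so this assumes the prompt is the tightest box enclosing every pixel that is foreground in either the network's predicted mask or the scribble annotation. The function name `box_prompt` and the (row_min, col_min, row_max, col_max) convention are hypothetical.

```python
def box_prompt(pred_mask, scribble_mask):
    """Tightest box around predicted or scribbled foreground pixels.

    Returns (row_min, col_min, row_max, col_max), or None when both masks
    are empty. Assumed combination rule; the abstract does not specify one.
    """
    coords = [
        (r, c)
        for r, (p_row, s_row) in enumerate(zip(pred_mask, scribble_mask))
        for c, (p, s) in enumerate(zip(p_row, s_row))
        if p or s
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# The prediction misses the right edge of the polyp; the scribble recovers it.
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
scribble_mask = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
box = box_prompt(pred_mask, scribble_mask)
print(box)
```

A box built this way could then be passed to a promptable segmenter as a box prompt. The image-level filtering of unreliable masks described in the abstract is a separate step and is not shown here.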
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
Most polyp segmentation methods use CNNs as their backbone, leading to two
key issues when exchanging information between the encoder and decoder: 1)
taking into account the differences in contribution between different-level
features; and 2) designing an effective mechanism for fusing these features.
Different from existing CNN-based methods, we adopt a transformer encoder,
which learns more powerful and robust representations. In addition, considering
the image acquisition influence and elusive properties of polyps, we introduce
three novel modules, including a cascaded fusion module (CFM), a camouflage
identification module (CIM), and a similarity aggregation module (SAM). Among
these, the CFM is used to collect the semantic and location information of
polyps from high-level features, while the CIM is applied to capture polyp
information disguised in low-level features. With the help of the SAM, we
extend the pixel features of the polyp area with high-level semantic position
information to the entire polyp area, thereby effectively fusing cross-level
features. The proposed model, named Polyp-PVT, effectively suppresses noise in
the features and significantly improves their expressive capabilities.
Extensive experiments on five widely adopted datasets show that the proposed
model is more robust to various challenging situations (e.g., appearance
changes, small objects) than existing methods, and achieves new
state-of-the-art performance. The proposed model is available at
https://github.com/DengPingFan/Polyp-PVT.
Comment: Technical Report
Multi-level feature fusion network combining attention mechanisms for polyp segmentation
Clinically, automated polyp segmentation techniques have the potential to
significantly improve the efficiency and accuracy of medical diagnosis, thereby
reducing the risk of colorectal cancer in patients. Unfortunately, existing
methods suffer from two significant weaknesses that can impact the accuracy of
segmentation. Firstly, features extracted by encoders are not adequately
filtered and utilized. Secondly, semantic conflicts and information redundancy
caused by feature fusion are not attended to. To overcome these limitations, we
propose a novel approach for polyp segmentation, named MLFF-Net, which
leverages multi-level feature fusion and attention mechanisms. Specifically,
MLFF-Net comprises three modules: Multi-scale Attention Module (MAM),
High-level Feature Enhancement Module (HFEM), and Global Attention Module
(GAM). Among these, MAM is used to extract multi-scale information and polyp
details from the shallow output of the encoder. In HFEM, the deep features of
the encoders complement each other by aggregation. Meanwhile, the attention
mechanism redistributes the weight of the aggregated features, weakening the
conflicting redundant parts and highlighting the information useful to the
task. GAM combines features from the encoder and the decoder and computes
global dependencies to overcome the locality of the receptive field.
Experimental results on five public datasets show that the proposed method not
only can segment multiple types of polyps but also has advantages over current
state-of-the-art methods in both accuracy and generalization ability.
RaBiT: An Efficient Transformer using Bidirectional Feature Pyramid Network with Reverse Attention for Colon Polyp Segmentation
Automatic and accurate segmentation of colon polyps is essential for early
diagnosis of colorectal cancer. Advanced deep learning models have shown
promising results in polyp segmentation. However, they still have limitations
in representing multi-scale features and generalization capability. To address
these issues, this paper introduces RaBiT, an encoder-decoder model that
incorporates a lightweight Transformer-based architecture in the encoder to
model multiple-level global semantic relationships. The decoder consists of
several bidirectional feature pyramid layers with reverse attention modules to
better fuse feature maps at various levels and incrementally refine polyp
boundaries. We also propose ideas to lighten the reverse attention module and
make it more suitable for multi-class segmentation. Extensive experiments on
several benchmark datasets show that our method outperforms existing methods
across all datasets while maintaining low computational complexity. Moreover,
our method demonstrates high generalization capability in cross-dataset
experiments, even when the training and test sets have different
characteristics.
Neural Network Pruning for Real-time Polyp Segmentation
Computer-assisted treatment has emerged as a viable application of medical
imaging, owing to the efficacy of deep learning models. Real-time inference
speed remains a key requirement for such applications to help medical
personnel. Even though there generally exists a trade-off between performance
and model size, impressive efforts have been made to retain near-original
performance by compromising model size. Neural network pruning has emerged as
an exciting area that aims to eliminate redundant parameters to make the
inference faster. In this study, we show an application of neural network
pruning in polyp segmentation. We compute the importance score of convolutional
filters and remove the filters with the lowest scores, which, up to a certain
pruning level, does not degrade performance. For computing the importance score,
we use the Taylor First Order (TaylorFO) approximation of the change in network
output for the removal of certain filters. Specifically, we employ a
gradient-normalized backpropagation for the computation of the importance
score. Through experiments on polyp datasets, we validate that our approach
can significantly reduce the parameter count and FLOPs while retaining similar
performance.
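The filter-ranking step described in the pruning abstract can be sketched with a first-order Taylor criterion. This is a simplified illustration, not the paper's procedure: it assumes the importance of a filter is the absolute mean of (activation × gradient) over that filter's output map, and it omits the gradient-normalized backpropagation the abstract mentions. All names and numbers are hypothetical.

```python
def taylor_fo_scores(activations, gradients):
    """One score per filter: |mean(activation * gradient)| over its output map.

    `activations[i]` and `gradients[i]` are flat lists holding filter i's
    output values and the loss gradients with respect to those values.
    """
    scores = []
    for act, grad in zip(activations, gradients):
        products = [a * g for a, g in zip(act, grad)]
        scores.append(abs(sum(products) / len(products)))
    return scores


def filters_to_prune(scores, n_prune):
    """Indices of the `n_prune` lowest-scoring filters, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:n_prune])


# Three filters, two output values each (toy numbers).
activations = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
gradients = [[0.1, -0.1], [0.02, 0.02], [-0.5, 0.5]]
scores = taylor_fo_scores(activations, gradients)
pruned = filters_to_prune(scores, n_prune=1)
print(scores, pruned)
```

In practice the scores would be accumulated over many batches before any filter is removed, and the network would typically be fine-tuned after pruning.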