Camouflaged Image Synthesis Is All You Need to Boost Camouflaged Detection
Camouflaged objects that blend into natural scenes pose significant
challenges for deep-learning models to detect and synthesize. While camouflaged
object detection is a crucial task in computer vision with diverse real-world
applications, this research topic has been constrained by limited data
availability. We propose a framework for synthesizing camouflage data to
enhance the detection of camouflaged objects in natural scenes. Our approach
employs a generative model to produce realistic camouflage images, which can be
used to train existing object detection models. Specifically, we use a
camouflage environment generator, supervised by a camouflage distribution
classifier, to synthesize realistic camouflage images that expand the
training dataset. Our framework outperforms the current state-of-the-art
method on three datasets (COD10K, CAMO, and CHAMELEON), demonstrating its
effectiveness in improving camouflaged object detection. This approach can
serve as a plug-and-play data generation and augmentation module for existing
camouflaged object detection tasks and provides a novel way to introduce more
diversity and broader distributions into current camouflage datasets.
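The plug-and-play augmentation idea can be pictured as a sampler that mixes real and synthesized camouflage images into each training batch. The ratio, sampler, and string-tagged "images" below are illustrative assumptions for the sketch, not the paper's actual pipeline:

```python
import random

def augmented_batch(real_images, synth_images, batch_size, synth_ratio=0.5, seed=0):
    """Build one training batch mixing real and synthesized camouflage
    images (ratio and sampling strategy are illustrative assumptions)."""
    rng = random.Random(seed)
    n_synth = int(batch_size * synth_ratio)
    batch = [rng.choice(synth_images) for _ in range(n_synth)]
    batch += [rng.choice(real_images) for _ in range(batch_size - n_synth)]
    rng.shuffle(batch)
    return batch

# Toy usage: tag-only "images" so the real/synthetic mix is visible.
batch = augmented_batch(["real"] * 10, ["synth"] * 10, batch_size=8)
```

Because the sampler is seeded, the mix is reproducible across runs; a real training loop would draw a fresh batch per step.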
Referring Camouflaged Object Detection
In this paper, we consider the problem of referring camouflaged object
detection (Ref-COD), a new task that aims to segment specified camouflaged
objects based on some form of reference, e.g., an image or text. We first assemble a
large-scale dataset, called R2C7K, which consists of 7K images covering 64
object categories in real-world scenarios. Then, we develop a simple but strong
dual-branch framework, dubbed R2CNet, with a reference branch learning common
representations from the referring information and a segmentation branch
identifying and segmenting camouflaged objects under the guidance of the common
representations. In particular, we design a Referring Mask Generation module to
generate a pixel-level prior mask and a Referring Feature Enrichment module to
enhance the capability of identifying camouflaged objects. Extensive
experiments show the superiority of our Ref-COD methods over their COD
counterparts in segmenting specified camouflaged objects and identifying the
main body of target objects. Our code and dataset are publicly available at
https://github.com/zhangxuying1004/RefCOD
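One simple way a reference branch could guide a segmentation branch is channel-wise gating of segmentation features by a reference embedding. The FiLM-style `modulate` function below is a hypothetical sketch of such guidance, not R2CNet's actual fusion mechanism:

```python
import math

def modulate(seg_feat, ref_embed):
    """Gate each channel of a segmentation feature map with a sigmoid of
    the matching reference-embedding entry (a FiLM-style modulation;
    the actual R2CNet fusion may differ)."""
    gated = []
    for c, channel in enumerate(seg_feat):            # channel: H x W grid
        g = 1.0 / (1.0 + math.exp(-ref_embed[c]))     # sigmoid gate in (0, 1)
        gated.append([[v * g for v in row] for row in channel])
    return gated

# Toy usage: a 2-channel 2x2 feature map guided by a zero reference
# vector, so every channel is gated by sigmoid(0) = 0.5.
feat = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
ref = [0.0, 0.0]
guided = modulate(feat, ref)
```

A reference embedding that suppresses background-like channels would push their gates toward zero, letting the segmentation branch focus on the referred object.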
Survey of Object Detection Methods in Camouflaged Image
Camouflage is an attempt to conceal the signature of a target object within the background image. Camouflage
detection, or decamouflaging, is used to detect foreground objects hidden in the background image. In this
research paper, the authors present a survey of camouflage detection methods for different applications and areas.
Transformer Transforms Salient Object Detection and Camouflaged Object Detection
The transformer networks are particularly good at modeling long-range
dependencies within a long sequence. In this paper, we conduct research on
applying the transformer networks for salient object detection (SOD). We adopt
the dense transformer backbone for fully supervised RGB image based SOD, RGB-D
image pair based SOD, and weakly supervised SOD within a unified framework
based on the observation that the transformer backbone can provide accurate
structure modeling, which makes it powerful in learning from weak labels with
less structure information. Further, we find that the vision transformer
architectures do not offer direct spatial supervision, instead encoding
position as a feature. Therefore, we investigate the contributions of two
strategies to provide stronger spatial supervision through the transformer
layers within our unified framework, namely deep supervision and
difficulty-aware learning. We find that deep supervision can propagate gradients
back into the higher-level features, thus leading to uniform activation within
the same semantic object. Difficulty-aware learning, on the other hand, is
capable of identifying hard pixels for effective hard negative mining. We also
visualize features of conventional backbone and transformer backbone before and
after fine-tuning them for SOD, and find that transformer backbone encodes more
accurate object structure information and more distinct semantic information
within the lower and higher level features respectively. We also apply our
model to camouflaged object detection (COD) and achieve similar observations as
the above three SOD tasks. Extensive experimental results on various SOD and
COD tasks illustrate that transformer networks can transform SOD and COD,
leading to new benchmarks for each related task. The source code and
experimental results are available via our project page:
https://github.com/fupiao1998/TrasformerSOD
Comment: Technical report, 18 pages, 22 figures
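Deep supervision as described above attaches a loss to each decoder level so gradients reach higher-level features directly rather than only through the final output. A minimal sketch, assuming plain per-pixel binary cross-entropy and uniform level weights (the paper's losses and weighting may differ):

```python
import math

def bce(pred, target):
    """Binary cross-entropy for one flattened predicted saliency map."""
    eps = 1e-7
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(pred, target)) / len(pred)

def deep_supervision_loss(side_outputs, target, weights=None):
    """Sum per-level losses so every decoder level gets a direct
    gradient signal, not only the final output."""
    weights = weights or [1.0] * len(side_outputs)
    return sum(w * bce(p, target) for w, p in zip(weights, side_outputs))

# Toy usage: three decoder levels predicting a 4-pixel mask.
target = [1.0, 0.0, 1.0, 0.0]
levels = [[0.9, 0.1, 0.8, 0.2],    # final output
          [0.7, 0.3, 0.6, 0.4],    # intermediate side output
          [0.6, 0.5, 0.5, 0.5]]    # coarse side output
loss = deep_supervision_loss(levels, target)
```

Each side output contributes its own term, so a coarse level that mis-activates inside the object is penalized directly instead of hiding behind the final prediction.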
A Unified Query-based Paradigm for Camouflaged Instance Segmentation
Due to the high similarity between camouflaged instances and the background,
the recently proposed camouflaged instance segmentation (CIS) faces challenges
in accurate localization and instance segmentation. To this end, inspired by
query-based transformers, we propose a unified query-based multi-task learning
framework for camouflaged instance segmentation, termed UQFormer, which builds
a set of mask queries and a set of boundary queries to learn a shared composed
query representation, and efficiently integrates global camouflaged object
region and boundary cues for simultaneous instance segmentation and instance
boundary detection in camouflaged scenarios. Specifically, we design a composed
query learning paradigm that learns a shared representation to capture object
region and boundary features by the cross-attention interaction of mask queries
and boundary queries in the designed multi-scale unified learning transformer
decoder. Then, we present a transformer-based multi-task learning framework for
simultaneous camouflaged instance segmentation and camouflaged instance
boundary detection based on the learned composed query representation, which
also forces the model to learn a strong instance-level query representation.
Notably, our model views the instance segmentation as a query-based direct set
prediction problem, without other post-processing such as non-maximum
suppression. Compared with 14 state-of-the-art approaches, our UQFormer
significantly improves the performance of camouflaged instance segmentation.
Our code will be available at https://github.com/dongbo811/UQFormer
Comment: This paper has been accepted by ACM MM202
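The composed query representation is built by cross-attention between mask queries and boundary queries. The single-head, projection-free `cross_attend` below is a toy sketch of that interaction, not UQFormer's actual multi-scale, multi-head decoder:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attend(queries, keys, values):
    """Each query attends over the keys and returns a weighted sum of
    the values (single head, no learned projections; UQFormer adds
    projections, multiple heads, and multi-scale features)."""
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out

# Toy usage: two mask queries attend over two boundary queries to form
# a shared composed representation.
mask_q = [[1.0, 0.0], [0.0, 1.0]]
boundary_q = [[1.0, 0.0], [0.0, 1.0]]
composed = cross_attend(mask_q, boundary_q, boundary_q)
```

Each composed query blends boundary information weighted by similarity, which is the mechanism letting region and boundary cues shape a single shared representation.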
Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
Vision transformers have recently shown strong global context modeling
capabilities in camouflaged object detection. However, they suffer from two
major limitations: less effective locality modeling and insufficient feature
aggregation in decoders, which are not conducive to camouflaged object
detection that explores subtle cues from indistinguishable backgrounds. To
address these issues, in this paper, we propose a novel transformer-based
Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode
locality-enhanced neighboring transformer features through progressive
shrinking for camouflaged object detection. Specifically, we propose a
non-local token enhancement module (NL-TEM) that employs the non-local
mechanism to enable interaction among neighboring tokens and to explore
graph-based high-order relations within tokens, enhancing the local
representations of transformers. Moreover, we design a
feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which
progressively aggregates adjacent transformer features through a layer-by-layer
shrinkage pyramid to accumulate imperceptible but effective cues as much as
possible for object information decoding. Extensive quantitative and
qualitative experiments demonstrate that the proposed model significantly
outperforms the existing 24 competitors on three challenging COD benchmark
datasets under six widely-used evaluation metrics. Our code is publicly
available at https://github.com/ZhouHuang23/FSPNet
Comment: CVPR 2023. Project webpage at:
https://tzxiang.github.io/project/COD-FSPNet/index.htm
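The layer-by-layer shrinkage can be pictured as repeatedly fusing adjacent pyramid features until a single decoded representation remains. The plain averaging `fuse` here is a stand-in for the learned adjacent interaction modules (AIM), not the paper's implementation:

```python
def shrink_pyramid(features, fuse=None):
    """Progressively fuse adjacent features layer by layer until one
    representation remains (FSPNet's FSD fuses with learned adjacent
    interaction modules rather than this plain element-wise average)."""
    if fuse is None:
        fuse = lambda a, b: [(x + y) / 2 for x, y in zip(a, b)]
    while len(features) > 1:
        # Each pass shrinks n features to n-1 by fusing neighbors.
        features = [fuse(features[i], features[i + 1])
                    for i in range(len(features) - 1)]
    return features[0]

# Toy usage: four neighboring feature vectors shrink to one.
levels = [[4.0, 0.0], [2.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
decoded = shrink_pyramid(levels)
```

Because every feature is fused with each neighbor at every pass, subtle cues present in any single level can survive into the final decoded representation instead of being dropped at a single merge step.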
Frequency Perception Network for Camouflaged Object Detection
Camouflaged object detection (COD) aims to accurately detect objects hidden
in the surrounding environment. However, existing COD methods mainly locate
camouflaged objects in the RGB domain, so their performance has not been fully
exploited in many challenging scenarios. Considering that the features of the
camouflaged object and the background are more discriminative in the frequency
domain, we propose a novel learnable and separable frequency perception
mechanism driven by the semantic hierarchy in the frequency domain. Our entire
network adopts a two-stage model, including a frequency-guided coarse
localization stage and a detail-preserving fine localization stage. With the
multi-level features extracted by the backbone, we design a flexible frequency
perception module based on octave convolution for coarse positioning. Then, we
design a correction fusion module to integrate the high-level features step by
step through prior-guided correction and cross-layer feature channel
association, and finally combine them with the shallow features to achieve
detailed correction of the camouflaged objects. Compared with currently
existing models, our proposed method achieves competitive performance on three
popular benchmark datasets, both qualitatively and quantitatively.
Comment: Accepted by ACM MM 202
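Octave convolution operates on a low/high-frequency decomposition of features. The 1-D split below (average pooling plus residual) sketches that decomposition under simplifying assumptions; the paper applies a learnable 2-D version inside its frequency perception module:

```python
def octave_split(signal):
    """Split a 1-D feature into a low-frequency branch (2x average
    pooling) and a high-frequency residual, the decomposition octave
    convolution operates on (simplified 1-D sketch)."""
    low = [(signal[i] + signal[i + 1]) / 2
           for i in range(0, len(signal) - 1, 2)]
    upsampled = [v for v in low for _ in range(2)]     # nearest-neighbor
    high = [s - u for s, u in zip(signal, upsampled)]  # residual detail
    return low, high

# Toy usage: a slow ramp plus alternating fine detail.
sig = [0.0, 2.0, 2.0, 4.0]
low, high = octave_split(sig)
```

The low branch carries the coarse trend useful for locating an object, while the residual branch keeps the fine detail where a camouflaged object's texture differs subtly from its background.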