Camouflaged Image Synthesis Is All You Need to Boost Camouflaged Detection
Camouflaged objects that blend into natural scenes pose significant
challenges for deep-learning models to detect and synthesize. While camouflaged
object detection is a crucial task in computer vision with diverse real-world
applications, this research topic has been constrained by limited data
availability. We propose a framework for synthesizing camouflage data to
enhance the detection of camouflaged objects in natural scenes. Our approach
employs a generative model to produce realistic camouflage images, which can be
used to train existing object detection models. Specifically, we use a
camouflage environment generator, supervised by a camouflage distribution
classifier, to synthesize realistic camouflage images that expand the
training dataset. Our framework outperforms the current state-of-the-art
method on three datasets (COD10K, CAMO, and CHAMELEON), demonstrating its
effectiveness in improving camouflaged object detection. This approach can
serve as a plug-and-play data generation and augmentation module for existing
camouflaged object detection tasks and provides a novel way to introduce more
diversity and broader distributions into current camouflage datasets.
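The plug-and-play augmentation idea can be pictured as a sampler that mixes real and synthesized camouflage images into each training batch. The ratio, sampler, and string-tagged "images" below are illustrative assumptions for the sketch, not the paper's actual pipeline:

```python
import random

def augmented_batch(real_images, synth_images, batch_size, synth_ratio=0.5, seed=0):
    """Build one training batch mixing real and synthesized camouflage
    images (ratio and sampling strategy are illustrative assumptions)."""
    rng = random.Random(seed)
    n_synth = int(batch_size * synth_ratio)
    batch = [rng.choice(synth_images) for _ in range(n_synth)]
    batch += [rng.choice(real_images) for _ in range(batch_size - n_synth)]
    rng.shuffle(batch)
    return batch

# Toy usage: tag-only "images" so the real/synthetic mix is visible.
batch = augmented_batch(["real"] * 10, ["synth"] * 10, batch_size=8)
```

Because the sampler is seeded, the mix is reproducible across runs; a real training loop would draw a fresh batch per step.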
Referring Camouflaged Object Detection
In this paper, we consider the problem of referring camouflaged object
detection (Ref-COD), a new task that aims to segment specified camouflaged
objects based on some form of reference, e.g., an image or text. We first assemble a
large-scale dataset, called R2C7K, which consists of 7K images covering 64
object categories in real-world scenarios. Then, we develop a simple but strong
dual-branch framework, dubbed R2CNet, with a reference branch learning common
representations from the referring information and a segmentation branch
identifying and segmenting camouflaged objects under the guidance of the common
representations. In particular, we design a Referring Mask Generation module to
generate a pixel-level prior mask and a Referring Feature Enrichment module to
enhance the capability of identifying camouflaged objects. Extensive
experiments show the superiority of our Ref-COD methods over their COD
counterparts in segmenting specified camouflaged objects and identifying the
main body of target objects. Our code and dataset are publicly available at
https://github.com/zhangxuying1004/RefCOD
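One simple way a reference branch could guide a segmentation branch is channel-wise gating of segmentation features by a reference embedding. The FiLM-style `modulate` function below is a hypothetical sketch of such guidance, not R2CNet's actual fusion mechanism:

```python
import math

def modulate(seg_feat, ref_embed):
    """Gate each channel of a segmentation feature map with a sigmoid of
    the matching reference-embedding entry (a FiLM-style modulation;
    the actual R2CNet fusion may differ)."""
    gated = []
    for c, channel in enumerate(seg_feat):            # channel: H x W grid
        g = 1.0 / (1.0 + math.exp(-ref_embed[c]))     # sigmoid gate in (0, 1)
        gated.append([[v * g for v in row] for row in channel])
    return gated

# Toy usage: a 2-channel 2x2 feature map guided by a zero reference
# vector, so every channel is gated by sigmoid(0) = 0.5.
feat = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
ref = [0.0, 0.0]
guided = modulate(feat, ref)
```

A reference embedding that suppresses background-like channels would push their gates toward zero, letting the segmentation branch focus on the referred object.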
Survey of Object Detection Methods in Camouflaged Image
Camouflage is an attempt to conceal the signature of a target object within the background image. Camouflage
detection, or decamouflaging, is used to detect foreground objects hidden in the background image. In this
research paper, the authors present a survey of camouflage detection methods for different applications and areas.
Transformer Transforms Salient Object Detection and Camouflaged Object Detection
The transformer networks are particularly good at modeling long-range
dependencies within a long sequence. In this paper, we conduct research on
applying the transformer networks for salient object detection (SOD). We adopt
the dense transformer backbone for fully supervised RGB image based SOD, RGB-D
image pair based SOD, and weakly supervised SOD within a unified framework
based on the observation that the transformer backbone can provide accurate
structure modeling, which makes it powerful in learning from weak labels with
less structure information. Further, we find that the vision transformer
architectures do not offer direct spatial supervision, instead encoding
position as a feature. Therefore, we investigate the contributions of two
strategies to provide stronger spatial supervision through the transformer
layers within our unified framework, namely deep supervision and
difficulty-aware learning. We find that deep supervision can propagate gradients
back into the higher-level features, thus leading to uniform activation within
the same semantic object. Difficulty-aware learning, on the other hand, is
capable of identifying hard pixels for effective hard negative mining. We also
visualize features of conventional backbone and transformer backbone before and
after fine-tuning them for SOD, and find that transformer backbone encodes more
accurate object structure information and more distinct semantic information
within the lower and higher level features respectively. We also apply our
model to camouflaged object detection (COD) and achieve similar observations as
the above three SOD tasks. Extensive experimental results on various SOD and
COD tasks illustrate that transformer networks can transform SOD and COD,
leading to new benchmarks for each related task. The source code and
experimental results are available via our project page:
https://github.com/fupiao1998/TrasformerSOD
Comment: Technical report, 18 pages, 22 figures
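Deep supervision as described above attaches a loss to each decoder level so gradients reach higher-level features directly rather than only through the final output. A minimal sketch, assuming plain per-pixel binary cross-entropy and uniform level weights (the paper's losses and weighting may differ):

```python
import math

def bce(pred, target):
    """Binary cross-entropy for one flattened predicted saliency map."""
    eps = 1e-7
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(pred, target)) / len(pred)

def deep_supervision_loss(side_outputs, target, weights=None):
    """Sum per-level losses so every decoder level gets a direct
    gradient signal, not only the final output."""
    weights = weights or [1.0] * len(side_outputs)
    return sum(w * bce(p, target) for w, p in zip(weights, side_outputs))

# Toy usage: three decoder levels predicting a 4-pixel mask.
target = [1.0, 0.0, 1.0, 0.0]
levels = [[0.9, 0.1, 0.8, 0.2],    # final output
          [0.7, 0.3, 0.6, 0.4],    # intermediate side output
          [0.6, 0.5, 0.5, 0.5]]    # coarse side output
loss = deep_supervision_loss(levels, target)
```

Each side output contributes its own term, so a coarse level that mis-activates inside the object is penalized directly instead of hiding behind the final prediction.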
A Unified Query-based Paradigm for Camouflaged Instance Segmentation
Due to the high similarity between camouflaged instances and the background,
the recently proposed camouflaged instance segmentation (CIS) faces challenges
in accurate localization and instance segmentation. To this end, inspired by
query-based transformers, we propose a unified query-based multi-task learning
framework for camouflaged instance segmentation, termed UQFormer, which builds
a set of mask queries and a set of boundary queries to learn a shared composed
query representation, and efficiently integrates global camouflaged object
region and boundary cues for simultaneous instance segmentation and instance
boundary detection in camouflaged scenarios. Specifically, we design a composed
query learning paradigm that learns a shared representation to capture object
region and boundary features by the cross-attention interaction of mask queries
and boundary queries in the designed multi-scale unified learning transformer
decoder. Then, we present a transformer-based multi-task learning framework for
simultaneous camouflaged instance segmentation and camouflaged instance
boundary detection based on the learned composed query representation, which
also forces the model to learn a strong instance-level query representation.
Notably, our model views the instance segmentation as a query-based direct set
prediction problem, without other post-processing such as non-maximum
suppression. Compared with 14 state-of-the-art approaches, our UQFormer
significantly improves the performance of camouflaged instance segmentation.
Our code will be available at https://github.com/dongbo811/UQFormer
Comment: This paper has been accepted by ACM MM202
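The composed query representation is built by cross-attention between mask queries and boundary queries. The single-head, projection-free `cross_attend` below is a toy sketch of that interaction, not UQFormer's actual multi-scale, multi-head decoder:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attend(queries, keys, values):
    """Each query attends over the keys and returns a weighted sum of
    the values (single head, no learned projections; UQFormer adds
    projections, multiple heads, and multi-scale features)."""
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out

# Toy usage: two mask queries attend over two boundary queries to form
# a shared composed representation.
mask_q = [[1.0, 0.0], [0.0, 1.0]]
boundary_q = [[1.0, 0.0], [0.0, 1.0]]
composed = cross_attend(mask_q, boundary_q, boundary_q)
```

Each composed query blends boundary information weighted by similarity, which is the mechanism letting region and boundary cues shape a single shared representation.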
Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
Vision transformers have recently shown strong global context modeling
capabilities in camouflaged object detection. However, they suffer from two
major limitations: less effective locality modeling and insufficient feature
aggregation in decoders, which are not conducive to camouflaged object
detection that explores subtle cues from indistinguishable backgrounds. To
address these issues, in this paper, we propose a novel transformer-based
Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode
locality-enhanced neighboring transformer features through progressive
shrinking for camouflaged object detection. Specifically, we propose a
non-local token enhancement module (NL-TEM) that employs the non-local
mechanism to enable interaction among neighboring tokens and to explore
graph-based high-order relations within tokens, enhancing the local
representations of transformers. Moreover, we design a
feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which
progressively aggregates adjacent transformer features through a layer-by-layer
shrinkage pyramid to accumulate imperceptible but effective cues as much as
possible for object information decoding. Extensive quantitative and
qualitative experiments demonstrate that the proposed model significantly
outperforms the existing 24 competitors on three challenging COD benchmark
datasets under six widely-used evaluation metrics. Our code is publicly
available at https://github.com/ZhouHuang23/FSPNet
Comment: CVPR 2023. Project webpage at:
https://tzxiang.github.io/project/COD-FSPNet/index.htm
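The layer-by-layer shrinkage can be pictured as repeatedly fusing adjacent pyramid features until a single decoded representation remains. The plain averaging `fuse` here is a stand-in for the learned adjacent interaction modules (AIM), not the paper's implementation:

```python
def shrink_pyramid(features, fuse=None):
    """Progressively fuse adjacent features layer by layer until one
    representation remains (FSPNet's FSD fuses with learned adjacent
    interaction modules rather than this plain element-wise average)."""
    if fuse is None:
        fuse = lambda a, b: [(x + y) / 2 for x, y in zip(a, b)]
    while len(features) > 1:
        # Each pass shrinks n features to n-1 by fusing neighbors.
        features = [fuse(features[i], features[i + 1])
                    for i in range(len(features) - 1)]
    return features[0]

# Toy usage: four neighboring feature vectors shrink to one.
levels = [[4.0, 0.0], [2.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
decoded = shrink_pyramid(levels)
```

Because every feature is fused with each neighbor at every pass, subtle cues present in any single level can survive into the final decoded representation instead of being dropped at a single merge step.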
Frequency Perception Network for Camouflaged Object Detection
Camouflaged object detection (COD) aims to accurately detect objects hidden
in the surrounding environment. However, existing COD methods mainly locate
camouflaged objects in the RGB domain, so their performance has not been fully
exploited in many challenging scenarios. Considering that the features of the
camouflaged object and the background are more discriminative in the frequency
domain, we propose a novel learnable and separable frequency perception
mechanism driven by the semantic hierarchy in the frequency domain. Our entire
network adopts a two-stage model, including a frequency-guided coarse
localization stage and a detail-preserving fine localization stage. With the
multi-level features extracted by the backbone, we design a flexible frequency
perception module based on octave convolution for coarse positioning. Then, we
design a correction fusion module to integrate the high-level features step by
step through prior-guided correction and cross-layer feature channel
association, and finally combine them with the shallow features to achieve
detailed correction of the camouflaged objects. Compared with currently
existing models, our proposed method achieves competitive performance on three
popular benchmark datasets, both qualitatively and quantitatively.
Comment: Accepted by ACM MM 202
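Octave convolution operates on a low/high-frequency decomposition of features. The 1-D split below (average pooling plus residual) sketches that decomposition under simplifying assumptions; the paper applies a learnable 2-D version inside its frequency perception module:

```python
def octave_split(signal):
    """Split a 1-D feature into a low-frequency branch (2x average
    pooling) and a high-frequency residual, the decomposition octave
    convolution operates on (simplified 1-D sketch)."""
    low = [(signal[i] + signal[i + 1]) / 2
           for i in range(0, len(signal) - 1, 2)]
    upsampled = [v for v in low for _ in range(2)]     # nearest-neighbor
    high = [s - u for s, u in zip(signal, upsampled)]  # residual detail
    return low, high

# Toy usage: a slow ramp plus alternating fine detail.
sig = [0.0, 2.0, 2.0, 4.0]
low, high = octave_split(sig)
```

The low branch carries the coarse trend useful for locating an object, while the residual branch keeps the fine detail where a camouflaged object's texture differs subtly from its background.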