50 research outputs found
Progressive Feature Self-reinforcement for Weakly Supervised Semantic Segmentation
Compared to conventional semantic segmentation with pixel-level supervision,
Weakly Supervised Semantic Segmentation (WSSS) with image-level labels poses
the challenge that it always focuses on the most discriminative regions,
resulting in a disparity between fully supervised conditions. A typical
manifestation is the diminished precision on the object boundaries, leading to
a deteriorated accuracy of WSSS. To alleviate this issue, we propose to
adaptively partition the image content into deterministic regions (e.g.,
confident foreground and background) and uncertain regions (e.g., object
boundaries and misclassified categories) for separate processing. For uncertain
cues, we employ an activation-based masking strategy and seek to recover the
local information with self-distilled knowledge. We further assume that the
unmasked confident regions should be robust enough to preserve the global
semantics. Building upon this, we introduce a complementary self-enhancement
method that constrains the semantic consistency between these confident regions
and an augmented image with the same class labels. Extensive experiments
conducted on PASCAL VOC 2012 and MS COCO 2014 demonstrate that our proposed
single-stage approach for WSSS not only outperforms state-of-the-art benchmarks
remarkably but also surpasses multi-stage methodologies that trade complexity
for accuracy. The code can be found at
\url{https://github.com/Jessie459/feature-self-reinforcement}.Comment: Accepted by AAAI 202
WMFormer++: Nested Transformer for Visible Watermark Removal via Implict Joint Learning
Watermarking serves as a widely adopted approach to safeguard media
copyright. In parallel, the research focus has extended to watermark removal
techniques, offering an adversarial means to enhance watermark robustness and
foster advancements in the watermarking field. Existing watermark removal
methods mainly rely on UNet with task-specific decoder branches--one for
watermark localization and the other for background image restoration. However,
watermark localization and background restoration are not isolated tasks;
precise watermark localization inherently implies regions necessitating
restoration, and the background restoration process contributes to more
accurate watermark localization. To holistically integrate information from
both branches, we introduce an implicit joint learning paradigm. This empowers
the network to autonomously navigate the flow of information between implicit
branches through a gate mechanism. Furthermore, we employ cross-channel
attention to facilitate local detail restoration and holistic structural
comprehension, while harnessing nested structures to integrate multi-scale
information. Extensive experiments are conducted on various challenging
benchmarks to validate the effectiveness of our proposed method. The results
demonstrate our approach's remarkable superiority, surpassing existing
state-of-the-art methods by a large margin
Cross-Modality High-Frequency Transformer for MR Image Super-Resolution
Improving the resolution of magnetic resonance (MR) image data is critical to
computer-aided diagnosis and brain function analysis. Higher resolution helps
to capture more detailed content, but typically induces to lower
signal-to-noise ratio and longer scanning time. To this end, MR image
super-resolution has become a widely-interested topic in recent times. Existing
works establish extensive deep models with the conventional architectures based
on convolutional neural networks (CNN). In this work, to further advance this
research field, we make an early effort to build a Transformer-based MR image
super-resolution framework, with careful designs on exploring valuable domain
prior knowledge. Specifically, we consider two-fold domain priors including the
high-frequency structure prior and the inter-modality context prior, and
establish a novel Transformer architecture, called Cross-modality
high-frequency Transformer (Cohf-T), to introduce such priors into
super-resolving the low-resolution (LR) MR images. Comprehensive experiments on
two datasets indicate that Cohf-T achieves new state-of-the-art performance