Universal Adaptive Data Augmentation
Existing automatic data augmentation (DA) methods either do not update DA's
parameters according to the target model's state during training or adopt
update strategies that are insufficiently effective. In this work, we design a
novel data augmentation strategy called "Universal Adaptive Data Augmentation"
(UADA). Unlike existing methods, UADA adaptively updates DA's
parameters according to the target model's gradient information during
training: given a pre-defined set of DA operations, we randomly decide the types
and magnitudes of DA operations for every data batch during training, and
adaptively update DA's parameters along the gradient direction of the loss
with respect to DA's parameters. In this way, UADA increases the training loss of
the target network, which must then learn features from harder
samples, improving its generalization. Moreover, UADA is very general and can
be utilized in numerous tasks, e.g., image classification, semantic
segmentation, and object detection. Extensive experiments with various models
are conducted on CIFAR-10, CIFAR-100, ImageNet, tiny-ImageNet, Cityscapes, and
VOC07+12 to demonstrate the significant performance improvements brought by our
proposed adaptive augmentation.
Comment: under submission
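The core update rule lends itself to a compact illustration. Below is a minimal PyTorch sketch of the gradient-guided DA-parameter update described above, assuming a single differentiable augmentation (brightness scaling) so that the gradient of the loss with respect to the magnitude exists; the function signature and the learning rate `da_lr` are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, labels, magnitude, da_lr=0.01):
    # Treat the DA magnitude as a differentiable parameter for this step.
    magnitude = magnitude.clone().requires_grad_(True)
    # Apply the (here: brightness) operation with the current magnitude.
    augmented = torch.clamp(images * (1.0 + magnitude), 0.0, 1.0)
    loss = F.cross_entropy(model(augmented), labels)

    # Gradient of the loss with respect to the DA parameter.
    da_grad, = torch.autograd.grad(loss, magnitude, retain_graph=True)

    # Standard descent step for the target network.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Ascent step for the DA parameter: move ALONG the loss gradient,
    # so the augmentation keeps producing harder samples.
    with torch.no_grad():
        magnitude = magnitude + da_lr * da_grad
    return magnitude.detach(), loss.item()
```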
Self-supervised Learning for Enhancing Geometrical Modeling in 3D-Aware Generative Adversarial Network
3D-aware Generative Adversarial Networks (3D-GANs) currently exhibit
artifacts in their 3D geometrical modeling, such as mesh imperfections and
holes. These shortcomings are primarily attributed to the limited availability
of annotated 3D data, leading to a constrained "valid latent area" for
satisfactory modeling. To address this, we present a Self-Supervised Learning
(SSL) technique tailored as an auxiliary loss for any 3D-GAN, designed to
improve its 3D geometrical modeling capabilities. Our approach pioneers an
inversion technique for 3D-GANs, integrating an encoder that performs adaptive
spatially-varying range operations. Utilizing this inversion, we introduce the
Cyclic Generative Constraint (CGC), aiming to densify the valid latent space.
The CGC operates via augmented local latent vectors that maintain the same
geometric form, and it imposes constraints on the cycle path outputs,
specifically the generator-encoder-generator sequence. This SSL methodology
seamlessly integrates with the inherent GAN loss, ensuring the integrity of
pre-existing 3D-GAN architectures without necessitating alterations. We
validate our approach with comprehensive experiments across various datasets
and architectures, underscoring its efficacy. Our project website:
https://3dgan-ssl.github.io
Comment: 13 pages, 12 figures, 6 tables
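As a concrete illustration of the generator-encoder-generator cycle, here is a hedged PyTorch sketch of a cycle-consistency loss in the spirit of the Cyclic Generative Constraint; `generator`, `encoder`, and `noise_scale` are placeholders for any 3D-GAN, its inversion encoder, and the local latent augmentation, and do not reflect the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cgc_loss(generator, encoder, latent, noise_scale=0.05):
    # Perturb the latent locally; nearby latents should keep the same geometry.
    local_latent = latent + noise_scale * torch.randn_like(latent)

    rendered = generator(local_latent)          # first pass: G(w')
    recovered_latent = encoder(rendered)        # invert:     E(G(w'))
    re_rendered = generator(recovered_latent)   # cycle pass: G(E(G(w')))

    # Constrain the cycle output to match the first rendering, encouraging
    # a denser valid latent space around the sampled latent.
    return F.mse_loss(re_rendered, rendered)
```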
General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments
Deep Neural Networks (DNNs) are vulnerable to highly transferable black-box
adversarial attacks. This threat stems from the distribution gap between
adversarial and clean samples in the feature space of the target DNNs. In
this paper, we use Deep Generative Networks (DGNs) with a novel training
mechanism to eliminate the distribution gap. The trained DGNs align the
distribution of adversarial samples with clean ones for the target DNNs by
translating pixel values. Unlike previous work, we propose a more effective
pixel-level training constraint to make this achievable, thus enhancing
robustness to adversarial samples. Further, a class-aware
feature-level constraint is formulated for integrated distribution alignment.
Our approach is general and applicable to multiple tasks, including image
classification, semantic segmentation, and object detection. We conduct
extensive experiments on different datasets. Our strategy demonstrates its
unique effectiveness and generality against black-box attacks.
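To make the two constraints concrete, the following is a minimal sketch of how pixel-level and class-aware feature-level alignment losses could be combined when training the translation network; `dgn`, `feature_extractor`, and `class_centers` are assumed stand-ins for the purification network, a feature hook on the target DNN, and per-class feature centers, and this is an illustration rather than the paper's implementation.

```python
import torch.nn.functional as F

def alignment_losses(dgn, feature_extractor, adv_images, clean_images,
                     labels, class_centers):
    # Translate adversarial pixel values toward the clean distribution.
    purified = dgn(adv_images)

    # Pixel-level constraint: purified samples should match their clean versions.
    pixel_loss = F.l1_loss(purified, clean_images)

    # Class-aware feature-level constraint: features of purified samples should
    # align with their class centers in the target DNN's feature space.
    feats = feature_extractor(purified)
    centers = class_centers[labels]  # one (feat_dim,) center per class label
    feature_loss = F.mse_loss(feats, centers)

    return pixel_loss, feature_loss
```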
InsightMapper: A Closer Look at Inner-instance Information for Vectorized High-Definition Mapping
Vectorized high-definition (HD) maps contain detailed information about
surrounding road elements, which are crucial for various downstream tasks in
modern autonomous driving vehicles, such as vehicle planning and control.
Recent works have attempted to directly detect the vectorized HD map as a point
set prediction task, resulting in significant improvements in detection
performance. However, these approaches fail to analyze and exploit the
inner-instance correlations between predicted points, impeding further
advancements. To address these challenges, we investigate the utilization of
inner-instance information for vectorized high-definition
mapping through transformers and introduce InsightMapper. This paper
presents three novel designs within InsightMapper that leverage inner-instance
information in distinct ways, including hybrid query generation, inner-instance
query fusion, and inner-instance feature aggregation. Comparative experiments
are conducted on the NuScenes dataset, showcasing the superiority of our
proposed method. InsightMapper surpasses previous state-of-the-art (SOTA)
methods by 5.78 mAP and 5.12 TOPO, a metric that assesses topology correctness.
Simultaneously, InsightMapper maintains high efficiency during both training
and inference phases, resulting in remarkable comprehensive performance. The
project page for this work is available at
https://tonyxuqaq.github.io/projects/InsightMapper .
Comment: Code and demo will be available at
https://tonyxuqaq.github.io/projects/InsightMapper
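As a rough illustration of inner-instance feature aggregation, the sketch below restricts self-attention to the point queries belonging to the same map instance, so correlations among an instance's own points are modeled explicitly; the module, tensor shapes, and hyperparameters are illustrative assumptions, not InsightMapper's actual design.

```python
import torch.nn as nn

class InnerInstanceAggregation(nn.Module):
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, point_queries):
        # point_queries: (batch, num_instances, points_per_instance, dim)
        b, n, p, d = point_queries.shape
        # Fold instances into the batch axis so attention only mixes
        # points within the same instance, never across instances.
        x = point_queries.reshape(b * n, p, d)
        out, _ = self.attn(x, x, x)
        return out.reshape(b, n, p, d)
```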
Influencer Backdoor Attack on Semantic Segmentation
When a small number of poisoned samples are injected into the training
dataset of a deep neural network, the network can be induced to exhibit
malicious behavior during inference, which poses potential threats to
real-world applications. While backdoor attacks have been intensively studied
in classification, they have been largely overlooked for semantic segmentation.
Unlike classification, semantic segmentation aims to classify every
pixel within a given image. In this work, we explore backdoor attacks on
segmentation models to misclassify all pixels of a victim class by injecting a
specific trigger on non-victim pixels during inference, an attack we dub the
Influencer Backdoor Attack (IBA). IBA is expected to maintain the
classification accuracy of non-victim pixels while misleading the
classification of all victim pixels in every single inference. Specifically, we consider two
types of IBA scenarios, i.e., 1) Free-position IBA: the trigger can be
positioned freely except for pixels of the victim class, and 2) Long-distance
IBA: the trigger can only be positioned somewhere far from victim pixels, given
the possible practical constraint. Based on the context aggregation ability of
segmentation models, we propose techniques to improve IBA for the scenarios.
Concretely, for free-position IBA, we propose a simple, yet effective Nearest
Neighbor trigger injection strategy for poisoned sample creation. For
long-distance IBA, we propose a novel Pixel Random Labeling strategy. Our
extensive experiments reveal that current segmentation models do suffer from
backdoor attacks, and verify that our proposed techniques can further increase
attack performance.
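To give a flavor of nearest-neighbor trigger injection, here is a hedged sketch that stamps a trigger patch on the non-victim pixel closest to the victim-class region when creating a poisoned sample; the distance-transform placement and all names are a didactic reconstruction under stated assumptions (NumPy and SciPy available), not the paper's exact procedure.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def inject_trigger_near_victim(image, label_mask, victim_class, trigger):
    # image: (H, W, 3) float array; label_mask: (H, W) ints; trigger: (t, t, 3).
    # Distance from every pixel to the nearest victim-class pixel.
    dist = distance_transform_edt(label_mask != victim_class)
    dist[label_mask == victim_class] = np.inf  # trigger must avoid victim pixels
    y, x = np.unravel_index(np.argmin(dist), dist.shape)

    poisoned = image.copy()
    t = trigger.shape[0]
    h, w = image.shape[:2]
    poisoned[y:y + t, x:x + t] = trigger[:h - y, :w - x]  # clip at borders
    return poisoned
```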
Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery
Generalized category discovery (GCD) aims at addressing a more realistic and
challenging setting of semi-supervised learning, where only part of the
category labels are assigned to certain training samples. Previous methods
generally employ naive contrastive learning or unsupervised clustering schemes
for all samples. Nevertheless, they usually ignore the critical information
inherent in the historical predictions of the model being trained.
Specifically, we empirically reveal that a significant number of salient
unlabeled samples yield consistent historical predictions corresponding to
their ground truth category. From this observation, we propose a Memory
Consistency guided Divide-and-conquer Learning framework (MCDL). In this
framework, we introduce two memory banks to record historical predictions of
unlabeled data, which are exploited to measure the credibility of each sample
in terms of its prediction consistency. With the guidance of credibility, we
can design a divide-and-conquer learning strategy to fully utilize the
discriminative information of unlabeled data while alleviating the negative
influence of noisy labels. Extensive experimental results on multiple
benchmarks demonstrate the generality and superiority of our method, where our
method outperforms state-of-the-art models by a large margin on both seen and
unseen classes of the generic image recognition and challenging semantic shift
settings (i.e., with a +8.4% gain on CUB and +8.1% on Stanford Cars).
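The credibility measurement lends itself to a small sketch. Below is a minimal illustration of recording historical predictions per unlabeled sample and scoring credibility as prediction consistency; the window size and scoring rule are illustrative assumptions, not MCDL's exact design.

```python
from collections import Counter, deque

class PredictionMemory:
    def __init__(self, num_samples, window=10):
        # One bounded history per unlabeled sample.
        self.history = [deque(maxlen=window) for _ in range(num_samples)]

    def update(self, sample_idx, predicted_label):
        self.history[sample_idx].append(predicted_label)

    def credibility(self, sample_idx):
        # Fraction of recent predictions agreeing with the most common one.
        preds = self.history[sample_idx]
        if not preds:
            return 0.0
        most_common_count = Counter(preds).most_common(1)[0][1]
        return most_common_count / len(preds)
```

Samples whose credibility exceeds a threshold could then be treated as reliably pseudo-labeled, while the remainder are handled with noise-tolerant objectives, mirroring the divide-and-conquer idea.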
Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning
Semi-supervised learning is attracting growing attention due to its success
in leveraging unlabeled data. To mitigate potentially incorrect pseudo labels,
recent frameworks mostly set a fixed confidence threshold to discard uncertain
samples. This practice ensures high-quality pseudo labels, but incurs a
relatively low utilization of the whole unlabeled set. In this work, our key
insight is that these uncertain samples can be turned into certain ones, as
long as the confusion classes for the top-1 class are detected and removed.
Motivated by this, we propose a novel method dubbed ShrinkMatch to learn from
uncertain samples. For each uncertain sample, it adaptively seeks a shrunk
class space, which contains only the original top-1 class and the remaining
less likely classes. Since the confusion classes are removed from this
space, the re-calculated top-1 confidence can satisfy the pre-defined
threshold. We then impose a consistency regularization between a pair of
strongly and weakly augmented samples in the shrunk space to strive for
discriminative representations. Furthermore, considering the varied reliability
among uncertain samples and the gradually improved model during training, we
correspondingly design two reweighting principles for our uncertain loss. Our
method exhibits impressive performance on widely adopted benchmarks. Code is
available at https://github.com/LiheYoung/ShrinkMatch.
Comment: Accepted by ICCV 2023
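The shrinking step can be illustrated compactly. The sketch below, a minimal reconstruction under assumptions (a single-sample softmax vector and a fixed confidence threshold), removes the top-ranked confusion classes one by one until the re-normalized top-1 confidence passes the threshold, returning the indices of the shrunk class space.

```python
import torch

def shrink_class_space(probs, threshold=0.95):
    # probs: (num_classes,) softmax vector for one uncertain sample.
    sorted_probs, sorted_idx = probs.sort(descending=True)
    top1_prob, top1_idx = sorted_probs[0], sorted_idx[0]
    # Drop the confusion classes ranked 1..k-1, keeping the top-1 class plus
    # the unlikely tail, until the re-normalized confidence is high enough.
    for k in range(1, probs.numel() + 1):
        tail = sorted_probs[k:].sum()
        if top1_prob / (top1_prob + tail) >= threshold:
            return torch.cat([top1_idx.view(1), sorted_idx[k:]])
    return sorted_idx  # defensive fallback; k == num_classes always passes
```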