10 Security and Privacy Problems in Self-Supervised Learning
Self-supervised learning has achieved revolutionary progress in the past
several years and is commonly believed to be a promising approach for
general-purpose AI. In particular, self-supervised learning aims to pre-train
an encoder using a large amount of unlabeled data. The pre-trained encoder is
like an "operating system" of the AI ecosystem. Specifically, the encoder can
be used as a feature extractor for many downstream tasks with little or no
labeled training data. Existing studies on self-supervised learning have mainly focused on pre-training a better encoder to improve its performance on
downstream tasks in non-adversarial settings, leaving its security and privacy
in adversarial settings largely unexplored. A security or privacy issue of a
pre-trained encoder leads to a single point of failure for the AI ecosystem. In
this book chapter, we discuss 10 basic security and privacy problems for the
pre-trained encoders in self-supervised learning, including six confidentiality
problems, three integrity problems, and one availability problem. For each
problem, we discuss potential opportunities and challenges. We hope our book
chapter will inspire future research on the security and privacy of
self-supervised learning.
Comment: A book chapter
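To make the encoder's "feature extractor" role concrete, here is a minimal sketch, assuming a torchvision ResNet-18 as a stand-in for any pre-trained encoder: the encoder is frozen and only a small linear probe is trained on the downstream task. All names and shapes are illustrative.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Stand-in pre-trained encoder; any self-supervised encoder fits this pattern.
encoder = models.resnet18(weights="IMAGENET1K_V1")
encoder.fc = nn.Identity()           # drop the classification head, keep features
encoder.eval()
for p in encoder.parameters():       # freeze: downstream tasks train only the probe
    p.requires_grad = False

probe = nn.Linear(512, 10)           # tiny task-specific head (e.g., 10 classes)

x = torch.randn(8, 3, 224, 224)      # a batch of hypothetical downstream images
with torch.no_grad():
    feats = encoder(x)               # (8, 512) feature vectors
logits = probe(feats)                # only this layer needs labeled data to train
```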
Optimal covariance matrix estimation for high-dimensional noise in high-frequency data
In this paper, we consider efficiently learning the structural information
from the high-dimensional noise in high-frequency data via estimating its
covariance matrix with optimality. The problem is uniquely challenging due to
the latency of the targeted high-dimensional vector containing the noises, and
the practical reality that the observed data can be highly asynchronous -- not
all components of the high-dimensional vector are observed at the same time
points. To meet the challenges, we propose a new covariance matrix estimator
with appropriate localization and thresholding. In the setting with latency and
asynchronous observations, we establish the minimax optimal convergence rates
associated with two commonly used loss functions for the covariance matrix
estimation. As a major theoretical development, we show that despite the
latency of the signal in the high-frequency data, the optimal rates remain the
same as if the targeted high-dimensional noises are directly observable. Our
results indicate that the optimal rates reflect the impact of the asynchronous observations: they are slower than the rates attainable with synchronous observations. Furthermore, we demonstrate that the proposed localized estimator
with thresholding achieves the minimax optimal convergence rates. We also
illustrate the empirical performance of the proposed estimator with extensive
simulation studies and a real data analysis.
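As a rough illustration of the thresholding ingredient alone (not the paper's estimator, which adds localization to handle latency and asynchronicity), the toy sketch below hard-thresholds the off-diagonal entries of a sample covariance matrix computed from synchronous, fully observed data.

```python
import numpy as np

def thresholded_cov(X, tau):
    """X: (n, p) observations; tau: threshold level.
    Off-diagonal entries of the sample covariance smaller than tau are zeroed."""
    S = np.cov(X, rowvar=False)                          # p x p sample covariance
    keep = (np.abs(S) >= tau) | np.eye(X.shape[1], dtype=bool)
    return S * keep

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))                       # toy synchronous data
tau = np.sqrt(np.log(50) / 500)                          # classic rate-driven level
Sigma_hat = thresholded_cov(X, tau)
```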
StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning
Pre-trained encoders are general-purpose feature extractors that can be used
for many downstream tasks. Recent progress in self-supervised learning can
pre-train highly effective encoders using a large volume of unlabeled data,
leading to the emerging encoder as a service (EaaS). A pre-trained encoder may
be deemed confidential because its training requires large amounts of data and computational resources, and because its public release may facilitate misuse of AI, e.g., for deepfake generation. In this paper, we propose the first attack
called StolenEncoder to steal pre-trained image encoders. We evaluate
StolenEncoder on multiple target encoders pre-trained by ourselves and three
real-world target encoders including the ImageNet encoder pre-trained by
Google, CLIP encoder pre-trained by OpenAI, and Clarifai's General Embedding
encoder deployed as a paid EaaS. Our results show that our stolen encoders have
similar functionality to the target encoders. In particular, the downstream
classifiers built upon a target encoder and a stolen one have similar accuracy.
Moreover, stealing a target encoder using StolenEncoder requires much less data
and computation resources than pre-training it from scratch. We also explore
three defenses that perturb feature vectors produced by a target encoder. Our
results show these defenses are not enough to mitigate StolenEncoder.
Comment: To appear in ACM Conference on Computer and Communications Security (CCS), 202
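The sketch below illustrates the general query-and-distill pattern behind encoder stealing rather than StolenEncoder's exact objective: the attacker queries the target for feature vectors on unlabeled images and trains a surrogate to match them. The models, loss, and step count are illustrative stand-ins for a black-box EaaS target.

```python
import torch
import torch.nn as nn
import torchvision.models as models

target_encoder = models.resnet18(weights="IMAGENET1K_V1")  # stand-in for the EaaS API
target_encoder.fc = nn.Identity()
target_encoder.eval()

surrogate = models.resnet18(weights=None)                  # attacker's local copy
surrogate.fc = nn.Identity()
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-4)

for _ in range(10):                                        # a few illustrative steps
    x = torch.randn(16, 3, 224, 224)                       # attacker's unlabeled queries
    with torch.no_grad():
        target_feats = target_encoder(x)                   # features the service returns
    loss = 1 - nn.functional.cosine_similarity(surrogate(x), target_feats).mean()
    opt.zero_grad(); loss.backward(); opt.step()           # pull surrogate toward target
```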
Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration
Owing to differences in viewing range, resolution, and relative position, the multi-modality sensing module composed of infrared and visible cameras needs to be registered so as to achieve more accurate scene perception. In practice, manual
calibration-based registration is the most widely used process, and it is
regularly calibrated to maintain accuracy, which is time-consuming and
labor-intensive. To cope with these problems, we propose a scene-adaptive
infrared and visible image registration method. Specifically, to address the discrepancy between multi-modality images, an invertible translation process is
developed to establish a modality-invariant domain, which comprehensively
embraces the feature intensity and distribution of both infrared and visible
modalities. We employ homography to simulate the deformation between different
planes and develop a hierarchical framework to rectify the deformation inferred
from the proposed latent representation in a coarse-to-fine manner. Within this framework, an advanced perception ability coupled with residual estimation is conducive to regressing the sparse offsets, and an alternate correlation search facilitates more accurate correspondence matching. Moreover, we propose the
first misaligned infrared and visible image dataset with available ground truth,
involving three synthetic sets and one real-world set. Extensive experiments
validate the effectiveness of the proposed method against state-of-the-art approaches, advancing the subsequent applications.
Comment: 10 pages, 11 figures
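As a small illustration of the homography deformation model mentioned above (not the paper's estimation network), the sketch below turns illustrative four-corner offsets into a homography and warps a stand-in image with OpenCV.

```python
import cv2
import numpy as np

h, w = 256, 256
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
offsets = np.float32([[5, -3], [-4, 6], [2, 2], [-6, -5]])    # illustrative sparse offsets
H = cv2.getPerspectiveTransform(corners, corners + offsets)   # 3x3 homography

infrared = np.random.rand(h, w).astype(np.float32)            # stand-in infrared image
warped = cv2.warpPerspective(infrared, H, (w, h))             # deformed view to register
```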
AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training
Monocular 3D object detection plays a pivotal role in the field of autonomous
driving, and numerous deep learning-based methods have made significant
breakthroughs in this area. Despite the advancements in detection accuracy and
efficiency, these models tend to fail when faced with such attacks, rendering
them ineffective. Therefore, bolstering the adversarial robustness of 3D
detection models has become a crucial issue that demands immediate attention
and innovative solutions. To mitigate this issue, we propose a depth-aware
robust adversarial training method for monocular 3D object detection, dubbed
DART3D. Specifically, we first design an adversarial attack that iteratively
degrades the 2D and 3D perception capabilities of 3D object detection
models (IDP), which serves as the foundation for our subsequent defense mechanism. In
response to this attack, we propose an uncertainty-based residual learning
method for adversarial training. Our adversarial training approach capitalizes
on the inherent uncertainty, enabling the model to significantly improve its
robustness against adversarial attacks. We conducted extensive experiments on
the KITTI 3D dataset, demonstrating that DART3D surpasses direct adversarial training (the most popular approach) under attacks in 3D object detection of the car category for the Easy, Moderate, and Hard settings, with improvements of 4.415%, 4.112%, and 3.195%, respectively.
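For context, the baseline DART3D is compared against, direct adversarial training, can be sketched generically as a PGD inner maximization followed by a training step on the perturbed batch; this is a minimal sketch of that baseline, not the paper's depth-aware, uncertainty-based method.

```python
import torch

def pgd_attack(model, x, y, loss_fn, eps=0.03, alpha=0.01, steps=5):
    """Iteratively perturb x within an L-inf ball of radius eps to maximize the loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(loss_fn(model(x_adv), y), x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
    return x_adv

def adv_train_step(model, opt, x, y, loss_fn):
    """One step of direct adversarial training: fit the model on perturbed inputs."""
    x_adv = pgd_attack(model, x, y, loss_fn)
    loss = loss_fn(model(x_adv), y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```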
Dual Adversarial Resilience for Collaborating Robust Underwater Image Enhancement and Perception
Due to the uneven scattering and absorption of different light wavelengths in
aquatic environments, underwater images suffer from low visibility and noticeable
color deviations. With the advancement of autonomous underwater vehicles,
extensive research has been conducted on learning-based underwater enhancement
algorithms. These works can generate visually pleasing enhanced images and
mitigate the adverse effects of degraded images on subsequent perception tasks.
However, learning-based methods are inherently fragile and susceptible to adversarial attacks, which can significantly disrupt their results. In this work,
we introduce a collaborative adversarial resilience network, dubbed CARNet, for
underwater image enhancement and subsequent detection tasks. Concretely, we
first introduce an invertible network with strong perturbation-perceptual
abilities to isolate attacks from underwater images, preventing interference
with image enhancement and perceptual tasks. Furthermore, we propose a
synchronized attack training strategy with both visual-driven and
perception-driven attacks, enabling the network to discern and remove various
types of attacks. Additionally, we incorporate an attack pattern discriminator
to heighten the robustness of the network against different attacks. Extensive
experiments demonstrate that the proposed method outputs visually appealing enhanced images and achieves, on average, 6.71% higher detection mAP than state-of-the-art methods.
Comment: 9 pages, 9 figures
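A rough sketch of the synchronized-attack idea, with hypothetical `enhancer` and `detector` modules: a single FGSM-style step maximizes a weighted sum of a visual-driven enhancement loss and a perception-driven detection loss. The weights and loss choices are illustrative assumptions, not CARNet's exact formulation.

```python
import torch

def synchronized_attack(enhancer, detector, x, clean_ref, labels, det_loss_fn,
                        eps=0.02, w_vis=1.0, w_per=1.0):
    """One FGSM step against both the enhancement output and the detector."""
    x_adv = x.clone().detach().requires_grad_(True)
    enhanced = enhancer(x_adv)
    vis_loss = torch.nn.functional.l1_loss(enhanced, clean_ref)   # visual-driven term
    per_loss = det_loss_fn(detector(enhanced), labels)            # perception-driven term
    loss = w_vis * vis_loss + w_per * per_loss
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x + eps * grad.sign()).detach()                       # worst-case input
```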
Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework
The detection of small infrared targets against blurred and cluttered
backgrounds has remained an enduring challenge. In recent years, learning-based
schemes have become the mainstream methodology to establish the mapping
directly. However, these methods are susceptible to the inherent complexities
of changing backgrounds and real-world disturbances, leading to unreliable and
compromised target estimations. In this work, we propose a bi-level adversarial
framework to promote the robustness of detection in the presence of distinct
corruptions. We first propose a bi-level optimization formulation to introduce
dynamic adversarial learning. Specifically, it is composed of the learnable generation of corruptions that maximize the losses as the lower-level objective, and the promotion of the detector's robustness as the upper-level one. We also
provide a hierarchical reinforcement learning strategy to discover the most
detrimental corruptions and balance the performance between robustness and
accuracy. To better disentangle the corruptions from salient features, we also
propose a spatial-frequency interaction network for target detection. Extensive
experiments demonstrate that our scheme remarkably improves IoU by 21.96% across a wide array of corruptions and notably by 4.97% on the general benchmark. The source code is available at https://github.com/LiuZhu-CV/BALISTD.
Comment: 9 pages, 6 figures
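The bi-level formulation can be caricatured as follows: the lower level searches a corruption pool for the variant that maximizes the detector's loss, and the upper level trains the detector on that worst case. The fixed pool and exhaustive search below are placeholders for the paper's learnable corruption generation and reinforcement learning strategy; inputs are assumed to be NCHW image batches.

```python
import torch

def gaussian_noise(x):
    return x + 0.1 * torch.randn_like(x)

def blur(x):
    return torch.nn.functional.avg_pool2d(x, 3, stride=1, padding=1)

CORRUPTIONS = [gaussian_noise, blur]                 # toy stand-in corruption pool

def bilevel_step(detector, opt, x, y, loss_fn):
    with torch.no_grad():                            # lower level: worst corruption
        worst = max(CORRUPTIONS, key=lambda c: loss_fn(detector(c(x)), y).item())
    loss = loss_fn(detector(worst(x)), y)            # upper level: robust training
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```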
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks
Data poisoning attacks and backdoor attacks aim to corrupt a machine learning
classifier via modifying, adding, and/or removing some carefully selected
training examples, such that the corrupted classifier makes incorrect
predictions as the attacker desires. The key idea of state-of-the-art certified
defenses against data poisoning attacks and backdoor attacks is to create a
majority vote mechanism to predict the label of a testing example, where each voter is a base classifier trained on a subset of the training dataset.
Classical simple learning algorithms such as k nearest neighbors (kNN) and
radius nearest neighbors (rNN) have intrinsic majority vote mechanisms. In this
work, we show that the intrinsic majority vote mechanisms in kNN and rNN
already provide certified robustness guarantees against data poisoning attacks
and backdoor attacks. Moreover, our evaluation results on MNIST and CIFAR10
show that the intrinsic certified robustness guarantees of kNN and rNN
outperform those provided by state-of-the-art certified defenses. Our results
serve as standard baselines for future certified defenses against data
poisoning attacks and backdoor attacks.
Comment: To appear in AAAI Conference on Artificial Intelligence, 202
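The intuition behind the certification can be sketched as follows: among the k nearest neighbors, if the top label beats the runner-up by g votes and each modified, added, or removed training example can swing at most one vote each way, the predicted label provably survives any poisoning of size at most floor((g - 1) / 2). The toy computation below illustrates this simplified argument; the paper's exact bound additionally handles ties and the rNN case.

```python
from collections import Counter

def certified_size(neighbor_labels):
    """Largest e such that the majority label survives e swung votes."""
    counts = Counter(neighbor_labels).most_common()
    top = counts[0][1]
    runner_up = counts[1][1] if len(counts) > 1 else 0
    return (top - runner_up - 1) // 2        # largest e with top - e > runner_up + e

labels_of_k_nearest = ["cat"] * 5 + ["dog"] * 2   # k = 7 votes
print(certified_size(labels_of_k_nearest))        # -> 1 poisoned example tolerated
```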