Robust Perception through Equivariance
Deep networks for computer vision are not reliable when they encounter
adversarial examples. In this paper, we introduce a framework that uses the
dense intrinsic constraints in natural images to robustify inference. By
introducing constraints at inference time, we can shift the burden of
robustness from training to the inference algorithm, thereby allowing the model
to adjust dynamically to each individual image's unique and potentially novel
characteristics at inference time. Among different constraints, we find that
equivariance-based constraints are most effective, because they allow dense
constraints in the feature space without overly constraining the representation
at a fine-grained level. Our theoretical results validate the importance of
having such dense constraints at inference time. Our empirical experiments show
that restoring feature equivariance at inference time defends against
worst-case adversarial perturbations. The method obtains improved adversarial
robustness on four datasets (ImageNet, Cityscapes, PASCAL VOC, and MS-COCO) on
image recognition, semantic segmentation, and instance segmentation tasks.
The project page is available at equi4robust.cs.columbia.edu.
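The test-time restoration idea above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the toy feature extractor, the circular-shift transformation, and the finite-difference optimizer below are all illustrative stand-ins, and names like `restore_equivariance` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(6, 6))  # stand-in for a frozen, pretrained feature backbone

def features(x):
    # Nonlinear toy feature extractor; not shift-equivariant in general.
    return np.tanh(M @ x)

def equi_loss(x, shift=1):
    # Equivariance violation: features of the shifted input should match
    # the shifted features of the original input.
    return float(np.sum((features(np.roll(x, shift))
                         - np.roll(features(x), shift)) ** 2))

def restore_equivariance(x, steps=300, lr=0.01, eps=1e-5):
    # Test-time optimization: adjust the input by gradient descent on the
    # equivariance violation (finite differences keep the sketch dependency-free).
    x = x.astype(float).copy()
    best, best_loss = x.copy(), equi_loss(x)
    for _ in range(steps):
        g = np.zeros_like(x)
        for i in range(x.size):
            xp, xm = x.copy(), x.copy()
            xp[i] += eps
            xm[i] -= eps
            g[i] = (equi_loss(xp) - equi_loss(xm)) / (2 * eps)
        x -= lr * g
        if equi_loss(x) < best_loss:
            best, best_loss = x.copy(), equi_loss(x)
    return best

x0 = rng.normal(size=6)
x1 = restore_equivariance(x0)  # equivariance violation should shrink
```

In the paper the constraint is imposed densely in the feature space of a real network; here a single equivariance residual on a toy map plays that role.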
Pareto Navigation Gradient Descent: a First-Order Algorithm for Optimization in Pareto Set
Many modern machine learning applications, such as multi-task learning,
require finding optimal model parameters to trade-off multiple objective
functions that may conflict with each other. The notion of the Pareto set
allows us to focus on the (often infinitely many) models that cannot be
strictly improved. However, it does not provide an actionable procedure for
picking one or a few special models to return to users in practice. In this
paper, we consider \emph{optimization in Pareto set (OPT-in-Pareto)}, the
problem of finding Pareto models that optimize an extra reference criterion
function within the Pareto set. This function can either encode a specific
preference from the users, or represent a generic diversity measure for
obtaining a set of diversified Pareto models that are representative of the
whole Pareto set. Unfortunately, despite being a highly useful framework,
efficient algorithms for OPT-in-Pareto have been largely missing, especially
for large-scale, non-convex, and non-linear objectives in deep learning. A
naive approach is to apply Riemannian manifold gradient descent on the Pareto
set, which yields a high computational cost due to the need for
eigen-calculation of Hessian matrices. We propose a first-order algorithm that
approximately solves OPT-in-Pareto using only gradient information, with both
high practical efficiency and theoretically guaranteed convergence property.
Empirically, we demonstrate that our method works efficiently for a variety of
challenging multi-task-related problems.
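A rough illustration of the common-descent ingredient behind such first-order methods (this is the classic two-task min-norm combination, not the paper's full OPT-in-Pareto algorithm): the min-norm convex combination of the task gradients has a closed form, and its norm vanishes exactly at Pareto-stationary points. The quadratic objectives below are hypothetical toys.

```python
import numpy as np

def mgda_direction(g1, g2):
    # Min-norm convex combination of two task gradients (closed form for
    # two tasks); its norm is zero exactly at Pareto-stationary points.
    diff = g1 - g2
    denom = float(diff @ diff)
    alpha = 0.5 if denom == 0.0 else float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return alpha * g1 + (1.0 - alpha) * g2

# Hypothetical toy objectives f1(x) = ||x - a||^2, f2(x) = ||x - b||^2;
# their Pareto set is the line segment between a and b.
a, b = np.array([0.0, 0.0]), np.array([4.0, 0.0])
grad_f1 = lambda x: 2.0 * (x - a)
grad_f2 = lambda x: 2.0 * (x - b)

x = np.array([2.0, 3.0])
for _ in range(200):
    d = mgda_direction(grad_f1(x), grad_f2(x))
    if np.linalg.norm(d) < 1e-8:  # (approximately) Pareto-stationary
        break
    x -= 0.1 * d
# x ends up on the Pareto segment between a and b
```

The paper's contribution goes further: once gradient information locates the Pareto set, an extra criterion gradient is balanced against the task gradients to navigate *within* it, still using only first-order information.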
Robust Machine Learning by Integrating Context
Intelligent software has the potential to transform our society and is becoming the building block for many real-world systems. However, despite the excellent performance of machine learning models on benchmarks, state-of-the-art methods like neural networks often fail once they encounter realistic settings. Because neural networks often learn correlations without reasoning over the right signals and knowledge, they fail when facing shifting distributions, unforeseen corruptions, and worst-case scenarios. Because they are black-box models, they are neither interpretable nor trusted by users. We need to build robust models before machine learning can be confidently and responsibly deployed in the most critical applications and systems.
In this dissertation, I introduce our advances in robust machine learning systems, achieved by tightly integrating context into algorithms. The context has two aspects: the intrinsic structure of natural data, and the extrinsic structure from domain knowledge. Both are crucial: by capitalizing on the intrinsic structure in natural data, my work has shown that we can create machine learning systems that are robust even in the worst case, an analytical result that also enjoys strong empirical gains.
Through integrating external knowledge, such as the association between tasks and causal structure, my framework can instruct models to use the right signals for inference, enabling new opportunities for controllable and interpretable models.
This thesis consists of three parts. In the first part, I cover three works that use the intrinsic structure of data as a constraint to achieve robust inference. I present our framework that performs test-time optimization to respect this natural constraint, which is captured by self-supervised tasks, and illustrate that test-time optimization improves out-of-distribution generalization and adversarial robustness. Beyond the inference algorithm, I show that capturing intrinsic structure through discrete representations also improves out-of-distribution robustness.
In the second part of the thesis, I detail my work using external domain knowledge. I first show how causal structure from external domain knowledge improves domain generalization robustness. I then show how associating multiple tasks and regularization objectives helps robustness.
In the final part of this dissertation, I present three works on trustworthy and reliable foundation models: general-purpose models that will serve as the basis for many AI applications. I show a framework that uses context to secure, interpret, and control foundation models.
Boosting Adversarial Attacks on Neural Networks with Better Optimizer
Convolutional neural networks have outperformed humans in image recognition
tasks, but they remain vulnerable to attacks from adversarial examples. Since
these data are crafted by adding imperceptible noise to normal images, their
existence poses potential security threats to deep learning systems.
Sophisticated adversarial examples with strong attack performance can also be
used as a tool to evaluate the robustness of a model. However, the success rate
of adversarial attacks in black-box environments still leaves room for improvement.
Therefore, this study combines a modified Adam gradient descent algorithm with
the iterative gradient-based attack method. The proposed Adam Iterative Fast
Gradient Method is then used to improve the transferability of adversarial
examples. Extensive experiments on ImageNet showed that the proposed method
offers a higher attack success rate than existing iterative methods. By
extending our method, we achieved a state-of-the-art attack success rate of
95.0% on defense models.
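The idea of swapping an Adam-style update into an iterative gradient attack can be sketched as follows. The toy linear classifier, the hyperparameters, and the function names below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(x, w, y):
    # Logistic loss of a toy linear classifier and its gradient w.r.t. the input.
    s = w @ x
    loss = float(np.log1p(np.exp(-y * s)))
    grad = -y * sigmoid(-y * s) * w
    return loss, grad

def adam_ifgm(x0, w, y, eps=0.3, steps=20, lr=0.05,
              beta1=0.9, beta2=0.999, delta=1e-8):
    # Adam-flavoured iterative fast gradient method (sketch): ascend the loss
    # with bias-corrected moment estimates, projecting each iterate back into
    # the L-infinity ball of radius eps around the clean input.
    x = x0.astype(float).copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        _, g = loss_and_grad(x, w, y)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x = x + lr * m_hat / (np.sqrt(v_hat) + delta)  # gradient *ascent*
        x = x0 + np.clip(x - x0, -eps, eps)            # L-infinity projection
    return x

w = np.array([1.0, -2.0, 3.0])   # hypothetical classifier weights
x0 = np.array([1.0, 0.0, 1.0])   # clean input with true label y = 1
x_adv = adam_ifgm(x0, w, y=1)    # loss at x_adv should exceed the clean loss
```

The moment estimates smooth the update direction across iterations, which is the intuition behind the improved transferability the abstract reports; the bound on the perturbation is enforced by the final clipping step.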
Towards Improving Robustness Against Common Corruptions in Object Detectors Using Adversarial Contrastive Learning
Neural networks have revolutionized various domains, exhibiting remarkable
accuracy in tasks like natural language processing and computer vision.
However, their vulnerability to slight alterations in input samples poses
challenges, particularly in safety-critical applications like autonomous
driving. Current approaches, such as introducing distortions during training,
fall short in addressing unforeseen corruptions. This paper proposes an
innovative adversarial contrastive learning framework to enhance neural network
robustness simultaneously against adversarial attacks and common corruptions.
By generating instance-wise adversarial examples and optimizing a contrastive
loss, our method fosters representations that resist adversarial perturbations
and remain robust in real-world scenarios: the contrastive objective
strengthens the similarity between clean samples and their adversarial
counterparts, yielding representations resistant to both adversarial attacks
and common distortions. By focusing on improving performance under adversarial
and real-world conditions, our approach aims to bolster the robustness of
neural networks in safety-critical applications, such as autonomous vehicles
navigating unpredictable weather conditions. We anticipate that this framework
will contribute to advancing the reliability of neural networks in challenging
environments, facilitating their widespread adoption in mission-critical
scenarios.
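The clean-versus-adversarial contrastive idea can be sketched in a few lines. The linear encoder, the single FGSM step, and the numerical gradient below are illustrative stand-ins for the paper's actual networks and attack, assumed only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 8))  # toy linear encoder, a stand-in for a backbone

def embed(X):
    # Encode and L2-normalize, as is standard in contrastive learning.
    Z = X @ W.T
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

def nt_xent(Za, Zb, tau=0.5):
    # InfoNCE: the i-th clean embedding should match its own i-th
    # (adversarial) view against all other views in the batch.
    logits = Za @ Zb.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(logp)))

def fgsm_views(X, eps=0.05, h=1e-5):
    # Instance-wise adversarial views: one FGSM step that increases the
    # contrastive loss (forward differences keep the sketch dependency-free).
    base = nt_xent(embed(X), embed(X))
    G = np.zeros_like(X)
    for idx in np.ndindex(X.shape):
        Xp = X.copy()
        Xp[idx] += h
        G[idx] = (nt_xent(embed(X), embed(Xp)) - base) / h
    return X + eps * np.sign(G)

X = rng.normal(size=(5, 8))   # a small batch of "clean" samples
X_adv = fgsm_views(X)         # harder views for the contrastive objective
```

Training would then minimize `nt_xent(embed(X), embed(X_adv))` over the encoder's parameters, pulling each clean sample toward its adversarial counterpart, which is the mechanism the abstract describes for gaining robustness to both attacks and common corruptions.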