Motivating the Rules of the Game for Adversarial Example Research
Advances in machine learning have led to broad deployment of systems with
impressive performance on important problems. Nonetheless, these systems can be
induced to make errors on data that are surprisingly similar to examples the
learned system handles correctly. The existence of these errors raises a
variety of questions about out-of-sample generalization and whether bad actors
might use such examples to abuse deployed systems. As a result of these
security concerns, there has been a flurry of recent papers proposing
algorithms to defend against such malicious perturbations of correctly handled
examples. It is unclear how such misclassifications represent a different kind
of security problem than other errors, or even other attacker-produced examples
that have no specific relationship to an uncorrupted input. In this paper, we
argue that adversarial example defense papers have, to date, mostly considered
abstract, toy games that do not relate to any specific security concern.
Furthermore, defense papers have not yet precisely described all the abilities
and limitations of attackers that would be relevant in practical security.
Towards this end, we establish a taxonomy of motivations, constraints, and
abilities for more plausible adversaries. Finally, we provide a series of
recommendations outlining a path forward for future work to more clearly
articulate the threat model and perform more meaningful evaluation.
Adversarial Attacks against Face Recognition: A Comprehensive Study
Face recognition (FR) systems have demonstrated outstanding verification
performance, suggesting suitability for real-world applications ranging from
photo tagging in social media to automated border control (ABC). In an advanced FR system built on a deep learning architecture, however, high recognition accuracy alone is not sufficient; the system should also withstand the kinds of attacks designed to degrade its performance. Recent
studies show that (deep) FR systems exhibit an intriguing vulnerability to
imperceptible or perceptible but natural-looking adversarial input images that
drive the model to incorrect output predictions. In this article, we present a
comprehensive survey on adversarial attacks against FR systems and elaborate on
the effectiveness of new countermeasures against them. Further, we propose a taxonomy of existing attack and defense methods based on different criteria. We compare attack methods by their orientation and attributes, and defense approaches by category. Finally, we explore open challenges and potential research directions.
Explainable Black-Box Attacks Against Model-based Authentication
Establishing unique identities for both humans and end systems has been an
active research problem in the security community, giving rise to innovative
machine learning-based authentication techniques. Although such techniques
offer an automated method to establish identity, they have not been vetted
against sophisticated attacks that target their core machine learning
technique. This paper demonstrates that mimicking the unique signatures
generated by host fingerprinting and biometric authentication systems is
possible. We expose the ineffectiveness of underlying machine learning
classification models by constructing a blind attack based around the query
synthesis framework and utilizing Explainable-AI (XAI) techniques. We launch an
attack in under 130 queries on a state-of-the-art face authentication system,
and under 100 queries on a host authentication system. We examine how these
attacks can be defended against and explore their limitations. XAI provides an
effective means for adversaries to infer decision boundaries and provides a new
way forward in constructing attacks against systems using machine learning
models for authentication.
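A minimal sketch of the general query-synthesis idea (not the paper's exact procedure): label a small number of probes with the black-box authenticator, fit an interpretable surrogate, and read its coefficients as the inferred decision boundary. The stand-in target model, feature dimension, and query budget below are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for the black-box authenticator: only label queries are available.
# (Illustrative target; a real system would be a remote verification API.)
true_w = rng.normal(size=16)
def query_target(x):
    return int(x @ true_w > 0)              # 1 = accept, 0 = reject

# Query-synthesis loop: label probes, then fit an explainable surrogate.
X, y = [], []
for _ in range(100):                        # keep the query budget small
    x = rng.normal(size=16)
    X.append(x)
    y.append(query_target(x))

surrogate = LogisticRegression().fit(np.array(X), np.array(y))
direction = surrogate.coef_.ravel()         # inferred decision-boundary normal

# Craft a sample that the surrogate scores as confidently accepted; with enough
# probes this direction aligns with the target's own boundary.
forged = 3.0 * direction / np.linalg.norm(direction)
print("target accepts forged sample:", bool(query_target(forged)))
```

In practice the probes would be synthesized near the current boundary estimate rather than drawn at random, which is what keeps the reported query counts low.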
Adversarial Examples: Attacks and Defenses for Deep Learning
With rapid progress and significant successes in a wide spectrum of
applications, deep learning is being applied in many safety-critical
environments. However, deep neural networks have been recently found vulnerable
to well-designed input samples, called adversarial examples. Adversarial
examples are imperceptible to humans but can easily fool deep neural networks at the testing/deployment stage. The vulnerability to adversarial examples has become one of the major risks of applying deep neural networks in safety-critical environments; as a result, attacks and defenses concerning adversarial examples have drawn great attention. In this paper, we review recent findings on adversarial
examples for deep neural networks, summarize the methods for generating
adversarial examples, and propose a taxonomy of these methods. Under the
taxonomy, applications for adversarial examples are investigated. We further
elaborate on countermeasures for adversarial examples and explore the
challenges and the potential solutions.
Comment: Github: https://github.com/chbrian/awesome-adversarial-examples-d
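As one concrete instance of the generation methods such surveys catalogue, here is a minimal fast gradient sign method (FGSM) sketch in PyTorch; the toy model, random inputs, and epsilon value are placeholders, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """One-step FGSM: perturb x along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()         # shift each pixel by +/- eps
    return x_adv.clamp(0, 1).detach()       # keep the image in a valid range

# Illustrative usage with a toy classifier and a random "image" batch.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())              # perturbation bounded by eps
```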
A Survey on Resilient Machine Learning
Machine learning-based systems are increasingly being used for sensitive tasks such as security surveillance, guiding autonomous vehicles, making investment decisions, and detecting and blocking network intrusions and malware. However, recent research has shown that machine learning models are vulnerable to attacks by adversaries at all phases of machine learning (e.g., training data collection, training, and operation). All classes of machine learning models can be misled by carefully crafted inputs into classifying them incorrectly. Maliciously created input samples can affect the learning process of an ML system by slowing down learning, degrading the performance of the learned model, or causing the system to make errors only in scenarios planned by the attacker. Because of these developments, understanding the security of machine learning algorithms and systems is emerging as an important research area among computer security and machine learning researchers and practitioners. We present a survey of this emerging area in machine learning.
Defending against substitute model black box adversarial attacks with the 01 loss
Substitute model black box attacks can create adversarial examples for a
target model just by accessing its output labels. This poses a major challenge
to machine learning models in practice, particularly in security sensitive
applications. The 01 loss model is known to be more robust to outliers and
noise than convex models that are typically used in practice. Motivated by
these properties, we present 01 loss linear and 01 loss dual layer neural
network models as a defense against transfer based substitute model black box
attacks. We compare the accuracy of adversarial examples from substitute model
black box attacks targeting our 01 loss models and their convex counterparts
for binary classification on popular image benchmarks. Our 01 loss dual layer
neural network has an adversarial accuracy of 66.2%, 58%, 60.5%, and 57% on
MNIST, CIFAR10, STL10, and ImageNet, respectively, whereas the sigmoid-activated
logistic loss counterpart has accuracies of 63.5%, 19.3%, 14.9%, and 27.6%.
Except for MNIST the convex counterparts have substantially lower adversarial
accuracies. We show practical applications of our models to deter traffic sign
and facial recognition adversarial attacks. On GTSRB street sign and CelebA
facial detection, our 01 loss network has 34.6% and 37.1% adversarial accuracy, respectively, whereas the convex logistic counterpart has accuracies of 24% and 1.9%. Finally, we show that our 01 loss network can attain robustness on par with simple convolutional neural networks and much higher than its convex counterpart even when attacked with a convolutional network substitute model. Our work shows that 01 loss models offer a powerful defense against substitute model black box attacks.
Comment: arXiv admin note: substantial text overlap with arXiv:2006.07800; text overlap with arXiv:2008.0914
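The contrast the abstract draws between the 01 loss and a convex surrogate can be made concrete with a small sketch; the synthetic data and fixed weight vector below are illustrative, not the authors' specialized training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

def zero_one_loss(w, X, y):
    """Counts misclassifications: each point contributes at most 1."""
    return int(np.sum(np.sign(X @ w) != y))

def logistic_loss(w, X, y):
    """Convex surrogate: a single distant outlier can dominate the sum."""
    return float(np.sum(np.log1p(np.exp(-y * (X @ w)))))

# Two separable clusters plus one far-away, mislabeled outlier.
X = np.vstack([rng.normal(+2, 1, (50, 2)),
               rng.normal(-2, 1, (50, 2)),
               [[30.0, 30.0]]])
y = np.concatenate([np.ones(50), -np.ones(50), [-1.0]])

w = np.array([1.0, 1.0])                         # separates the two clusters
print("01 loss:", zero_one_loss(w, X, y))        # the outlier costs exactly 1
print("logistic loss:", logistic_loss(w, X, y))  # outlier alone adds about 60
```

Because the 01 loss is bounded per point, a single extreme input cannot drag its optimum the way it can for a convex loss, which is one intuition behind the robustness claims above.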
A General Framework for Adversarial Examples with Objectives
Images perturbed subtly to be misclassified by neural networks, called
adversarial examples, have emerged as a technically deep challenge and an
important concern for several application domains. Most research on adversarial
examples takes as its only constraint that the perturbed images are similar to
the originals. However, real-world application of these ideas often requires
the examples to satisfy additional objectives, which are typically enforced
through custom modifications of the perturbation process. In this paper, we
propose adversarial generative nets (AGNs), a general methodology to train a
generator neural network to emit adversarial examples satisfying desired
objectives. We demonstrate the ability of AGNs to accommodate a wide range of
objectives, including imprecise ones difficult to model, in two application
domains. In particular, we demonstrate physical adversarial examples---eyeglass
frames designed to fool face recognition---with better robustness,
inconspicuousness, and scalability than previous approaches, as well as a new
attack to fool a handwritten-digit classifier.
Comment: Accepted for publication at ACM TOP
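A minimal sketch of the AGN training idea described above: a generator is trained so that its output, added to benign inputs, pushes a frozen classifier toward an attacker-chosen class while a second objective keeps the perturbation small. The layer sizes, the perturbation bound, and the simple L1 "inconspicuousness" term are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # frozen target model
for p in classifier.parameters():
    p.requires_grad_(False)

generator = nn.Sequential(nn.Linear(32, 28 * 28), nn.Tanh())      # emits perturbations
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

x = torch.rand(16, 1, 28, 28)                       # benign inputs (stand-in images)
target = torch.full((16,), 3, dtype=torch.long)     # class the attacker wants predicted

for step in range(200):
    z = torch.randn(16, 32)
    delta = 0.1 * generator(z).view_as(x)           # bounded perturbation
    logits = classifier(x + delta)
    fool_loss = F.cross_entropy(logits, target)     # objective 1: fool the classifier
    stealth_loss = delta.abs().mean()               # objective 2: stay inconspicuous
    loss = fool_loss + 0.5 * stealth_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```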
Defending Model Inversion and Membership Inference Attacks via Prediction Purification
Neural networks are susceptible to data inference attacks such as the model
inversion attack and the membership inference attack, where the attacker could
infer the reconstruction and the membership of a data sample from the
confidence scores predicted by the target classifier. In this paper, we propose a unified approach, the purification framework, to defend against data inference attacks. It purifies the confidence score vectors predicted by the target
classifier by reducing their dispersion. The purifier can be further
specialized in defending a particular attack via adversarial learning. We
evaluate our approach on benchmark datasets and classifiers. We show that when
the purifier is dedicated to one attack, it naturally defends against the other,
which empirically demonstrates the connection between the two attacks. The
purifier can effectively defend both attacks. For example, it can reduce the
membership inference accuracy by up to 15% and increase the model inversion
error by a factor of up to 4. Besides, it incurs less than 0.4% classification
accuracy drop and less than 5.5% distortion to the confidence scores.
Comment: updated experiments and result
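The core operation, reducing the dispersion of released confidence vectors, can be illustrated with a simple stand-in; the paper trains a dedicated purifier network, whereas the coarse rounding below is only a hypothetical substitute that shows the effect.

```python
import numpy as np

def purify(probs, decimals=1):
    """Coarsen a confidence vector so it carries less sample-specific detail,
    then renormalize. A stand-in for a learned purifier."""
    coarse = np.clip(np.round(probs, decimals), 1e-6, None)
    return coarse / coarse.sum(axis=-1, keepdims=True)

# Two inputs the classifier scores in slightly different ways; after
# purification the released vectors are nearly indistinguishable, which
# starves model inversion and membership inference attacks of signal.
p1 = np.array([0.71, 0.12, 0.09, 0.08])
p2 = np.array([0.68, 0.14, 0.11, 0.07])
print(purify(p1))
print(purify(p2))   # both now release the same coarse vector
```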
Accurate and Robust Neural Networks for Security Related Applications Exampled by Face Morphing Attacks
Artificial neural networks tend to learn only what they need for a task. A
manipulation of the training data can counter this phenomenon. In this paper,
we study the effect of different alterations of the training data, which limit
the amount and position of information that is available for the decision
making. We analyze the accuracy of networks trained on these different training data modifications and their robustness against semantic and black box attacks, using morphing attacks as a particular example. A morphing attack is an attack on a biometric facial recognition system in which the system is fooled into matching two different individuals with the same synthetic face image. Such a synthetic image can be created by aligning and blending images of the two individuals that should be matched with it.
Comment: 16 pages, 7 figure
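In its simplest form, the construction described above reduces to an alpha blend of landmark-aligned face images. The sketch below assumes the two images are already aligned and uses a plain pixel-wise blend, leaving out the warping step a real morphing pipeline would include.

```python
import numpy as np

def morph(face_a, face_b, alpha=0.5):
    """Pixel-wise blend of two already-aligned face images with values in [0, 1].
    Real morphing attacks first warp both faces to shared facial landmarks."""
    return alpha * face_a + (1.0 - alpha) * face_b

# Illustrative use with random stand-in images of a typical FR input size.
a = np.random.rand(112, 112, 3)
b = np.random.rand(112, 112, 3)
morphed = morph(a, b)   # submitted so the FR system matches both subjects
```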
Adversarial Attacks and Defences: A Survey
Deep learning has emerged as a strong and efficient framework that can be
applied to a broad spectrum of complex learning problems which were difficult
to solve using the traditional machine learning techniques in the past. In the
last few years, deep learning has advanced radically in such a way that it can
surpass human-level performance on a number of tasks. As a consequence, deep
learning is being extensively used in most of the recent day-to-day
applications. However, deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye but can lead the model to misclassify its output. In recent times, adversaries with different threat models have leveraged these vulnerabilities to compromise deep learning systems in settings where the incentives are high. Hence, it is extremely important to make deep learning algorithms robust against these adversaries. However, there are only a few strong countermeasures that can be used in all types of attack scenarios to design a robust deep learning system. In this paper, we attempt to provide a detailed discussion of different types of adversarial attacks with various threat models and also elaborate on the effectiveness and challenges of recent countermeasures against them.