On the Effectiveness of Defensive Distillation
We report experimental results indicating that defensive distillation
successfully mitigates adversarial samples crafted using the fast gradient sign
method, in addition to those crafted using the Jacobian-based iterative attack
on which the defense mechanism was originally evaluated.
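As a concrete illustration of the perturbations these defenses are evaluated against, below is a minimal fast gradient sign method (FGSM) sketch in PyTorch. The `model`, the [0, 1] input range, and the `epsilon` value are illustrative assumptions, not the paper's exact setup.

```python
# Minimal FGSM sketch (PyTorch); model, input range, and epsilon are assumptions.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.1):
    """Craft adversarial samples by stepping along the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    x_adv = x_adv + epsilon * x_adv.grad.sign()   # one signed gradient step
    return x_adv.clamp(0.0, 1.0).detach()         # keep pixels in the valid range
```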
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
Deep learning algorithms have been shown to perform extremely well on many
classical machine learning problems. However, recent studies have shown that
deep learning, like other machine learning techniques, is vulnerable to
adversarial samples: inputs crafted to force a deep neural network (DNN) to
provide adversary-selected outputs. Such attacks can seriously undermine the
security of the system supported by the DNN, sometimes with devastating
consequences. For example, autonomous vehicles can be crashed, illicit or
illegal content can bypass content filters, or biometric authentication systems
can be manipulated to allow improper access. In this work, we introduce a
defensive mechanism called defensive distillation to reduce the effectiveness
of adversarial samples on DNNs. We analytically investigate the
generalizability and robustness properties granted by the use of defensive
distillation when training DNNs. We also empirically study the effectiveness of
our defense mechanisms on two DNNs placed in adversarial settings. The study
shows that defensive distillation can reduce the effectiveness of adversarial sample creation
from 95% to less than 0.5% on a studied DNN. Such dramatic gains can be
explained by the fact that distillation leads gradients used in adversarial
sample creation to be reduced by a factor of 10^30. We also find that
distillation increases the average minimum number of features that need to be
modified to create adversarial samples by about 800% on one of the DNNs we
tested.
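Defensive distillation, as described above, trains a second network on the temperature-softened class probabilities produced by an initially trained network. A minimal sketch of one training step is shown below, assuming PyTorch models `teacher` and `student` and an illustrative temperature; the original formulation trains with cross-entropy on the soft labels, and the KL divergence used here yields the same gradients up to a constant.

```python
# Hedged sketch of a defensive-distillation training step (PyTorch).
# `teacher`, `student`, `optimizer`, and the temperature T are assumptions.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, x, optimizer, T=20.0):
    with torch.no_grad():
        soft_labels = F.softmax(teacher(x) / T, dim=1)   # softened teacher probabilities
    log_probs = F.log_softmax(student(x) / T, dim=1)     # student trained at the same T
    loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At test time the distilled network is used at temperature 1, which is what flattens the gradients an attacker relies on.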
Building Robust Deep Neural Networks for Road Sign Detection
Deep Neural Networks are built with generalization beyond the training set in mind, using techniques such as regularization, early stopping, and dropout. But considerations for making them more resilient to adversarial examples are rarely taken into account. As deep neural networks become more prevalent in mission-critical and real-time systems, miscreants have started to attack them by intentionally making a deep neural network misclassify an object of one type as another. This can be catastrophic in scenarios where the classification made by a deep neural network leads to a fatal decision by a machine. In this work, we used the GTSRB dataset to craft adversarial samples with the Fast Gradient Sign Method and the Jacobian-based Saliency Map method, used those crafted adversarial samples to attack another Deep Convolutional Neural Network, and made the attacked network more resilient to adversarial attacks through Defensive Distillation and Adversarial Training.
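Adversarial Training, mentioned alongside Defensive Distillation above, augments each training batch with freshly crafted adversarial samples. The following is a hedged sketch of such a loop using FGSM perturbations in PyTorch; the even clean/adversarial mix, `epsilon`, and data loader are assumptions rather than the paper's configuration.

```python
# Illustrative adversarial-training epoch (PyTorch); names and ratios are assumptions.
import torch
import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer, epsilon=0.03):
    model.train()
    for x, y in loader:
        # Craft FGSM samples against the current parameters.
        x_adv = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x_adv), y).backward()
        x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

        # Train on an even mix of clean and adversarial samples.
        loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```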
Enhanced Attacks on Defensively Distilled Deep Neural Networks
Deep neural networks (DNNs) have achieved tremendous success in many machine learning tasks, such as image classification. Unfortunately, researchers have shown that DNNs are easily attacked by adversarial examples: slightly perturbed images which can mislead DNNs into giving incorrect classification results. Such attacks have seriously hampered the deployment of DNN systems in areas with strict security or safety requirements, such as autonomous cars, face recognition, and malware detection. Defensive distillation is a mechanism aimed at training a robust DNN that significantly reduces the effectiveness of adversarial example generation. However, while the state-of-the-art attack succeeds on distilled networks with 100% probability, it is a white-box attack that needs to know the inner information of the DNN, whereas the black-box scenario is more general. In this paper, we first propose the epsilon-neighborhood attack, which can fool defensively distilled networks with a 100% success rate in the white-box setting and quickly generates adversarial examples with good visual quality. On the basis of this attack, we further propose the region-based attack against defensively distilled DNNs in the black-box setting, and we also perform a bypass attack to indirectly break the distillation defense as a complementary method. The experimental results show that our black-box attacks achieve a considerable success rate on defensively distilled networks.
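The epsilon-neighborhood, region-based, and bypass attacks are not specified in detail in this abstract. Purely to illustrate the setting of searching for adversarial examples within an L-infinity epsilon-neighborhood of an input, a generic projected-gradient loop (white-box, and not the paper's attack) could look like this:

```python
# Generic L-infinity neighborhood search (PGD-style); NOT the paper's attack.
# model, epsilon, step size, and iteration count are illustrative assumptions.
import torch
import torch.nn.functional as F

def linf_neighborhood_attack(model, x, y, epsilon=0.03, step=0.007, iters=40):
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + step * grad.sign()                    # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project into the ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```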
A Survey on Resilient Machine Learning
Machine learning based systems are increasingly being used for sensitive tasks such as security surveillance, guiding autonomous vehicles, making investment decisions, and detecting and blocking network intrusions and malware. However, recent research has shown that machine learning models are vulnerable to attacks by adversaries at all phases of machine learning (e.g., training data collection, training, operation). All model classes of machine learning systems can be misled into wrongly classifying inputs by providing carefully crafted inputs. Maliciously created input samples can affect the learning process of an ML system by slowing down learning, degrading the performance of the learned model, or causing the system to make errors only in the attacker's planned scenarios. Because of these developments, understanding the security of machine learning algorithms and systems is emerging as an important research area among computer security and machine learning researchers and practitioners. We present a survey of this emerging area in machine learning.
Adversarial Examples: Opportunities and Challenges
Deep neural networks (DNNs) have shown huge superiority over humans in image
recognition, speech processing, autonomous vehicles and medical diagnosis.
However, recent studies indicate that DNNs are vulnerable to adversarial
examples (AEs), which are designed by attackers to fool deep learning models.
Unlike real examples, AEs can mislead the model into predicting incorrect outputs while being hardly distinguishable by human eyes, thereby threatening security-critical deep-learning applications. In recent years, the generation of and defense against AEs have become a research hotspot in the field of artificial intelligence (AI) security. This article reviews the latest research progress on AEs. First, we introduce the concept, causes, characteristics, and evaluation metrics of AEs, then survey state-of-the-art AE generation methods and discuss their advantages and disadvantages. After that, we review existing defenses and discuss their limitations. Finally, we discuss future research opportunities and challenges on AEs.
Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
Image compression-based approaches for defending against adversarial-example attacks, which threaten the safe use of deep neural networks (DNNs), have been investigated recently. However, prior works mainly rely on directly tuning parameters such as the compression rate to blindly reduce image features, and therefore cannot guarantee both defense efficiency (i.e., accuracy on polluted images) and classification accuracy on benign images after the defense is applied. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely "feature distillation", to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly improves defense efficiency with marginal accuracy reduction using a two-step method: first, we maximize the filtering of malicious adversarial input perturbations by developing defensive quantization in the frequency domain of JPEG compression/decompression, guided by a semi-analytical method; second, we suppress the distortion of benign features to restore classification accuracy through a DNN-oriented quantization refinement process. Our experimental results show that the proposed "feature distillation" can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in
three aspects, including defense efficiency (improve classification accuracy
from to on adversarial examples), accuracy of benign
images after defense ( accuracy degradation), and processing time per
image ( Speedup). Moreover, our solution can also provide the
best defense efficiency ( accuracy) against the recent adaptive
attack with least accuracy reduction () on benign images when compared
with other input-transformation based defense methods. (CVPR 2019)
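For contrast with the DNN-oriented quantization proposed above, a plain input-transformation baseline simply re-encodes each image through lossy JPEG before classification. The sketch below uses Pillow; the quality setting is an illustrative assumption, and this is not the paper's "feature distillation" quantization.

```python
# Plain JPEG re-encoding as an input-transformation baseline (Pillow).
# This is NOT the paper's defensive quantization; quality=50 is an assumption.
import io
from PIL import Image

def jpeg_squeeze(image: Image.Image, quality: int = 50) -> Image.Image:
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)  # lossy round trip
    buf.seek(0)
    return Image.open(buf).copy()
```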
Extending Defensive Distillation
Machine learning is vulnerable to adversarial examples: inputs carefully
modified to force misclassification. Designing defenses against such inputs
remains largely an open problem. In this work, we revisit defensive
distillation---which is one of the mechanisms proposed to mitigate adversarial
examples---to address its limitations. We view our results not only as an
effective way of addressing some of the recently discovered attacks but also as
reinforcing the importance of improved training techniques.
A Black-box Attack on Neural Networks Based on Swarm Evolutionary Algorithm
Neural networks play an increasingly important role in the field of machine learning and are included in many applications in society. Unfortunately, neural networks suffer from adversarial samples generated to attack them. However, most generation approaches either assume that the attacker has full knowledge of the neural network model or are limited by the type of attacked model. In this paper, we propose a new approach that generates a black-box attack against neural networks based on a swarm evolutionary algorithm. Benefiting from improvements in the technology and theoretical properties of evolutionary algorithms, our approach offers effectiveness, black-box operation, generality, and randomness. Our experimental results show that both MNIST and CIFAR-10 images can be perturbed to successfully generate a black-box attack with 100% probability on average. In addition, the proposed attack is resistant to defensive distillation, succeeding on distilled neural networks with almost 100% probability. The experimental results also indicate that the robustness of the artificial intelligence algorithm is related to the complexity of the model and the dataset. In addition, we find that the adversarial samples to some extent reproduce the characteristics of the sample data learned by the neural network model.
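A toy illustration of a population-based black-box attack in the spirit of, but not identical to, the swarm evolutionary approach described above: candidate perturbations are scored only through the model's output probabilities, and the fittest members are kept and mutated. The `predict_proba` callable, query budget, and mutation scheme are assumptions.

```python
# Toy population-based black-box attack sketch (NumPy); all parameters are assumptions.
import numpy as np

def evolve_attack(predict_proba, x, true_label, eps=0.1,
                  pop_size=20, generations=100, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-eps, eps, size=(pop_size,) + x.shape)        # initial perturbations
    for _ in range(generations):
        candidates = np.clip(x + pop, 0.0, 1.0)
        probs = predict_proba(candidates)                           # black-box queries only
        fitness = 1.0 - probs[:, true_label]                        # push mass off the true class
        order = np.argsort(-fitness)
        if probs[order[0]].argmax() != true_label:
            return candidates[order[0]]                             # misclassified: success
        survivors = pop[order[: pop_size // 2]]                     # keep the fittest half
        mutants = survivors + rng.normal(0.0, eps / 5, size=survivors.shape)
        pop = np.clip(np.concatenate([survivors, mutants]), -eps, eps)
    return None                                                     # budget exhausted
```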
Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Classification Models
Recently, researchers have proposed using deep learning-based systems for
malware detection. Unfortunately, all deep learning classification systems are
vulnerable to adversarial attacks. Previous work has studied adversarial
attacks against static analysis-based malware classifiers which only classify
the content of the unknown file without execution. However, since the majority
of malware is either packed or encrypted, malware classification based on
static analysis often fails to detect these types of files. To overcome this
limitation, anti-malware companies typically perform dynamic analysis by
emulating each file in the anti-malware engine or performing in-depth scanning
in a virtual machine. These strategies allow the analysis of the malware after
unpacking or decryption. In this work, we study different strategies of
crafting adversarial samples for dynamic analysis. These strategies operate on
sparse, binary inputs in contrast to continuous inputs such as pixels in
images. We then study the effects of two previously proposed defensive mechanisms against crafted adversarial samples, the distillation and ensemble defenses, and we also propose and evaluate the weight decay defense.
Experiments show that with these three defensive strategies, the number of
successfully crafted adversarial samples is reduced compared to a standard
baseline system without any defenses. In particular, the ensemble defense is
the most resilient to adversarial attacks. Importantly, none of the defenses
significantly reduce the classification accuracy for detecting malware.
Finally, we demonstrate that while adding additional hidden layers to neural
models does not significantly improve the malware classification accuracy, it
does significantly increase the classifier's robustness to adversarial attacks.
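The weight decay defense evaluated above penalizes large weights during training, which in most deep learning frameworks is a one-line change to the optimizer. A minimal PyTorch sketch with an illustrative coefficient and a hypothetical classifier:

```python
# Weight decay (L2 regularization) as a training-time defense knob (PyTorch).
# The architecture and the 1e-4 coefficient are illustrative assumptions.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 2),      # hypothetical benign/malware classifier head
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```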