4 research outputs found
AdvGAN++: Harnessing latent layers for adversary generation
Adversarial examples are fabricated inputs, indistinguishable from the original images, that mislead neural networks and drastically lower their performance. The recently proposed AdvGAN, a GAN-based approach, takes the input image as a prior for generating adversaries against a target model. In this work, we show how latent features can serve as better priors than input images for adversary generation by proposing AdvGAN++, a version of AdvGAN that achieves higher attack success rates than AdvGAN while generating perceptually realistic images on the MNIST and CIFAR-10 datasets.
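For a concrete picture of the architectural change, the sketch below (PyTorch) contrasts the two priors: instead of feeding the raw image to the perturbation generator as in AdvGAN, the generator consumes an intermediate feature vector extracted from the target model. The module names, layer sizes, and loss handling are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TargetNet(nn.Module):
    """Stand-in classifier whose intermediate activations serve as the prior."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())          # -> 16 * 4 * 4 = 256-d latent
        self.classifier = nn.Linear(256, 10)

    def forward(self, x):
        z = self.features(x)
        return self.classifier(z), z

class Generator(nn.Module):
    """Maps the target model's latent features directly to an adversarial image."""
    def __init__(self, latent_dim=256, out_shape=(1, 28, 28)):
        super().__init__()
        self.out_shape = out_shape
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, out_shape[0] * out_shape[1] * out_shape[2]), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, *self.out_shape)

target, gen = TargetNet(), Generator()
x = torch.rand(8, 1, 28, 28)              # a batch of benign MNIST-sized images
_, z = target(x)                          # latent features act as the prior ...
x_adv = gen(z.detach())                   # ... from which the adversary is generated
logits_adv, _ = target(x_adv)             # fed back to compute the attack/GAN losses
```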
Adversarial Camouflage for Node Injection Attack on Graphs
Node injection attacks against Graph Neural Networks (GNNs) have received growing attention as a practical attack scenario, in which the attacker injects malicious nodes, rather than modifying existing node features or edges, to degrade the performance of GNNs. Despite the initial success of node injection attacks, we find that the nodes injected by existing methods are easily distinguished from the original normal nodes by defense methods, which limits their attack performance in practice. To address this issue, we devote ourselves to the camouflage of node injection attacks, i.e., camouflaging the injected malicious nodes (both structure and attributes) as normal ones that appear legitimate and imperceptible to defense methods. The non-Euclidean nature of graph data and the lack of human priors bring great challenges to the formalization, implementation, and evaluation of camouflage on graphs. In this paper, we first propose and formulate the camouflage of injected nodes in terms of both the fidelity and the diversity of the ego networks centered around injected nodes. We then design an adversarial CAmouflage framework for Node injection Attack, namely CANA, to improve camouflage while preserving attack performance. Several novel indicators for graph camouflage are further designed for a comprehensive evaluation. Experimental results demonstrate that when existing node injection attack methods are equipped with our proposed CANA framework, both the attack performance against defense methods and the node camouflage are significantly improved.
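The abstract does not spell out the training objective, but the general shape of an adversarial camouflage term can be sketched as follows: a critic tries to distinguish ego-network embeddings of injected nodes from those of normal nodes, and the attacker adds a loss that rewards fooling this critic alongside the usual injection objective. All networks, dimensions, and weights below are assumptions for exposition, not the CANA implementation.

```python
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))  # camouflage critic
bce = nn.BCEWithLogitsLoss()

def camouflage_loss(h_injected):
    """Attacker-side term: injected ego-network embeddings should look 'normal' to the critic."""
    return bce(critic(h_injected), torch.ones(h_injected.size(0), 1))

# In an alternating scheme, the critic itself would be trained to separate
# normal ego-network embeddings (label 1) from injected ones (label 0).
h_injected = torch.randn(5, 64, requires_grad=True)   # embeddings of injected nodes' ego nets
attack_loss = torch.tensor(0.0)                        # placeholder for the usual injection objective
total = attack_loss + 0.5 * camouflage_loss(h_injected)
total.backward()                                        # gradients flow back to the injected attributes
```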
Towards Robust Deep Neural Networks
Deep neural networks (DNNs) enable state-of-the-art performance for most machine learning tasks. Unfortunately, they are vulnerable to attacks, such as Trojans during training and Adversarial Examples at test time. Adversarial Examples are inputs with carefully crafted perturbations added to benign samples. In the Computer Vision domain, although the perturbations are imperceptible to humans, Adversarial Examples can successfully misguide or fool DNNs. Meanwhile, Trojan or backdoor attacks involve attackers tampering with the training process, for example by injecting poisoned training data, to embed a backdoor into the network that can be activated during model deployment when the Trojan triggers (known only to the attackers) appear in the model's inputs. This dissertation investigates methods of building robust DNNs against these training-time and test-time threats.
Recognising the threat of Adversarial Examples in the malware domain, this research considers the problem of realising a robust DNN-based malware detector against Adversarial Example attacks by developing a Bayesian adversarial learning algorithm. In contrast to vision tasks, adversarial learning is hard in a domain without a differentiable or invertible mapping function from the problem space (such as software code inputs) to the feature space. The study proposes an alternative: performing adversarial learning in the feature space and proving that the projection of perturbed yet valid malware from the problem space into the feature space is a subset of feature-space adversarial attacks. The Bayesian approach improves benign performance, provably bounds the difference between adversarial risk and empirical risk, and improves robustness against increasingly large attack budgets not employed during training.
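A minimal sketch of the feature-space idea, assuming a PyTorch detector over fixed-length malware feature vectors: adversarial examples are crafted directly on the (differentiable) feature representation rather than on binaries, and the model is trained on both benign and perturbed features. The Bayesian treatment, feature dimensionality, and one-step attack are stand-ins, not the dissertation's algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# 2381-d static feature vectors (an EMBER-like size, assumed) and a dropout layer
# as a crude stand-in for weight uncertainty.
detector = nn.Sequential(nn.Linear(2381, 512), nn.ReLU(), nn.Dropout(0.5),
                         nn.Linear(512, 2))
opt = torch.optim.Adam(detector.parameters(), lr=1e-4)

def feature_space_attack(x, y, eps=0.1):
    """One-step gradient attack on feature vectors; real attacks must keep features valid."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(detector(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

x = torch.rand(32, 2381)                   # dummy benign/malware feature vectors
y = torch.randint(0, 2, (32,))
x_adv = feature_space_attack(x, y)         # adversaries crafted in feature space
opt.zero_grad()
loss = F.cross_entropy(detector(x), y) + F.cross_entropy(detector(x_adv), y)
loss.backward()
opt.step()
```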
To improve the robustness of DNNs against Adversarial Examples, carefully crafted perturbations added to inputs, in the Computer Vision domain, the research considers the problem of developing a Bayesian learning algorithm that realises a robust DNN against such attacks. Accordingly, a novel Bayesian learning method is designed that conceptualises an information gain objective to measure, and force to be similar, the information learned from both benign and Adversarial Examples. The method proves that minimising this information gain objective further tightens the bound on the difference between adversarial risk and empirical risk, moving towards a basis for a principled method of adversarially training Bayesian Neural Networks (BNNs).
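One way to picture such an objective, under the assumption of an MC-dropout approximation to the posterior and a KL-based measure of agreement (neither of which is necessarily what the thesis uses), is a regulariser that pushes the predictive distributions on benign and adversarial inputs together:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def predictive(model, x, samples=8):
    """Average the softmax over stochastic forward passes (MC-dropout 'posterior')."""
    probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(samples)])
    return probs.mean(0)

def agreement_loss(model, x_benign, x_adv):
    """Penalise disagreement between predictions on benign vs. adversarial inputs."""
    p = predictive(model, x_benign)
    q = predictive(model, x_adv)
    return F.kl_div(q.log(), p, reduction="batchmean")   # KL(p || q)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Dropout(0.3), nn.Linear(32, 3))
reg = agreement_loss(model, torch.rand(4, 10), torch.rand(4, 10))  # added to the training loss
```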
Recognising the threat from backdoor or Trojan attacks against DNNs, the research considers the problem of finding a robust defence method that is effective against Trojan attacks. The research explores a new idea in the domain, sanitisation of inputs, and proposes Februus to neutralise highly potent and insidious Trojan attacks on DNN systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted into a deep neural network model using a secret trigger, a Trojan, applied to any input to alter the model's decision to a target prediction, a target determined by and known only to the attacker. Februus sanitises the incoming input by surgically removing the potential trigger artifacts and restoring the input for the classification task. Februus enables effective Trojan mitigation by sanitising inputs with no loss of performance for sanitised inputs, trojaned or benign. This method is highly effective at defending against advanced Trojan attack variants as well as challenging, adaptive attacks where attackers have full knowledge of the defence method.
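A much-simplified sketch of the two-stage sanitisation idea: localise the most suspicious image region and remove it, then restore the blanked area before classification. The published system uses considerably more capable components for both stages; the input-gradient saliency and mean-fill below are stand-ins chosen only to keep the example self-contained and runnable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sanitise(model, x, patch=6):
    """Blank out the most salient square region of each image, then crudely restore it."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits.max(dim=1).values.sum().backward()        # saliency w.r.t. the predicted class
    sal = x.grad.abs().sum(dim=1, keepdim=True)      # (B, 1, H, W) saliency map
    heat = F.avg_pool2d(sal, patch, stride=1)        # score every candidate patch location
    cleaned = x.detach().clone()
    for i in range(x.size(0)):
        idx = heat[i, 0].flatten().argmax()
        r, c = int(idx // heat.size(3)), int(idx % heat.size(3))
        cleaned[i, :, r:r + patch, c:c + patch] = cleaned[i].mean()  # remove + naive restore
    return cleaned

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
x_sanitised = sanitise(model, torch.rand(2, 3, 32, 32))   # fed to the classifier afterwards
```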
Investigating the connections between Trojan attacks and spatially constrained Adversarial Examples, or so-called Adversarial Patches, in the input space, the research exposes an emerging threat: an attack exploiting the vulnerability of a DNN to generate naturalistic adversarial patches as universal triggers. For the first time, a method based on Generative Adversarial Networks is developed to exploit a GAN's latent space to search for universal naturalistic adversarial patches. The proposed attack's advantage is its ability to exert a high level of control, enabling attackers to craft naturalistic adversarial patches that are highly effective, robust against state-of-the-art DNNs, and deployable in the physical world, without needing to interfere with the model building process or risking discovery. Until now, this has only been demonstrably possible using Trojan attack methods.

Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202