4 research outputs found

    AdvGAN++ : Harnessing latent layers for adversary generation

    Adversarial examples are fabricated inputs, indistinguishable from the original image, that mislead neural networks and drastically lower their performance. The recently proposed AdvGAN, a GAN-based approach, takes the input image as a prior for generating adversaries that target a model. In this work, we show how latent features can serve as better priors than input images for adversary generation by proposing AdvGAN++, a version of AdvGAN that achieves higher attack success rates than AdvGAN while generating perceptually realistic images on the MNIST and CIFAR-10 datasets.
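
    A minimal, hypothetical sketch of the AdvGAN++ idea described above: instead of conditioning the generator on the raw input image (as AdvGAN does), condition it on the target model's latent features. The layer sizes, the eps bound, and the target_model.features hook are illustrative assumptions, not the paper's actual architecture.

        # Illustrative sketch only; layer sizes and the feature hook are assumptions.
        import torch
        import torch.nn as nn

        class LatentAdvGenerator(nn.Module):
            """Maps the target model's latent features to a bounded image-space perturbation."""
            def __init__(self, feat_dim=256, img_channels=1, img_size=28, eps=0.3):
                super().__init__()
                self.eps = eps
                self.out_shape = (img_channels, img_size, img_size)
                self.net = nn.Sequential(
                    nn.Linear(feat_dim, 512), nn.ReLU(),
                    nn.Linear(512, img_channels * img_size * img_size), nn.Tanh(),
                )

            def forward(self, latent_feats):
                delta = self.net(latent_feats).view(-1, *self.out_shape)
                return self.eps * delta  # perturbation bounded by eps via tanh

        def generate_adversary(target_model, generator, x):
            feats = target_model.features(x)   # assumed hook exposing latent features
            return torch.clamp(x + generator(feats), 0.0, 1.0)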

    Adversarial Camouflage for Node Injection Attack on Graphs

    Node injection attacks against Graph Neural Networks (GNNs) have received emerging attention as a practical attack scenario, where the attacker injects malicious nodes, instead of modifying node features or edges, to degrade the performance of GNNs. Despite the initial success of node injection attacks, we find that the nodes injected by existing methods are easily distinguished from the original normal nodes by defense methods, limiting their attack performance in practice. To solve this issue, we focus on the camouflage node injection attack, i.e., camouflaging injected malicious nodes (structure/attributes) as normal ones that appear legitimate/imperceptible to defense methods. The non-Euclidean nature of graph data and the lack of human priors bring great challenges to the formalization, implementation, and evaluation of camouflage on graphs. In this paper, we first propose and formulate the camouflage of injected nodes in terms of both the fidelity and the diversity of the ego networks centered around injected nodes. Then, we design an adversarial CAmouflage framework for Node injection Attack, namely CANA, to improve camouflage while ensuring attack performance. Several novel indicators for graph camouflage are further designed for a comprehensive evaluation. Experimental results demonstrate that when existing node injection attack methods are equipped with our proposed CANA framework, both the attack performance against defense methods and the node camouflage are significantly improved.
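
    A hedged sketch of a camouflaged node-injection step, assuming a GNN model that maps (node features X, dense adjacency A) to per-node logits. The fidelity term below, which pulls the injected node's attributes towards its attachment neighbourhood, is an illustrative stand-in for CANA's ego-network fidelity/diversity objectives, not the paper's actual loss.

        # Illustrative sketch; the GNN interface, relaxed edges and fidelity term are assumptions.
        import torch
        import torch.nn.functional as F

        def inject_camouflaged_node(model, X, A, target_idx, true_label,
                                    steps=100, lr=0.01, lam=1.0):
            n, d = X.shape
            # Learnable attributes and relaxed (continuous) edge weights for the injected node.
            x_new = X.mean(dim=0, keepdim=True).detach().clone().requires_grad_(True)
            edge_logits = torch.zeros(n, requires_grad=True)
            opt = torch.optim.Adam([x_new, edge_logits], lr=lr)

            for _ in range(steps):
                w = torch.sigmoid(edge_logits)           # soft edges to existing nodes
                X_aug = torch.cat([X, x_new], dim=0)
                A_aug = torch.zeros(n + 1, n + 1)
                A_aug[:n, :n] = A
                A_aug[n, :n] = w
                A_aug[:n, n] = w
                logits = model(X_aug, A_aug)
                # Attack objective: push the target node away from its true label.
                misclassify = -F.cross_entropy(logits[target_idx:target_idx + 1],
                                               torch.tensor([true_label]))
                # Camouflage (fidelity): injected attributes should resemble the neighbourhood
                # the node attaches to, so defences cannot single it out.
                neigh_mean = (w @ X).unsqueeze(0) / (w.sum() + 1e-8)
                fidelity = F.mse_loss(x_new, neigh_mean)
                loss = misclassify + lam * fidelity
                opt.zero_grad()
                loss.backward()
                opt.step()
            return x_new.detach(), torch.sigmoid(edge_logits).detach()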

    Towards Robust Deep Neural Networks

    Deep neural networks (DNNs) enable state-of-the-art performance for most machine learning tasks. Unfortunately, they are vulnerable to attacks such as Trojans during training and Adversarial Examples at test time. Adversarial Examples are inputs with carefully crafted perturbations added to benign samples. In the Computer Vision domain, while the perturbations are imperceptible to humans, Adversarial Examples can successfully misguide or fool DNNs. Meanwhile, Trojan or backdoor attacks involve attackers tampering with the training process, for example by injecting poisoned training data to embed a backdoor into the network that can be activated during model deployment when Trojan triggers (known only to the attackers) appear in the model's inputs. This dissertation investigates methods of building robust DNNs against these training-time and test-time threats.

    Recognising the threat of Adversarial Examples in the malware domain, this research considers the problem of realising a robust DNN-based malware detector against Adversarial Example attacks by developing a Bayesian adversarial learning algorithm. In contrast to vision tasks, adversarial learning is hard in a domain without a differentiable or invertible mapping function from the problem space (such as software code inputs) to the feature space. The study proposes an alternative: performing adversarial learning in the feature space and proving that the projection of perturbed yet valid malware from the problem space into the feature space is a subset of feature-space adversarial attacks. The Bayesian approach improves benign performance, provably bounds the difference between adversarial risk and empirical risk, and improves robustness against attack budgets larger than those employed during training.

    To improve the robustness of DNNs against Adversarial Examples in the Computer Vision domain, the research then considers the problem of developing a Bayesian learning algorithm that realises a robust DNN. Accordingly, a novel Bayesian learning method is designed that conceptualises an information gain objective to measure and force the information learned from benign and Adversarial Examples to be similar. The method proves that minimising this information gain objective further tightens the bound on the difference between adversarial risk and empirical risk, moving towards a principled method for adversarially training Bayesian neural networks.

    Recognising the threat from backdoor or Trojan attacks against DNNs, the research considers the problem of finding a robust defence method that is effective against Trojan attacks. The research explores a new idea in this domain, sanitisation of inputs, and proposes Februus to neutralise highly potent and insidious Trojan attacks on DNN systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted into a deep neural network model using a secret trigger, a Trojan, applied to any input to alter the model's decision to a target prediction determined by and only known to the attacker. Februus sanitises the incoming input by surgically removing the potential trigger artifacts and restoring the input for the classification task. Februus enables effective Trojan mitigation with no loss of performance for sanitised inputs, trojaned or benign. This method is highly effective at defending against advanced Trojan attack variants as well as challenging, adaptive attacks where attackers have full knowledge of the defence method.

    Investigating the connections between Trojan attacks and spatially constrained Adversarial Examples, so-called Adversarial Patches, in the input space, the research exposes an emerging threat: an attack exploiting the vulnerability of a DNN to generate naturalistic adversarial patches as universal triggers. For the first time, a method based on Generative Adversarial Networks is developed that exploits a GAN's latent space to search for universal naturalistic adversarial patches. The proposed attack's advantage is its ability to exert a high level of control, enabling attackers to craft naturalistic adversarial patches that are highly effective, robust against state-of-the-art DNNs, and deployable in the physical world without needing to interfere with the model-building process or risk discovery. Until now, this has only been demonstrably possible using Trojan attack methods.

    Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
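
    The naturalistic-patch attack summarised above lends itself to a compact illustration: optimise a latent code of a pretrained GAN so that the generated patch, pasted onto arbitrary inputs, steers a classifier towards an attacker-chosen class. The sketch below is an assumption-laden outline; generator, classifier, the latent dimensionality, and the fixed corner placement are placeholders rather than the dissertation's actual pipeline.

        # Illustrative sketch; generator/classifier interfaces and placement are assumptions.
        import itertools
        import torch
        import torch.nn.functional as F

        def search_universal_patch(generator, classifier, data_loader, target_class,
                                   latent_dim=128, patch_size=32, steps=200, lr=0.05):
            """Optimise a GAN latent code so the generated patch acts as a universal trigger."""
            z = torch.randn(1, latent_dim, requires_grad=True)
            opt = torch.optim.Adam([z], lr=lr)

            for _, (images, _) in zip(range(steps), itertools.cycle(data_loader)):
                patch = F.interpolate(generator(z), size=(patch_size, patch_size))
                patched = images.clone()
                patched[:, :, :patch_size, :patch_size] = patch   # fixed corner placement
                logits = classifier(patched)
                target = torch.full((patched.size(0),), target_class, dtype=torch.long)
                loss = F.cross_entropy(logits, target)            # universal targeted objective
                opt.zero_grad()
                loss.backward()
                opt.step()
            return generator(z).detach()   # naturalistic patch drawn from the GAN manifold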