
    Global Adversarial Attacks for Assessing Deep Learning Robustness

    It has been shown that deep neural networks (DNNs) may be vulnerable to adversarial attacks, raising concerns about their robustness, particularly for safety-critical applications. Recognizing the local nature and limitations of existing adversarial attacks, we present a new type of global adversarial attack for assessing global DNN robustness. More specifically, we propose the novel concept of global adversarial example pairs, in which each pair consists of two examples that are close to each other but are assigned different class labels by the DNN. We further propose two families of global attack methods and show that they are able to generate diverse and intriguing adversarial example pairs at locations far from the training or testing data. Moreover, we demonstrate that DNNs hardened using strong projected gradient descent (PGD) based (local) adversarial training remain vulnerable to the proposed global adversarial example pairs, suggesting that global robustness must be considered when training robust deep learning networks. Comment: Submitted to NeurIPS 201
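    The abstract does not detail the attack algorithm, so the following is only a rough sketch of the pair concept under assumed choices (a PyTorch model, Adam optimization, a squared-distance penalty, and hypothetical target classes), not the authors' method: two points are optimized jointly so that they stay close while the model assigns them different labels.

```python
# Rough, hypothetical sketch of searching for a "global adversarial example pair":
# two nearby inputs that a trained model labels differently. The model, target
# classes, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn.functional as F

def find_adversarial_pair(model, x_init, class_a, class_b,
                          dist_weight=1.0, steps=500, lr=0.05):
    """Jointly optimize two points so they stay close together while the
    model predicts a different class for each of them."""
    x1 = x_init.clone().detach().requires_grad_(True)
    x2 = (x_init + 0.01 * torch.randn_like(x_init)).detach().requires_grad_(True)
    opt = torch.optim.Adam([x1, x2], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Push each point toward its own target class...
        cls_loss = (F.cross_entropy(model(x1), class_a)
                    + F.cross_entropy(model(x2), class_b))
        # ...while penalizing the distance between the two points.
        dist_loss = dist_weight * (x1 - x2).pow(2).sum()
        (cls_loss + dist_loss).backward()
        opt.step()
    return x1.detach(), x2.detach()

# Usage (hypothetical): x1, x2 = find_adversarial_pair(
#     model, torch.randn(1, 3, 32, 32), torch.tensor([0]), torch.tensor([1]))
```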

    Towards Efficient and Reliable Deep Neural Networks

    Deep neural networks have achieved state-of-the-art performance on various machine learning tasks across domains such as computer vision, natural language processing, bioinformatics, and speech processing. Despite this success, their excessive computational and memory requirements limit their practical usability for real-time applications or on resource-limited devices. Neural network quantization has become increasingly popular because of the efficient memory consumption and faster computation enabled by bit-wise operations on quantized networks; the objective is to learn a network while restricting the parameters (and activations) to take values from a small discrete set. Another important aspect of modern neural networks is the adversarial vulnerability and reliability of their predictions. In addition to obtaining accurate predictions, it is critical in many real-world decision-making applications to accurately quantify the predictive uncertainty of deep neural networks. Calibrating neural networks is of utmost importance when employing them in safety-critical applications where downstream decision-making depends on the predicted probabilities. Furthermore, modern machine vision algorithms have been shown to be extremely susceptible to small and almost imperceptible perturbations of their inputs. To this end, we tackle these fundamental challenges in modern neural networks, focusing on their efficiency and reliability.

    Neural network quantization is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. By interpreting the continuous (unconstrained) parameters as the dual of the quantized ones, we first introduce a Mirror Descent (MD) framework for NN quantization. Specifically, we provide conditions on the projections (i.e., the mappings from continuous to quantized parameters) that enable us to derive valid mirror maps and, in turn, the respective MD updates. Furthermore, we present a numerically stable implementation of MD that requires storing an additional set of auxiliary (unconstrained) variables, and show that it is strikingly analogous to the straight-through estimator (STE) based method, which is typically viewed as a "trick" to avoid the vanishing gradient issue. Our experiments on multiple computer vision classification datasets with multiple network architectures demonstrate that our MD variants yield state-of-the-art performance.

    Even though quantized networks exhibit excellent generalization capabilities, their robustness properties are not well understood. We therefore next systematically study the robustness of quantized networks against gradient-based adversarial attacks and demonstrate that these quantized models suffer from gradient vanishing and exhibit a false sense of robustness. Attributing the gradient vanishing to poor forward-backward signal propagation in the trained network, we introduce a simple temperature scaling approach to mitigate this issue while preserving the decision boundary. Experiments on multiple image classification datasets with multiple network architectures demonstrate that our temperature-scaled attacks obtain near-perfect success rates on quantized networks.

    Finally, we introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test, in which the main idea is to compare the respective cumulative probability distributions. From this, by approximating the empirical cumulative distribution with a differentiable spline function, we obtain a recalibration function that maps the network outputs to actual (calibrated) class-assignment probabilities. We tested our method against existing calibration approaches on various image classification datasets, and our spline-based recalibration approach consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
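    As an illustration of the binning-free idea (comparing cumulative distributions rather than histogram bins), the sketch below computes a KS-style calibration error from top-class confidences and correctness indicators. The function name and exact normalization are assumptions for illustration, not the thesis implementation; the spline-based recalibration step is only indicated in the comments.

```python
# Hypothetical sketch of a binning-free, KS-style calibration error:
# compare the cumulative distribution of predicted confidences with the
# cumulative distribution of actual correctness.
import numpy as np

def ks_calibration_error(confidences, correct):
    """confidences: (N,) top-class probabilities; correct: (N,) 0/1 indicators."""
    order = np.argsort(confidences)
    conf_sorted = confidences[order]
    correct_sorted = correct[order].astype(float)
    n = len(confidences)
    # Cumulative predicted probability mass vs. cumulative fraction correct.
    cum_conf = np.cumsum(conf_sorted) / n
    cum_acc = np.cumsum(correct_sorted) / n
    # KS statistic: largest absolute gap between the two cumulative curves.
    # A recalibration map could be obtained by fitting a differentiable
    # spline to cum_acc as a function of cum_conf (not shown here).
    return np.max(np.abs(cum_conf - cum_acc))
```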

    Functionality-preserving adversarial machine learning for robust classification in cybersecurity and intrusion detection domains: A survey

    Machine learning has become widely adopted as a strategy for dealing with a variety of cybersecurity issues, ranging from insider threat detection to intrusion and malware detection. However, by their very nature, machine learning systems can introduce vulnerabilities into a security defence, whereby a learnt model is unaware of so-called adversarial examples that may intentionally cause misclassification and therefore bypass the system. Adversarial machine learning has been a research topic for over a decade and is now an accepted but open problem. Much of the early research on adversarial examples addressed issues related to computer vision, yet as machine learning continues to be adopted in other domains, it is likewise important to assess the potential vulnerabilities that may occur. A key part of transferring to new domains is functionality preservation: any crafted attack must still execute the originally intended functionality when inspected by a human and/or a machine. In this literature survey, our main objective is to address the domain of adversarial machine learning attacks and examine the robustness of machine learning models in the cybersecurity and intrusion detection domains. We identify the key trends in current work observed in the literature and explore how these relate to the research challenges that remain open for future work.

    Inclusion criteria were: articles related to functionality preservation in adversarial machine learning for cybersecurity or intrusion detection with insight into robust classification. Generally, we excluded works that are not yet peer-reviewed; however, we included some significant papers that make a clear contribution to the domain. There is a risk of subjective bias in the selection of non-peer-reviewed articles; this was mitigated by co-author review. We selected the following databases with a sizeable computer science element to search and retrieve literature: IEEE Xplore, ACM Digital Library, ScienceDirect, Scopus, SpringerLink, and Google Scholar. The literature search was conducted up to January 2022. We have striven to ensure comprehensive coverage of the domain to the best of our knowledge, performing systematic searches of the literature, noting our search terms and results, and following up on all materials that appear relevant and fit within the topic domains of this review. This research was funded by the Partnership PhD scheme at the University of the West of England in collaboration with Techmodal Ltd.

    Testing Robustness Against Unforeseen Adversaries

    Most existing adversarial defenses only measure robustness to L_p adversarial attacks. Not only are adversaries unlikely to restrict themselves to small L_p perturbations, they are also unlikely to remain fixed: adversaries adapt and evolve their attacks, so adversarial defenses must be robust to a broad range of unforeseen attacks. We address this discrepancy between research and reality by proposing a new evaluation framework called ImageNet-UA. Our framework enables the research community to test ImageNet model robustness against attacks not encountered during training. To create ImageNet-UA's diverse attack suite, we introduce a total of four novel adversarial attacks. We also demonstrate that, in comparison to ImageNet-UA, prevailing L_inf robustness assessments give a narrow account of model robustness. By evaluating current defenses with ImageNet-UA, we find that they provide little robustness to unforeseen attacks. We hope the greater variety and realism of ImageNet-UA enables the development of more robust defenses that can generalize beyond attacks seen during training.
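    The abstract describes an evaluation framework rather than a specific algorithm; the sketch below shows, under assumptions, how per-attack and worst-case (union) accuracy over a suite of heterogeneous attacks might be measured. The attack callables and data loader are placeholders, not the ImageNet-UA API.

```python
# Hypothetical sketch of evaluating a model against a suite of diverse,
# non-L_p attacks, in the spirit of an "unforeseen adversaries" benchmark.
# The attack functions and data loader are placeholders.
import torch

def worst_case_accuracy(model, data_loader, attacks, device="cpu"):
    """Report per-attack accuracy and worst-case (union) accuracy."""
    model.eval().to(device)
    per_attack_correct = {name: 0 for name in attacks}
    union_correct, total = 0, 0
    for x, y in data_loader:
        x, y = x.to(device), y.to(device)
        survived_all = torch.ones_like(y, dtype=torch.bool)
        for name, attack in attacks.items():
            x_adv = attack(model, x, y)       # each attack returns perturbed inputs
            pred = model(x_adv).argmax(dim=1)
            correct = pred.eq(y)
            per_attack_correct[name] += correct.sum().item()
            survived_all &= correct           # robust only if it survives every attack
        union_correct += survived_all.sum().item()
        total += y.numel()
    per_attack = {k: v / total for k, v in per_attack_correct.items()}
    return per_attack, union_correct / total
```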

    Methods for improving robustness against adversarial machine learning attacks

    Machine learning systems can improve the efficiency of real-world tasks, including in the cyber security domain; however, these models are susceptible to adversarial attacks, and an arms race exists between adversaries and defenders. The benefits of these systems have been accepted without fully considering their vulnerabilities, resulting in the deployment of vulnerable machine learning models in adversarial environments. For example, intrusion detection systems are relied upon to accurately discern between malicious and benign traffic but can be fooled into allowing malware onto a network. Robustness is the stability of a well-trained model's performance when facing adversarial examples. This thesis tackles the urgent problem of improving the robustness of machine learning models, enabling safer deployments in adversarial domains. The logical outputs of this research are countermeasures against adversarial examples.

    Original contributions to knowledge are: a survey of adversarial machine learning in the cyber security domain, a generalizable approach for feature vulnerability and robustness assessment, and a constraint-based method of generating transferable, functionality-preserving adversarial examples in an intrusion detection domain. Novel defences against adversarial examples are presented: feature selection with recursive feature elimination, and hierarchical classification. Machine learning classifiers can be used in both visual and non-visual domains, yet most research in adversarial machine learning considers the visual domain. A primary focus of this work is how adversarial attacks can be effectively used in non-visual domains, such as cyber security. For example, attackers may exploit weaknesses in an intrusion detection system classifier, enabling an intrusion to masquerade as benign traffic. Easily fooled systems are of limited use in critical areas such as cyber security, and in future, more sophisticated adversarial attacks could be used by ransomware and malware authors to evade detection by machine learning intrusion detection systems.

    Experiments in this thesis focus on intrusion detection case studies and use Python code and Python libraries: the CleverHans and Adversarial Robustness Toolbox libraries to generate adversarial examples, and the HiClass library to facilitate hierarchical classification. An adversarial arms race is playing out in cyber security: every time defences are improved, adversaries find new ways to breach networks, and currently one of the most critical holes in those defences is adversarial examples. This thesis examines the problem of robustness against adversarial examples for machine learning systems and contributes novel countermeasures, aiming to enable the deployment of machine learning in critical domains.
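    As a minimal sketch of one of the defences named above (feature selection with recursive feature elimination in front of an intrusion-detection classifier), the following uses scikit-learn; the dataset, feature counts, and choice of estimators are illustrative assumptions rather than the thesis configuration.

```python
# Hypothetical sketch: recursive feature elimination (RFE) as a defensive
# pre-processing step before an intrusion-detection classifier.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

def build_rfe_pipeline(n_features_to_keep=10):
    """Drop low-importance (potentially attacker-manipulable) features,
    then fit the downstream classifier on the reduced feature set."""
    selector = RFE(
        estimator=LogisticRegression(max_iter=1000),  # ranks features by coefficient weight
        n_features_to_select=n_features_to_keep,
    )
    clf = RandomForestClassifier(n_estimators=100)
    return Pipeline([("select", selector), ("classify", clf)])

# Usage (X_train, y_train are benign/malicious flow features and labels):
#   model = build_rfe_pipeline(n_features_to_keep=10).fit(X_train, y_train)
# Robustness would then be assessed on adversarially perturbed test flows,
# e.g. examples crafted with CleverHans or the Adversarial Robustness Toolbox.
```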