481 research outputs found

    Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction

    Full text link
    Recently, many studies have demonstrated deep neural network (DNN) classifiers can be fooled by the adversarial example, which is crafted via introducing some perturbations into an original sample. Accordingly, some powerful defense techniques were proposed. However, existing defense techniques often require modifying the target model or depend on the prior knowledge of attacks. In this paper, we propose a straightforward method for detecting adversarial image examples, which can be directly deployed into unmodified off-the-shelf DNN models. We consider the perturbation to images as a kind of noise and introduce two classic image processing techniques, scalar quantization and smoothing spatial filter, to reduce its effect. The image entropy is employed as a metric to implement an adaptive noise reduction for different kinds of images. Consequently, the adversarial example can be effectively detected by comparing the classification results of a given sample and its denoised version, without referring to any prior knowledge of attacks. More than 20,000 adversarial examples against some state-of-the-art DNN models are used to evaluate the proposed method, which are crafted with different attack techniques. The experiments show that our detection method can achieve a high overall F1 score of 96.39% and certainly raises the bar for defense-aware attacks.Comment: 14 pages, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8482346&isnumber=435869

    Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach

    Full text link
    Machine Learning models are vulnerable to adversarial attacks that rely on perturbing the input data. This work proposes a novel strategy using Autoencoder Deep Neural Networks to defend a machine learning model against two gradient-based attacks: The Fast Gradient Sign attack and Fast Gradient attack. First we use an autoencoder to denoise the test data, which is trained with both clean and corrupted data. Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder. We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions. When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation.Comment: 7 pages, 8 figures, submitted to Conference on Information Sciences and Systems (CISS 2019

    Evaluating and Improving Adversarial Robustness of Machine Learning-Based Network Intrusion Detectors

    Full text link
    Machine learning (ML), especially deep learning (DL) techniques have been increasingly used in anomaly-based network intrusion detection systems (NIDS). However, ML/DL has shown to be extremely vulnerable to adversarial attacks, especially in such security-sensitive systems. Many adversarial attacks have been proposed to evaluate the robustness of ML-based NIDSs. Unfortunately, existing attacks mostly focused on feature-space and/or white-box attacks, which make impractical assumptions in real-world scenarios, leaving the study on practical gray/black-box attacks largely unexplored. To bridge this gap, we conduct the first systematic study of the gray/black-box traffic-space adversarial attacks to evaluate the robustness of ML-based NIDSs. Our work outperforms previous ones in the following aspects: (i) practical-the proposed attack can automatically mutate original traffic with extremely limited knowledge and affordable overhead while preserving its functionality; (ii) generic-the proposed attack is effective for evaluating the robustness of various NIDSs using diverse ML/DL models and non-payload-based features; (iii) explainable-we propose an explanation method for the fragile robustness of ML-based NIDSs. Based on this, we also propose a defense scheme against adversarial attacks to improve system robustness. We extensively evaluate the robustness of various NIDSs using diverse feature sets and ML/DL models. Experimental results show our attack is effective (e.g., >97% evasion rate in half cases for Kitsune, a state-of-the-art NIDS) with affordable execution cost and the proposed defense method can effectively mitigate such attacks (evasion rate is reduced by >50% in most cases).Comment: This article has been accepted for publication by IEEE JSA

    Defense Methods Against Adversarial Examples for Recurrent Neural Networks

    Full text link
    Adversarial examples are known to mislead deep learning models to incorrectly classify them, even in domains where such models achieve state-of-the-art performance. Until recently, research on both attack and defense methods focused on image recognition, primarily using convolutional neural networks (CNNs). In recent years, adversarial example generation methods for recurrent neural networks (RNNs) have been published, demonstrating that RNN classifiers are also vulnerable to such attacks. In this paper, we present a novel defense method, termed sequence squeezing, to make RNN classifiers more robust against such attacks. Our method differs from previous defense methods which were designed only for non-sequence based models. We also implement four additional RNN defense methods inspired by recently published CNN defense methods. We evaluate our methods against state-of-the-art attacks in the cyber security domain where real adversaries (malware developers) exist, but our methods can be applied against other discrete sequence based adversarial attacks, e.g., in the NLP domain. Using our methods we were able to decrease the effectiveness of such attack from 99.9% to 15%.Comment: Submitted as a conference paper to Euro S&P 202

    Adequacy of the Gradient-Descent Method for Classifier Evasion Attacks

    Full text link
    Despite the wide use of machine learning in adversarial settings including computer security, recent studies have demonstrated vulnerabilities to evasion attacks---carefully crafted adversarial samples that closely resemble legitimate instances, but cause misclassification. In this paper, we examine the adequacy of the leading approach to generating adversarial samples---the gradient descent approach. In particular (1) we perform extensive experiments on three datasets, MNIST, USPS and Spambase, in order to analyse the effectiveness of the gradient-descent method against non-linear support vector machines, and conclude that carefully reduced kernel smoothness can significantly increase robustness to the attack; (2) we demonstrate that separated inter-class support vectors lead to more secure models, and propose a quantity similar to margin that can efficiently predict potential susceptibility to gradient-descent attacks, before the attack is launched; and (3) we design a new adversarial sample construction algorithm based on optimising the multiplicative ratio of class decision functions.Comment: 10 pages, 7 figures, 10 table

    Malware Evasion Attack and Defense

    Full text link
    Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks to an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks to malware detection systems.Comment: Accepted by IEEE DSN-DSML201

    A Robust Approach for Securing Audio Classification Against Adversarial Attacks

    Full text link
    Adversarial audio attacks can be considered as a small perturbation unperceptive to human ears that is intentionally added to the audio signal and causes a machine learning model to make mistakes. This poses a security concern about the safety of machine learning models since the adversarial attacks can fool such models toward the wrong predictions. In this paper we first review some strong adversarial attacks that may affect both audio signals and their 2D representations and evaluate the resiliency of the most common machine learning model, namely deep learning models and support vector machines (SVM) trained on 2D audio representations such as short time Fourier transform (STFT), discrete wavelet transform (DWT) and cross recurrent plot (CRP) against several state-of-the-art adversarial attacks. Next, we propose a novel approach based on pre-processed DWT representation of audio signals and SVM to secure audio systems against adversarial attacks. The proposed architecture has several preprocessing modules for generating and enhancing spectrograms including dimension reduction and smoothing. We extract features from small patches of the spectrograms using speeded up robust feature (SURF) algorithm which are further used to generate a codebook using the K-Means++ algorithm. Finally, codewords are used to train a SVM on the codebook of the SURF-generated vectors. All these steps yield to a novel approach for audio classification that provides a good trade-off between accuracy and resilience. Experimental results on three environmental sound datasets show the competitive performance of proposed approach compared to the deep neural networks both in terms of accuracy and robustness against strong adversarial attacks.Comment: Paper Accepted for Publication in IEEE Transactions on Information Forensics and Securit

    Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

    Full text link
    Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses

    Detecting Adversarial Perturbations Through Spatial Behavior in Activation Spaces

    Full text link
    Neural network based classifiers are still prone to manipulation through adversarial perturbations. State of the art attacks can overcome most of the defense or detection mechanisms suggested so far, and adversaries have the upper hand in this arms race. Adversarial examples are designed to resemble the normal input from which they were constructed, while triggering an incorrect classification. This basic design goal leads to a characteristic spatial behavior within the context of Activation Spaces, a term coined by the authors to refer to the hyperspaces formed by the activation values of the network's layers. Within the output of the first layers of the network, an adversarial example is likely to resemble normal instances of the source class, while in the final layers such examples will diverge towards the adversary's target class. The steps below enable us to leverage this inherent shift from one class to another in order to form a novel adversarial example detector. We construct Euclidian spaces out of the activation values of each of the deep neural network layers. Then, we induce a set of k-nearest neighbor classifiers (k-NN), one per activation space of each neural network layer, using the non-adversarial examples. We leverage those classifiers to produce a sequence of class labels for each nonperturbed input sample and estimate the a priori probability for a class label change between one activation space and another. During the detection phase we compute a sequence of classification labels for each input using the trained classifiers. We then estimate the likelihood of those classification sequences and show that adversarial sequences are far less likely than normal ones. We evaluated our detection method against the state of the art C&W attack method, using two image classification datasets (MNIST, CIFAR-10) reaching an AUC 0f 0.95 for the CIFAR-10 dataset

    Adversarial Security Attacks and Perturbations on Machine Learning and Deep Learning Methods

    Full text link
    The ever-growing big data and emerging artificial intelligence (AI) demand the use of machine learning (ML) and deep learning (DL) methods. Cybersecurity also benefits from ML and DL methods for various types of applications. These methods however are susceptible to security attacks. The adversaries can exploit the training and testing data of the learning models or can explore the workings of those models for launching advanced future attacks. The topic of adversarial security attacks and perturbations within the ML and DL domains is a recent exploration and a great interest is expressed by the security researchers and practitioners. The literature covers different adversarial security attacks and perturbations on ML and DL methods and those have their own presentation styles and merits. A need to review and consolidate knowledge that is comprehending of this increasingly focused and growing topic of research; however, is the current demand of the research communities. In this review paper, we specifically aim to target new researchers in the cybersecurity domain who may seek to acquire some basic knowledge on the machine learning and deep learning models and algorithms, as well as some of the relevant adversarial security attacks and perturbations
    • …
    corecore