MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis
Interpretability has emerged as a crucial aspect of machine learning, aimed
at providing insights into the workings of complex neural networks. However,
existing solutions vary vastly based on the nature of the interpretability
task, with each use case requiring substantial time and effort. This paper
introduces MARGIN, a simple yet general approach to address a large set of
interpretability tasks ranging from identifying prototypes to explaining image
predictions. MARGIN exploits ideas rooted in graph signal analysis to determine
influential nodes in a graph, which are defined as those nodes that maximally
describe a function defined on the graph. By carefully defining task-specific
graphs and functions, we demonstrate that MARGIN outperforms existing
approaches in a number of disparate interpretability challenges.
Comment: Technical Report
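The abstract describes scoring "influential nodes" as those that maximally describe a function (a graph signal) defined on the graph. A minimal sketch of one such criterion, not the paper's actual method: score each node by the local variation of the signal over its neighborhood, so nodes sitting where the signal changes sharply score highest. The graph, signal, and scoring rule below are illustrative assumptions.

```python
import numpy as np

def node_influence(adj, signal):
    """Score each node by the local variation of a graph signal in its
    neighborhood: sum over neighbors j of A[i, j] * (s[i] - s[j])**2.
    High scores mark nodes that most strongly 'describe' the signal."""
    diff = signal[:, None] - signal[None, :]
    return (adj * diff ** 2).sum(axis=1)

# Toy graph: a path 0-1-2-3 with a signal jump between nodes 1 and 2.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
signal = np.array([0.0, 0.0, 1.0, 1.0])

scores = node_influence(adj, signal)
print(scores)  # → [0. 1. 1. 0.]: nodes 1 and 2, adjacent to the jump, dominate
```

Any high-pass graph-spectral criterion (e.g. a Laplacian quadratic form) would play the same role; the point is only that influence is read off a function on the graph rather than from the model directly.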
What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples
Adversarial examples, deliberately crafted using small perturbations to fool
deep neural networks, were first studied in image processing and more recently
in NLP. While approaches to detecting adversarial examples in NLP have largely
relied on search over input perturbations, image processing has seen a range of
techniques that aim to characterise adversarial subspaces over the learned
representations.
In this paper, we adapt two such approaches to NLP, one based on nearest
neighbors and influence functions and one on Mahalanobis distances. The former
in particular produces a state-of-the-art detector when compared against
several strong baselines; moreover, the novel use of influence functions
provides insight into how adversarial example subspaces in NLP relate to
those in image processing, and also how they differ depending on the
kind of NLP task.
Comment: 20 pages, Accepted in IJCNLP_AACL 202
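The Mahalanobis-distance approach mentioned above typically fits class-conditional Gaussians over learned representations and scores an input by its minimum distance to any class. A minimal sketch under that standard recipe (the tied covariance, toy features, and threshold-free scoring are assumptions, not details from the paper):

```python
import numpy as np

def fit_class_gaussians(feats, labels):
    """Fit per-class means and a shared (tied) precision matrix over
    the learned representations."""
    classes = np.unique(labels)
    means = {c: feats[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([feats[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(feats)
    # Small ridge keeps the inverse well-conditioned on toy data.
    return means, np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

def mahalanobis_score(x, means, prec):
    """Adversarial score: minimum Mahalanobis distance to any class."""
    return min(float((x - m) @ prec @ (x - m)) for m in means.values())

# Toy 2-D "representations" for two classes.
feats = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.],
                  [5., 5.], [6., 5.], [5., 6.], [6., 6.]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
means, prec = fit_class_gaussians(feats, labels)

in_dist = mahalanobis_score(np.array([0.5, 0.5]), means, prec)
off_manifold = mahalanobis_score(np.array([3.0, -3.0]), means, prec)
print(in_dist < off_manifold)  # → True: the off-manifold point scores far higher
```

A detector then thresholds this score; adversarial examples tend to fall off the class manifolds and receive large distances.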
SpectralDefense: Detecting Adversarial Attacks on CNNs in the Fourier Domain
Despite the success of convolutional neural networks (CNNs) in many computer
vision and image analysis tasks, they remain vulnerable against so-called
adversarial attacks: Small, crafted perturbations in the input images can lead
to false predictions. A possible defense is to detect adversarial examples. In
this work, we show how analysis in the Fourier domain of input images and
feature maps can be used to distinguish benign test samples from adversarial
images. We propose two novel detection methods: Our first method employs the
magnitude spectrum of the input images to detect an adversarial attack. This
simple and robust classifier can successfully detect adversarial perturbations
of three commonly used attack methods. The second method builds upon the first
and additionally extracts the phase of the Fourier coefficients of feature
maps at different layers of the network. With this extension, we are able to
improve adversarial detection rates compared to state-of-the-art detectors on
five different attack methods.
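The first method's core feature, the magnitude spectrum of the input image, can be sketched as follows. The intuition (an assumption here, not a claim from the paper) is that noise-like adversarial perturbations add energy at high spatial frequencies, which shows up in the corners of the shifted spectrum; the smooth test image and Gaussian "attack" below are purely illustrative.

```python
import numpy as np

def fourier_magnitude_features(img):
    """Log-magnitude spectrum of an image (DC shifted to the center),
    the kind of frequency-domain feature a detector can be trained on."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    return np.log1p(np.abs(spec))

# Smooth image vs. a hypothetical noise-like perturbation.
clean = np.outer(np.hanning(32), np.hanning(32))
rng = np.random.default_rng(0)
perturbed = clean + 0.05 * rng.standard_normal((32, 32))

# Mean log-magnitude in a high-frequency corner of the shifted spectrum:
hf_clean = fourier_magnitude_features(clean)[:4, :4].mean()
hf_pert = fourier_magnitude_features(perturbed)[:4, :4].mean()
print(hf_pert > hf_clean)  # → True: extra high-frequency energy flags the attack
```

In practice such spectra (and, for the second method, the phase of feature-map Fourier coefficients) would be fed to a simple classifier rather than compared by hand.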
Application of Adversarial Attacks on Malware Detection Models
Malware detection is vital as it ensures that a computer is safe from any kind of malicious software that puts users at risk. New variants of such malicious software are introduced every day at an increasing pace. Thus, to guarantee the security of computer systems, major advances are being made in the field of malware detection, and one such approach is to use machine learning. Even though machine learning is very powerful, it is prone to adversarial attacks. In this project, we apply adversarial attacks to malware detection models. To perform these attacks, we use fake samples generated with a Generative Adversarial Network (GAN); this fake malware data, along with the real data, is fed to a machine learning model for malware detection. We also experiment with the percentage of fake malware samples and observe how the model behaves on the given input. The novelty of this project lies in the use of adversarial samples generated from word embeddings produced by our generative algorithms.
Adversarial-Aware Deep Learning System based on a Secondary Classical Machine Learning Verification Approach
Deep learning models have been used in creating various effective image
classification applications. However, they are vulnerable to adversarial
attacks that seek to misguide the models into predicting incorrect classes. Our
study of major adversarial attack models shows that they all specifically
target and exploit neural network structures in their designs. This
understanding leads us to hypothesize that most classical machine learning
models, such as Random Forest (RF), are immune to these adversarial attacks
because they do not rely on a neural network design at all. Our experimental
study of classical machine learning models against popular adversarial
attacks supports this hypothesis. Based on this hypothesis, we
propose a new adversarial-aware deep learning system by using a classical
machine learning model as the secondary verification system to complement the
primary deep learning model in image classification. Although the secondary
classical machine learning model is less accurate, it is used only for
verification, so it does not impact the output accuracy of the primary deep
learning model, while still effectively detecting an adversarial attack when a
clear mismatch occurs. Our experiments on the CIFAR-100 dataset show that our
proposed approach outperforms current state-of-the-art adversarial defense
systems.
Comment: 17 pages, 3 figures
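The mismatch-based verification described above can be sketched in a few lines. This is a minimal illustration, not the paper's system: the confidence threshold and the decision rule (flag only confident disagreements) are assumptions introduced here.

```python
import numpy as np

def verify_prediction(primary_probs, rf_label, threshold=0.5):
    """Accept the primary DNN's prediction, but flag a possible
    adversarial input when a confident DNN prediction clearly
    disagrees with the secondary classical model (e.g. an RF).
    `threshold` is an assumed confidence cutoff, not from the paper."""
    dnn_label = int(np.argmax(primary_probs))
    confident = bool(primary_probs[dnn_label] >= threshold)
    suspicious = confident and dnn_label != rf_label
    return dnn_label, suspicious

# Agreement: the DNN output is accepted as-is.
label_ok, flag_ok = verify_prediction(np.array([0.10, 0.85, 0.05]), rf_label=1)
print(label_ok, flag_ok)    # → 1 False

# Clear mismatch: raise an adversarial-attack flag.
label_bad, flag_bad = verify_prediction(np.array([0.02, 0.90, 0.08]), rf_label=2)
print(label_bad, flag_bad)  # → 1 True
```

Because the secondary model is consulted only when its label is compared against the DNN's, its lower standalone accuracy never alters the primary model's output, matching the design described in the abstract.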