Dense Associative Memory is Robust to Adversarial Inputs
Deep neural networks (DNN) trained in a supervised way suffer from two known
problems. First, the minima of the objective function used in learning
correspond to data points (also known as rubbish examples or fooling images)
that lack semantic similarity with the training data. Second, a clean input can
be altered by a small perturbation, often imperceptible to human vision, so
that the resulting deformed input is misclassified by the network. These
findings emphasize the differences between the ways DNNs and humans classify
patterns, and raise the question of how to design learning algorithms that
mimic human perception more accurately than existing methods do.
Our paper examines these questions within the framework of Dense Associative
Memory (DAM) models. These models are defined by an energy function with
higher-order (higher than quadratic) interactions between the neurons. We show
that in the limit when the power of the interaction vertex in the energy
function is sufficiently large, these models have the following three
properties. First, the minima of the objective function are free from rubbish
images, so that each minimum is a semantically meaningful pattern. Second,
artificial patterns poised precisely at the decision boundary look ambiguous to
human subjects and share aspects of both classes that are separated by that
decision boundary. Third, adversarial images constructed by models with a
small power of the interaction vertex, which are equivalent to DNNs with
rectified linear units (ReLUs), fail to transfer to and fool the models with
higher-order interactions. This opens up the possibility of using higher-order
models for detecting and stopping malicious adversarial attacks. The presented
results suggest that DAMs with higher-order energy functions are closer to
human visual perception than DNNs with ReLUs.
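A minimal numerical sketch may make the model family concrete. The energy below uses the rectified polynomial interaction vertex F_n(x) = max(x, 0)^n common in the Dense Associative Memory literature; the function names, the choice n = 20, and the asynchronous update rule are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def dam_energy(state, memories, n=20):
    # E(sigma) = -sum_mu F(xi_mu . sigma), with the rectified polynomial
    # F(x) = max(x, 0)**n; n = 2 recovers the classical Hopfield model,
    # while a large n gives the sharp, well-separated minima discussed above.
    overlaps = memories @ state
    return -np.sum(np.maximum(overlaps, 0.0) ** n)

def update_neuron(state, memories, i, n=20):
    # Asynchronous update: set neuron i to whichever sign lowers the energy.
    trials = []
    for s in (1.0, -1.0):
        trial = state.copy()
        trial[i] = s
        trials.append((dam_energy(trial, memories, n), s))
    state[i] = min(trials)[1]

# Usage: store random +/-1 patterns, corrupt one, and let the dynamics
# pull the corrupted state back into the corresponding minimum.
rng = np.random.default_rng(0)
memories = rng.choice([-1.0, 1.0], size=(5, 64))  # 5 stored patterns
state = memories[0].copy()
state[:10] *= -1                                  # flip 10 bits
for _ in range(3):
    for i in range(state.size):
        update_neuron(state, memories, i)
print(np.array_equal(state, memories[0]))         # typically True for large n
```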
Guiding the retraining of convolutional neural networks against adversarial inputs
Background: Deep learning models are subject to many possible vulnerabilities,
and among the most worrying are adversarial inputs, which can cause wrong
decisions through minor perturbations. It therefore becomes necessary to
retrain these models against adversarial inputs as part of the software
testing process that addresses this vulnerability. Furthermore, for
energy-efficient testing and retraining, data scientists need support in
choosing the best guidance metrics and the optimal dataset configurations.
Aims: We examined four guidance metrics for retraining convolutional neural
networks, together with three retraining configurations. Our goal was to
improve the models against adversarial inputs with regard to accuracy,
resource utilization, and time, from the point of view of a data scientist in
the context of image classification.
Method: We conducted an empirical study on two datasets for image
classification. We explored: (a) the accuracy, resource utilization, and time
of retraining convolutional neural networks by ordering the new training set
with four different guidance metrics (neuron coverage, likelihood-based
surprise adequacy, distance-based surprise adequacy, and random), and (b) the
accuracy and resource utilization of retraining convolutional neural networks
with three different configurations (from scratch with an augmented dataset,
using the original weights with an augmented dataset, and using the original
weights with only adversarial inputs).
Results: We reveal that retraining with adversarial inputs from the original
weights, ordering the inputs by the surprise adequacy metrics, gives the best
model with respect to the metrics used.
Conclusions: Although more studies are necessary, we recommend that data
scientists use the above configuration and metrics to deal with the
vulnerability of deep learning models to adversarial inputs, as they can
improve their models against adversarial inputs without using many of them.
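As an illustration of how a guidance metric can order a retraining set, below is a simplified sketch of distance-based surprise adequacy in the spirit of Kim et al.; the array layout, the helper names, and the selection step are assumptions for illustration, not the study's actual pipeline.

```python
import numpy as np

def distance_based_sa(act, label, train_acts, train_labels):
    # Simplified distance-based surprise adequacy: distance to the nearest
    # same-class training activation, normalized by that neighbor's distance
    # to the nearest other-class activation. Higher = more surprising input.
    same = train_acts[train_labels == label]
    other = train_acts[train_labels != label]
    d_same = np.linalg.norm(same - act, axis=1)
    nearest = same[np.argmin(d_same)]
    d_other = np.min(np.linalg.norm(other - nearest, axis=1))
    return np.min(d_same) / d_other

def select_by_surprise(acts, labels, train_acts, train_labels, budget):
    # Order candidate retraining inputs from most to least surprising and
    # keep only the top slice before retraining from the original weights.
    scores = np.array([
        distance_based_sa(a, y, train_acts, train_labels)
        for a, y in zip(acts, labels)
    ])
    return np.argsort(-scores)[:budget]  # indices of inputs to retrain on
```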
DeepSearch: A Simple and Effective Blackbox Attack for Deep Neural Networks
Although deep neural networks have been very successful in
image-classification tasks, they are prone to adversarial attacks. To generate
adversarial inputs, a wide variety of techniques has emerged, such as black-
and whitebox attacks on neural networks. In this paper, we present
DeepSearch, a novel fuzzing-based, query-efficient, blackbox attack for image
classifiers. Despite its simplicity, DeepSearch is shown to be more effective
in finding adversarial inputs than state-of-the-art blackbox approaches.
DeepSearch is additionally able to generate the most subtle adversarial inputs
in comparison to these approaches.
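To make the general recipe concrete, here is a minimal sketch of a fuzzing-style l_inf blackbox attack: push coordinates to the boundary of the perturbation ball and keep mutations that lower the model's confidence in the true label. This illustrates the idea only; DeepSearch itself mutates groups of pixels and refines results to stay query-efficient, and predict_probs is an assumed callable returning the classifier's probability vector.

```python
import numpy as np

def blackbox_fuzz(x, y_true, predict_probs, eps=0.03, rounds=2, rng=None):
    # Keep every mutation that reduces confidence in the true label; stop
    # as soon as the perturbed input is misclassified.
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = np.clip(x - eps, 0.0, 1.0), np.clip(x + eps, 0.0, 1.0)
    adv = x.copy()
    best = predict_probs(adv)[y_true]
    for _ in range(rounds):
        for i in rng.permutation(adv.size):
            for bound in (lo.flat[i], hi.flat[i]):
                trial = adv.copy()
                trial.flat[i] = bound            # jump to the ball boundary
                p = predict_probs(trial)
                if p[y_true] < best:
                    best, adv = p[y_true], trial
                    if np.argmax(p) != y_true:
                        return adv               # adversarial input found
    return adv                                   # best-effort perturbation
```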
Guiding the retraining of convolutional neural networks against adversarial inputs
Background:
When using deep learning models, one of the most critical vulnerabilities is their exposure to adversarial inputs, which can cause wrong decisions (e.g., the incorrect classification of an image) with minor perturbations. To address this vulnerability, it becomes necessary to retrain the affected model against adversarial inputs as part of the software testing process. To make this process energy efficient, data scientists need support in choosing the best guidance metrics for reducing the number of adversarial inputs to create and use during testing, as well as the optimal dataset configurations.
Aim:
We examined six guidance metrics for retraining deep learning models, specifically those with a convolutional neural network architecture, together with three retraining configurations. Our goal was to improve the robustness of convolutional neural networks against adversarial inputs with regard to accuracy, resource utilization, and execution time, from the point of view of a data scientist in the context of image classification.
Method:
We conducted an empirical study using five datasets for image classification. We explored: (a) the accuracy, resource utilization, and execution time of retraining convolutional neural networks under six different guidance metrics (neuron coverage, likelihood-based surprise adequacy, distance-based surprise adequacy, DeepGini, softmax entropy, and random), and (b) the accuracy and resource utilization of retraining convolutional neural networks with three different configurations (one-step adversarial retraining, adversarial retraining, and adversarial fine-tuning).
Results:
We reveal that adversarial retraining from the original model weights, ordering the inputs with uncertainty metrics, gives the best model with respect to accuracy, resource utilization, and execution time.
Conclusions:
Although more studies are necessary, we recommend that data scientists use the above configuration and metrics to deal with the vulnerability of deep learning models to adversarial inputs, as they can improve their models without using or creating numerous adversarial inputs. We also show that dataset size has an important impact on the results.
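The two uncertainty metrics named above have simple closed forms, sketched below; the selection snippet at the end assumes a hypothetical model.predict_proba API and a budget variable, neither of which comes from the study.

```python
import numpy as np

def deepgini(probs):
    # DeepGini score: 1 - sum_i p_i^2. Higher means the softmax output is
    # closer to uniform, i.e. the model is less certain about the input.
    return 1.0 - np.sum(probs ** 2, axis=-1)

def softmax_entropy(probs, eps=1e-12):
    # Shannon entropy of the softmax output; same idea, different measure.
    return -np.sum(probs * np.log(probs + eps), axis=-1)

# Ordering step as described in the abstract: score candidate inputs, keep
# the most uncertain ones, then fine-tune from the original weights.
# probs = model.predict_proba(candidates)   # hypothetical model API
# order = np.argsort(-deepgini(probs))
# selected = candidates[order[:budget]]
```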