Analyzing the noise robustness of deep neural networks
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) into making incorrect predictions. Although much work has been done on both adversarial attacks and defenses, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization, consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how the datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method in explaining the misclassification of adversarial examples.
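As a concrete illustration of the subset-selection formulation, here is a minimal sketch (not the authors' code; the layer selection, the sigmoid gating, and the sparsity weight `lam` are all assumptions) that learns a gate per convolutional channel so the gated model preserves the original prediction while an L1-style penalty drives most gates shut; the channels that remain open approximate a datapath.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def extract_datapath(model, x, layers, steps=200, lam=1e-2):
    """Learn per-channel gates; the gates that stay open approximate the datapath."""
    model.eval()
    with torch.no_grad():
        target = model(x).argmax(dim=1)  # prediction to preserve

    # One learnable gate logit per output channel of each chosen conv layer.
    masks = {name: torch.zeros(m.out_channels, requires_grad=True)
             for name, m in model.named_modules()
             if name in layers and isinstance(m, nn.Conv2d)}

    def gate(name):
        def hook(module, inputs, output):
            return output * torch.sigmoid(masks[name]).view(1, -1, 1, 1)
        return hook

    hooks = [m.register_forward_hook(gate(name))
             for name, m in model.named_modules() if name in masks]

    opt = torch.optim.Adam(masks.values(), lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        keep = F.cross_entropy(model(x), target)          # keep the prediction
        sparse = sum(torch.sigmoid(p).sum() for p in masks.values())
        (keep + lam * sparse).backward()                  # prefer few critical neurons
        opt.step()

    for h in hooks:
        h.remove()
    # Channels whose gate stays open form the extracted datapath.
    return {name: (torch.sigmoid(p) > 0.5).nonzero().flatten()
            for name, p in masks.items()}
```

For a torchvision ResNet, `layers` might be a set such as {'layer3.0.conv1', 'layer4.1.conv2'} (a hypothetical choice); running the sketch on a normal image and on its adversarial counterpart would then let one compare where the two extracted datapaths diverge.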
Understanding the Role of Pathways in a Deep Neural Network
Deep neural networks have demonstrated superior performance in artificial intelligence applications, but the opaqueness of their inner working mechanism is a major drawback in their application. The prevailing unit-based interpretation is a statistical observation of stimulus-response data, which fails to reveal the detailed internal process of the inherent mechanisms of neural networks. In this work, we analyze a convolutional neural network (CNN) trained on a classification task and present an algorithm to extract the diffusion pathways of individual pixels, identifying the locations of pixels in an input image associated with object classes. The pathways allow us to test the causal components that are important for classification, and the pathway-based representations are clearly distinguishable between categories. We find that the few largest pathways of an individual pixel from an image tend to cross the feature maps in each layer that are important for classification, and that the large pathways of images of the same category are more consistent in their trends than those of different categories. We also apply the pathways to understanding adversarial attacks, object completion, and movement perception. Further, the total number of pathways on feature maps in all layers can clearly discriminate among the original, deformed, and target samples.
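A rough sketch of one way such per-pixel pathways could be traced (a finite-difference proxy under our own assumptions, not the paper's algorithm): perturb a single input pixel, re-run the forward pass, and rank each convolutional layer's feature maps by how strongly their activations respond; the top-k channels per layer then stand in for the pixel's largest pathways. The `eps` and `topk` values are arbitrary choices.

```python
import torch
import torch.nn as nn

def pixel_pathways(model, x, pixel, eps=0.1, topk=3):
    """Rank each conv layer's feature maps by their response to one input pixel."""
    model.eval()
    acts = {}
    hooks = [m.register_forward_hook(
                 lambda mod, inp, out, n=name: acts.__setitem__(n, out.detach()))
             for name, m in model.named_modules() if isinstance(m, nn.Conv2d)]

    with torch.no_grad():
        model(x)                                   # baseline activations
        base = {n: a.clone() for n, a in acts.items()}
        x_pert = x.clone()
        row, col = pixel
        x_pert[0, :, row, col] += eps              # nudge one pixel, all channels
        model(x_pert)                              # perturbed activations

    for h in hooks:
        h.remove()

    # Per-channel activation change; the top-k channels per layer
    # approximate the pixel's largest pathways through that layer.
    pathways = {}
    for n in base:
        delta = (acts[n] - base[n]).pow(2).sum(dim=(0, 2, 3)).sqrt()
        pathways[n] = delta.topk(min(topk, delta.numel())).indices
    return pathways
```

Under this proxy, counting how many channels respond above a threshold per layer would give a crude analogue of the "total number of pathways" the abstract uses to separate original, deformed, and target samples.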