53 research outputs found

    Distilling Critical Paths in Convolutional Neural Networks

    Neural network compression and acceleration are in wide demand due to the resource constraints of most deployment targets. In this paper, by analyzing filter activations and gradients and by visualizing the filters' functionality in convolutional neural networks, we show that the filters in higher layers learn extremely task-specific features, which are exclusive to only a small subset of the overall task, or even a single class. Based on these findings, we reveal the critical paths of information flow for different classes, and, exploiting their intrinsic exclusiveness, we propose a critical path distillation method, which can effectively customize convolutional neural networks into compact ones with much smaller model size and less computation. Comment: Accepted at the NIPS18 CDNNRIA workshop.
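
    As a rough illustration of the class-specific critical path idea, the sketch below ranks the filters of one higher convolutional layer by their mean activation over a batch of images from a single class. The `SmallCNN` model, the chosen layer, and `top_k` are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: a class's "critical path" through one layer, taken as the
# filters with the strongest mean activation on that class's images.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # a "higher" layer
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        h = self.features(x)
        return self.head(h.mean(dim=(2, 3)))  # global average pooling

def critical_filters(model, class_images, top_k=8):
    """Indices of the top_k most strongly activated filters for one class."""
    acts = []
    handle = model.features[2].register_forward_hook(
        lambda module, inputs, output: acts.append(output.detach()))
    with torch.no_grad():
        model(class_images)
    handle.remove()
    mean_act = acts[0].mean(dim=(0, 2, 3))   # average over batch and spatial dims
    return torch.topk(mean_act, top_k).indices

model = SmallCNN()
one_class_batch = torch.randn(16, 3, 32, 32)  # stand-in for one class's images
print(critical_filters(model, one_class_batch))
```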

    DoPa: A Comprehensive CNN Detection Methodology against Physical Adversarial Attacks

    Recently, Convolutional Neural Networks (CNNs) have shown considerable vulnerability to adversarial attacks: they can be easily misled by adversarial perturbations. With more aggressive methods being proposed, adversarial attacks can also be applied in the physical world, causing practical problems for various CNN-powered applications. To secure CNNs, adversarial attack detection is considered the most critical approach. However, most existing works focus on superficial patterns and merely search for a particular method to differentiate adversarial inputs from natural inputs, ignoring analysis of the CNN's inner vulnerability. Therefore, they can only target specific physical adversarial attacks and lack the expected versatility against different attacks. To address this issue, we propose DoPa, a comprehensive CNN detection methodology for various physical adversarial attacks. By interpreting the CNN's vulnerability, we find that non-semantic adversarial perturbations can produce significantly abnormal activations in a CNN and can even overwhelm the activations of semantic input patterns. Therefore, we add a self-verification stage that analyzes the semantics of distinguished activation patterns, which improves the CNN recognition process. We apply this detection methodology to both image and audio CNN recognition scenarios. Experiments show that DoPa achieves an average success rate of 90% for image attack detection and 92% for audio attack detection. Announcement: [The original DoPa draft on arXiv was modified and submitted to a conference; this short abstract was submitted only for a presentation at the KDD 2019 AIoT Workshop.] Comment: 5 pages, 3 figures.
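
    The self-verification idea can be pictured with a small, hypothetical sketch: calibrate per-filter peak activations on natural inputs, then flag inputs whose activations look abnormally large. The z-score threshold and the 25% trigger ratio below are assumptions, not DoPa's actual detector.

```python
# Hypothetical activation self-verification sketch: flag inputs whose per-filter
# peak activations exceed a threshold calibrated on natural inputs.
import torch
import torch.nn as nn

conv = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())

def peak_activations(x):
    with torch.no_grad():
        return conv(x).amax(dim=(2, 3))        # (N, 16): peak response per filter

natural = torch.randn(64, 3, 32, 32)           # stand-in for natural inputs
stats = peak_activations(natural)
thresh = stats.mean(0) + 3.0 * stats.std(0)    # per-filter "normal" ceiling (assumed rule)

def is_suspicious(x):
    peaks = peak_activations(x.unsqueeze(0))[0]
    return bool((peaks > thresh).float().mean() > 0.25)  # many filters abnormally high

print(is_suspicious(torch.randn(3, 32, 32)))          # natural-looking input
print(is_suspicious(10.0 * torch.randn(3, 32, 32)))   # grossly abnormal input
```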

    Functionality-Oriented Convolutional Filter Pruning

    The sophisticated structure of a Convolutional Neural Network (CNN) allows for outstanding performance, but at the cost of intensive computation. As significant redundancies are inevitably present in such a structure, many works have been proposed to prune the convolutional filters for computation cost reduction. Although extremely effective, most works are based only on quantitative characteristics of the convolutional filters and largely overlook the qualitative interpretation of each individual filter's specific functionality. In this work, we interpret the functionality and redundancy of the convolutional filters from different perspectives, and propose a functionality-oriented filter pruning method. With extensive experimental results, we prove the convolutional filters' qualitative significance regardless of magnitude, demonstrate significant neural network redundancy due to repetitive filter functionality, and analyze filter functionality defects caused by an inappropriate retraining process. Such an interpretable pruning approach not only offers outstanding computation cost optimization over previous filter pruning methods, but also makes the filter pruning process interpretable.
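
    One simple way to picture functionality-level redundancy is to treat filters with nearly identical (flattened) weights as repetitive and keep only one representative per group. The cosine-similarity criterion and the 0.9 threshold in the sketch below are illustrative choices, not the paper's method.

```python
# Minimal redundancy sketch: greedily keep one representative among filters whose
# flattened weights are highly similar (assumed similarity criterion).
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(16, 32, 3)
w = conv.weight.detach().flatten(1)             # (32, 16*3*3): one row per filter
sim = F.cosine_similarity(w.unsqueeze(1), w.unsqueeze(0), dim=-1)  # (32, 32)

keep = []
for i in range(w.shape[0]):
    # prune filter i if it is a near-duplicate of an already-kept filter
    if all(sim[i, j] < 0.9 for j in keep):
        keep.append(i)

print(f"kept {len(keep)} of {w.shape[0]} filters:", keep)
```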

    How convolutional neural network see the world - A survey of convolutional neural network visualization methods

    Nowadays, Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer-vision tasks, such as object detection, image recognition, and image retrieval. These achievements benefit from the CNN's outstanding capability to learn input features with deep layers of neuron structures and an iterative training process. However, the learned features are hard to identify and interpret from a human-vision perspective, causing a lack of understanding of the CNN's internal working mechanism. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method that translates the internal features into visually perceptible patterns, and many CNN visualization works have been proposed in the literature to interpret CNNs from the perspectives of network structure, operation, and semantic concept. In this paper, we provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas such as network design, optimization, and security enhancement. Comment: 32 pages, 21 figures. Mathematical Foundations of Computing.
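
    For concreteness, Activation Maximization can be sketched as gradient ascent on the input so that a chosen filter's mean activation grows. The tiny untrained CNN, step size, iteration count, and L2 prior below are illustrative assumptions.

```python
# Minimal Activation Maximization sketch: optimize an input image to maximize
# one filter's mean activation, with a mild L2 prior to keep the image bounded.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
)
model.eval()

target_filter = 5
x = (0.1 * torch.randn(1, 3, 64, 64)).requires_grad_()   # start from small noise

optimizer = torch.optim.Adam([x], lr=0.1)
for _ in range(100):
    optimizer.zero_grad()
    act = model(x)[0, target_filter].mean()       # mean activation of one filter
    (-act + 1e-4 * x.norm()).backward()           # ascend activation, penalize norm
    optimizer.step()

print("final activation:", act.item())
```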

    Demystifying Neural Network Filter Pruning

    Conventional filter pruning methods for Convolutional Neural Networks (CNNs), based on filter magnitude ranking (e.g., the L1 norm), have proved highly effective at reducing computation load. Although effective, these methods are rarely analyzed from the perspective of filter functionality. In this work, we explore filter pruning and retraining through qualitative interpretation of filter functionality. We find that magnitude-based methods fail to eliminate filters with repetitive functionality, and that the retraining phase is actually used to reconstruct the remaining filters to compensate for the functionality of wrongly pruned critical filters. With a proposed functionality-oriented pruning method, we further verify that, by precisely addressing filter functionality redundancy, a CNN can be pruned without a considerable accuracy drop, and the retraining phase becomes unnecessary.
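
    The conventional baseline that this abstract critiques can be sketched in a few lines: score each filter by the L1 norm of its weights and zero out the lowest-ranked ones. The 50% pruning ratio below is an arbitrary example.

```python
# Minimal L1-magnitude filter ranking sketch (the conventional baseline):
# zero out the filters with the smallest L1 norm.
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, 3)
l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))   # (32,) one score per filter
num_prune = conv.out_channels // 2
prune_idx = torch.argsort(l1)[:num_prune]            # smallest-magnitude filters

with torch.no_grad():
    conv.weight[prune_idx] = 0.0
    conv.bias[prune_idx] = 0.0

print("pruned filters:", prune_idx.tolist())
```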

    The Global Convergence of the Alternating Minimization Algorithm for Deep Neural Network Problems

    In recent years, stochastic gradient descent (SGD) and its variants have been the dominant optimization methods for training deep neural networks. However, SGD suffers from limitations such as a lack of theoretical guarantees, vanishing gradients, excessive sensitivity to input, and difficulty handling highly non-smooth constraints and functions. To overcome these drawbacks, alternating minimization-based methods for deep neural network optimization have recently attracted fast-increasing attention. As an emerging and open domain, however, several new challenges need to be addressed, including: 1) the lack of a global convergence guarantee under mild, practical conditions, and 2) cubic time complexity in the feature dimension. We therefore propose a novel Deep Learning Alternating Minimization (DLAM) algorithm to deal with these two challenges. Our innovative inequality-constrained formulation infinitely approximates the original problem with non-convex equality constraints, enabling our proof of global convergence of the DLAM algorithm under mild, practical conditions. The time complexity is reduced from O(d^3) to O(d^2) via a dedicated algorithm design for the subproblems, enhanced by iterative quadratic approximations and backtracking. Experiments on benchmark datasets demonstrate the effectiveness of the proposed DLAM algorithm.
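
    The general flavor of alternating minimization (not the DLAM algorithm itself) can be shown on a toy two-layer linear model: each weight block is solved as a least-squares subproblem while the other is held fixed. The synthetic data and block updates below are illustrative assumptions.

```python
# Toy alternating-minimization sketch: fit Y ~= X @ W1 @ W2 by alternately
# solving each weight block in closed form (alternating least squares).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = X @ rng.normal(size=(10, 3)) @ rng.normal(size=(3, 2))   # synthetic target
W1 = rng.normal(size=(10, 3))
W2 = rng.normal(size=(3, 2))

for _ in range(20):
    H = X @ W1
    W2 = np.linalg.lstsq(H, Y, rcond=None)[0]        # W2-step: H @ W2 ~= Y
    # W1-step: vectorize  X @ W1 @ W2 ~= Y  as  (W2.T kron X) @ vec(W1) ~= vec(Y).
    A = np.kron(W2.T, X)                             # shape (200*2, 10*3)
    w1 = np.linalg.lstsq(A, Y.reshape(-1, order='F'), rcond=None)[0]
    W1 = w1.reshape(10, 3, order='F')

print("residual:", np.linalg.norm(X @ W1 @ W2 - Y))
```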

    Interpreting Adversarial Robustness: A View from Decision Surface in Input Space

    One popular hypothesis about neural network generalization is that flat local minima of the loss surface in parameter space lead to good generalization. However, we demonstrate that the loss surface in parameter space has no obvious relationship with generalization, especially under adversarial settings. By visualizing decision surfaces in both parameter space and input space, we instead show that the geometric properties of the decision surface in input space correlate well with adversarial robustness. We then propose an adversarial robustness indicator that can evaluate a neural network's intrinsic robustness without testing its accuracy under adversarial attacks. Guided by this indicator, we further propose a robust training method. Without involving adversarial training, our method can enhance a network's intrinsic adversarial robustness against various adversarial attacks. Comment: 15 pages, submitted to ICLR 2019.
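
    A minimal way to probe the decision surface in input space, assuming nothing about the paper's exact procedure, is to sample the loss over a small 2-D grid spanned by two random perturbation directions around one input, as sketched below. The untrained linear model, grid range, and random directions are illustrative assumptions.

```python
# Minimal input-space decision/loss surface probe along two random directions.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
y = torch.tensor([3])

d1 = torch.randn_like(x); d1 /= d1.norm()
d2 = torch.randn_like(x); d2 /= d2.norm()

steps = torch.linspace(-1.0, 1.0, 21)
surface = torch.zeros(21, 21)
with torch.no_grad():
    for i, a in enumerate(steps):
        for j, b in enumerate(steps):
            surface[i, j] = F.cross_entropy(model(x + a * d1 + b * d2), y)

print("loss range over the probed patch:", surface.min().item(), surface.max().item())
```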

    Task-Adaptive Incremental Learning for Intelligent Edge Devices

    Convolutional Neural Networks (CNNs) are used for a wide range of image-related tasks such as image classification and object detection. However, a large pre-trained CNN model contains considerable redundancy with respect to task-specific edge applications. Moreover, a statically pre-trained model cannot efficiently handle the dynamic data of real-world applications, where CNN training data and their labels are collected incrementally. To tackle these two challenges, we propose TeAM, a task-adaptive incremental learning framework for CNNs on intelligent edge devices. Given a large pre-trained model, TeAM can configure it into a specialized model for a dedicated edge application. The specialized model can be quickly fine-tuned with local data to achieve very high accuracy. Furthermore, with our global aggregation and incremental learning scheme, the specialized CNN models can be collaboratively aggregated into an enhanced global model as new training data arrive. Comment: 2 pages.
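
    As a loose illustration of the global aggregation step, the sketch below averages the parameters of several specialized models back into a single global model (a FedAvg-style mean). This aggregation rule is an assumption for illustration, not necessarily TeAM's exact scheme.

```python
# Parameter-averaging aggregation sketch (assumed FedAvg-style rule).
import copy
import torch
import torch.nn as nn

def aggregate(global_model, specialized_models):
    merged = copy.deepcopy(global_model)
    with torch.no_grad():
        for name, param in merged.named_parameters():
            stacked = torch.stack([dict(m.named_parameters())[name].detach()
                                   for m in specialized_models])
            param.copy_(stacked.mean(dim=0))      # element-wise mean of each tensor
    return merged

base = nn.Linear(4, 2)
specialized = [copy.deepcopy(base) for _ in range(3)]
for m in specialized:                             # pretend each was fine-tuned locally
    for p in m.parameters():
        p.data += 0.01 * torch.randn_like(p)

print(aggregate(base, specialized).weight)
```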

    ASP: A Fast Adversarial Attack Example Generation Framework based on Adversarial Saliency Prediction

    With their excellent accuracy and feasibility, neural networks (NNs) have been widely applied in novel intelligent applications and systems. However, with the emergence of adversarial attacks, NN-based systems become extremely vulnerable: image classification results can be arbitrarily misled by adversarial examples, which are crafted images with pixel-level perturbations imperceptible to humans. As this raises a significant system security issue, we carry out a series of investigations of adversarial attacks in this work. We first identify an image's pixel-level vulnerability to adversarial attack through adversarial saliency analysis. By comparing the analyzed saliency map with the adversarial perturbation distribution, we propose a new evaluation scheme to comprehensively assess adversarial attack precision and efficiency. Then, with a novel adversarial saliency prediction method, we propose a fast adversarial example generation framework, ASP, which significantly improves attack efficiency and dramatically reduces computation cost. Compared to previous methods, experiments show that ASP achieves up to a 12x speed-up in adversarial example generation, a 2x lower perturbation rate, and a high attack success rate of 87% on both MNIST and CIFAR-10. ASP can also be used to support data-hungry NN adversarial training: by reducing the attack success rate by as much as 90%, ASP can quickly and effectively enhance the defense capability of NN-based systems against adversarial attacks.
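
    A minimal adversarial-saliency-style sketch, under assumptions that differ from the ASP predictor itself: take the gradient of the loss with respect to the input pixels, treat its magnitude as each pixel's vulnerability, and perturb only the most salient pixels. The untrained linear model, step size, and 1% pixel budget are illustrative.

```python
# Gradient-based pixel saliency sketch: perturb only the most vulnerable pixels.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28, requires_grad=True)
y = torch.tensor([7])

loss = F.cross_entropy(model(x), y)
loss.backward()
saliency = x.grad.abs().flatten()                  # per-pixel vulnerability score

k = int(0.01 * saliency.numel())                   # perturb the top 1% of pixels
top = torch.topk(saliency, k).indices
x_adv = x.detach().clone().flatten()
x_adv[top] += 0.3 * x.grad.flatten()[top].sign()   # FGSM-style step on those pixels
x_adv = x_adv.clamp(0, 1).reshape_as(x)

print("prediction before/after:", model(x).argmax().item(), model(x_adv).argmax().item())
```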

    Interpreting and Evaluating Neural Network Robustness

    Recently, adversarial deception has become one of the most significant threats to deep neural networks. However, compared to the extensive research on new designs of adversarial attacks and defenses, the intrinsic robustness of neural networks still lacks thorough investigation. This work aims to qualitatively interpret the adversarial attack and defense mechanisms through loss visualization, and to establish a quantitative metric for evaluating a neural network model's intrinsic robustness. The proposed robustness metric identifies the upper bound of a model's prediction divergence in the given domain and thus indicates whether the model can maintain a stable prediction. With extensive experiments, our metric demonstrates several advantages over conventional robustness estimation based on adversarial testing accuracy: (1) it provides a uniform evaluation for models with different structures and parameter scales; (2) it outperforms conventional accuracy-based robustness estimation and provides a more reliable evaluation that is invariant across different test settings; (3) it can be computed quickly without considerable testing cost. Comment: Accepted at IJCAI'19.
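
    The idea of a divergence-based robustness score can be sketched by estimating, via random sampling rather than an exact bound, the worst-case divergence between the model's predictions at an input and at perturbed points inside an epsilon ball. The untrained model, epsilon, sample count, and KL-divergence choice below are illustrative assumptions, not the paper's metric.

```python
# Sampling-based estimate of worst-case prediction divergence in an eps-ball.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
eps = 0.03

with torch.no_grad():
    p = F.log_softmax(model(x), dim=1)            # reference prediction (log-probs)
    worst = 0.0
    for _ in range(256):
        delta = eps * torch.empty_like(x).uniform_(-1, 1)
        q = F.log_softmax(model(x + delta), dim=1)
        kl = F.kl_div(q, p, log_target=True, reduction='batchmean')
        worst = max(worst, kl.item())             # track the largest divergence seen

print("estimated prediction-divergence score:", worst)
```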