8,896 research outputs found
Model Compression with Adversarial Robustness: A Unified Optimization Framework
Deep model compression has been extensively studied, and state-of-the-art
methods can now achieve high compression ratios with minimal accuracy loss.
This paper studies model compression through a different lens: could we
compress models without hurting their robustness to adversarial attacks, in
addition to maintaining accuracy? Previous literature suggested that the goals
of robustness and compactness may sometimes conflict. We propose a novel
Adversarially Trained Model Compression (ATMC) framework. ATMC constructs a
unified constrained optimization formulation, where existing compression means
(pruning, factorization, quantization) are all integrated into the constraints.
An efficient algorithm is then developed. An extensive set of experiments is
presented, demonstrating that ATMC obtains a remarkably more favorable trade-off
among model size, accuracy, and robustness than currently available
alternatives in various settings. The code is publicly available at:
https://github.com/shupenggui/ATMC
Comment: 14 pages, NeurIPS 2019. The first two authors, Gui and Wang,
contributed equally and are listed alphabetically.
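The abstract does not spell out ATMC's algorithm, but the idea of folding pruning and quantization into the constraints of a training problem can be sketched as a projection step applied to the weights after each update. The function below is an illustrative stand-in under that assumption, not the authors' method:

```python
import numpy as np

def project_to_compressed(w, k, n_levels=16):
    """Project weights onto a toy compression constraint set: keep the
    k largest-magnitude entries (pruning), then snap the survivors to a
    uniform symmetric grid (quantization)."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]                 # top-k by magnitude
    out[idx] = w[idx]
    scale = np.abs(out).max()
    if scale == 0:
        return out
    half = n_levels // 2
    return np.round(out / scale * half) / half * scale

w = np.array([0.9, -0.05, 0.4, 0.01, -0.7])
print(project_to_compressed(w, k=3))                 # 3 nonzeros on the grid
```

In a constrained formulation, a step like this would alternate with (adversarial) gradient updates, keeping the iterate feasible with respect to the size budget.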
AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training
Highly distributed training of Deep Neural Networks (DNNs) on future compute
platforms (offering 100s of TeraOps/s of computational capacity) is expected to
be severely communication constrained. To overcome this limitation, new
gradient compression techniques are needed that are computationally friendly,
applicable to a wide variety of layers seen in Deep Neural Networks and
adaptable to variations in network architectures as well as their
hyper-parameters. In this paper we introduce a novel technique - the Adaptive
Residual Gradient Compression (AdaComp) scheme. AdaComp is based on localized
selection of gradient residues and automatically tunes the compression rate
depending on local activity. We show excellent results on a wide spectrum of
state-of-the-art Deep Learning models in multiple domains (vision, speech,
language), datasets (MNIST, CIFAR10, ImageNet, BN50, Shakespeare), optimizers
(SGD with momentum, Adam) and network parameters (number of learners,
minibatch-size etc.). Exploiting both sparsity and quantization, we demonstrate
end-to-end compression rates of ~200X for fully-connected and recurrent layers,
and ~40X for convolutional layers, without any noticeable degradation in model
accuracies.
Comment: IBM Research AI, 9 pages, 7 figures, accepted at AAAI-18.
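The core mechanism described above, carrying unsent gradient mass forward as a residual and selecting entries locally, can be sketched as follows. This is a simplification: AdaComp's actual rule also self-tunes the compression rate from local activity, whereas here each bin simply transmits its single largest-magnitude entry.

```python
import numpy as np

def compress_gradient(grad, residual, bin_size=4):
    """Residual gradient compression sketch: fold in the residual from
    the previous step, transmit only the largest-magnitude entry of
    each local bin, and carry the unsent remainder forward."""
    g = grad + residual
    sent = np.zeros_like(g)
    for start in range(0, g.size, bin_size):
        blk = g[start:start + bin_size]
        j = start + int(np.argmax(np.abs(blk)))      # local winner
        sent[j] = g[j]
    return sent, g - sent

g = np.array([0.1, -0.9, 0.2, 0.05, 0.3, -0.1, 0.0, 0.4])
sent, res = compress_gradient(g, np.zeros_like(g))
print(np.count_nonzero(sent))   # 2 of 8 entries transmitted
```

Because the residual is added back before the next selection, small but persistent gradient components are eventually transmitted rather than lost.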
Cell Detection in Microscopy Images with Deep Convolutional Neural Network and Compressed Sensing
The ability to automatically detect certain types of cells or cellular
subunits in microscopy images is of significant interest to a wide range of
biomedical research and clinical practices. Cell detection methods have evolved
from employing hand-crafted features to deep learning-based techniques. The
essential idea of these methods is that their cell classifiers or detectors are
trained in the pixel space, where the locations of target cells are labeled. In
this paper, we seek a different route and propose a convolutional neural
network (CNN)-based cell detection method that uses encoding of the output
pixel space. For the cell detection problem, the output space is the sparsely
labeled pixel locations indicating cell centers. We employ random projections
to encode the output space into a compressed vector of fixed dimension. A CNN
then regresses this compressed vector from the input pixels. Furthermore, it is
possible to stably recover the sparse cell locations in the output pixel space from
the predicted compressed vector using l1-norm optimization. In the past,
output space encoding using compressed sensing (CS) has been used in
conjunction with linear and non-linear predictors. To the best of our
knowledge, this is the first successful use of CNN with CS-based output space
encoding. We conducted extensive experiments on several benchmark datasets, where
the proposed CNN + CS framework (referred to as CNNCS) achieved the highest or
at least top-3 performance in terms of F1-score, compared with other
state-of-the-art methods.
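The encoding step described in the abstract can be illustrated with a small numerical sketch. All dimensions below are hypothetical, and greedy orthogonal matching pursuit stands in for the l1-norm decoding; the point is that a few-hot pixel vector compresses to a short fixed-length code and remains recoverable:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 200, 40, 3                           # pixel grid, code length, # cells
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random projection matrix

x = np.zeros(n)                                # sparse cell-center indicator
x[[17, 60, 151]] = 1.0
y = A @ x                                      # fixed-length compressed target
                                               # (what the CNN would regress)

# Greedy sparse recovery (orthogonal matching pursuit), a stand-in for
# the l1-norm decoding step mentioned in the abstract.
support, r = [], y.copy()
for _ in range(s):
    support.append(int(np.argmax(np.abs(A.T @ r))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    r = y - A[:, support] @ coef
print(sorted(support))
```

In the actual framework the network never sees the pixel-space labels directly; it is trained to predict `y`, and decoding runs only at inference time.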
Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices
To achieve lightweight object detectors for deployment on edge devices,
an effective model compression pipeline is proposed in this paper. The
compression pipeline consists of automatic channel pruning for the backbone,
fixed channel deletion for the branch layers and knowledge distillation for the
guidance learning. As a result, ResNet50-v1d is auto-pruned and fine-tuned
on ImageNet to attain a compact base model as the backbone of the object detector.
Then, lightweight object detectors are implemented with the proposed compression
pipeline. For instance, an SSD-300 with model size = 16.3 MB, FLOPS = 2.31G, and
mAP = 71.2 is created, yielding a better result than SSD-300-MobileNet.
Comment: lightweight detector, automatic channel pruning, fixed channel
deletion, knowledge distillation
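One common criterion behind channel pruning of a backbone, ranking output channels by the L1 norm of their filters and deleting the weakest, can be sketched as below. This is an illustrative heuristic, not the paper's specific automatic pruning procedure:

```python
import numpy as np

def prune_channels(conv_w, keep_ratio=0.5):
    """Rank the output channels of a conv weight (out_ch, in_ch, kh, kw)
    by the L1 norm of their filters and keep the strongest fraction."""
    out_ch = conv_w.shape[0]
    scores = np.abs(conv_w).reshape(out_ch, -1).sum(axis=1)  # per-channel L1
    n_keep = max(1, int(out_ch * keep_ratio))
    keep = np.sort(np.argsort(scores)[-n_keep:])             # surviving channels
    return conv_w[keep], keep

w = np.random.default_rng(1).standard_normal((8, 3, 3, 3))
pruned, kept = prune_channels(w)
print(pruned.shape)   # (4, 3, 3, 3)
```

In a full pipeline, pruning a layer's output channels also requires deleting the matching input channels of the next layer, followed by fine-tuning (here, with distillation guidance) to recover accuracy.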
A Survey of Model Compression and Acceleration for Deep Neural Networks
Deep neural networks (DNNs) have recently achieved great success in many
visual recognition tasks. However, existing deep neural network models are
computationally expensive and memory intensive, hindering their deployment in
devices with low memory resources or in applications with strict latency
requirements. Therefore, a natural thought is to perform model compression and
acceleration in deep networks without significantly decreasing the model
performance. During the past five years, tremendous progress has been made in
this area. In this paper, we review the recent techniques for compacting and
accelerating DNN models. In general, these techniques are divided into four
categories: parameter pruning and quantization, low-rank factorization,
transferred/compact convolutional filters, and knowledge distillation. Methods
of parameter pruning and quantization are described first, after which the other
techniques are introduced. For each category, we also provide insightful
analysis about the performance, related applications, advantages, and
drawbacks. Then we go through some very recent successful methods, for example,
dynamic capacity networks and stochastic depth networks. After that, we survey
the evaluation metrics, the main datasets used for evaluating model
performance, and recent benchmark efforts. Finally, we conclude this paper and
discuss the remaining challenges and possible directions for future work.
Comment: Published in IEEE Signal Processing Magazine; updated version
including more recent work
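As a concrete instance of the survey's first category, uniform symmetric quantization is about the simplest possible scheme: map float weights to signed integers with a single shared scale, then dequantize at inference. A minimal sketch:

```python
import numpy as np

def quantize_uniform(w, bits=8):
    """Uniform symmetric quantization: floats -> signed integers with a
    single per-tensor scale; dequantize by multiplying the scale back."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

w = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = q.astype(np.float32) * scale       # dequantized weights
print(float(np.abs(w - w_hat).max()))      # rounding error bounded by scale / 2
```

Storing `q` instead of `w` gives a 4x size reduction over float32 here; more elaborate methods in this category use per-channel scales or non-uniform codebooks.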
Deep Learning Methods for Parallel Magnetic Resonance Image Reconstruction
Following the success of deep learning in a wide range of applications,
neural network-based machine learning techniques have received interest as a
means of accelerating magnetic resonance imaging (MRI). A number of ideas
inspired by deep learning techniques from computer vision and image processing
have been successfully applied to non-linear image reconstruction in the spirit
of compressed sensing for both low dose computed tomography and accelerated
MRI. The additional integration of multi-coil information to recover missing
k-space lines in the MRI reconstruction process is still studied less
frequently, even though it is the de facto standard for currently used
accelerated MR acquisitions. This manuscript provides an overview of the recent
machine learning approaches that have been proposed specifically for improving
parallel imaging. A general background introduction to parallel MRI is given
that is structured around the classical view of image space and k-space based
methods. Both linear and non-linear methods are covered, followed by a
discussion of recent efforts to further improve parallel imaging using machine
learning, and specifically using artificial neural networks. Image-domain based
techniques that introduce improved regularizers are covered as well as k-space
based methods, where the focus is on better interpolation strategies using
neural networks. Issues and open problems are discussed as well as recent
efforts for producing open datasets and benchmarks for the community.
Comment: 14 pages, 7 figures
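The missing-k-space-lines problem at the heart of this survey can be illustrated with a toy single-coil experiment (real parallel imaging is multi-coil; this sketch only shows why undersampling is hard). Skipping every other phase-encode line and reconstructing by zero-filling folds the object onto itself:

```python
import numpy as np

# Toy illustration: acquire only every other k-space line and reconstruct
# by zero-filling. The resulting aliasing is what parallel-imaging methods
# (classical interpolation or learned networks) use coil information to resolve.
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0                  # square phantom
kspace = np.fft.fft2(img)

mask = np.zeros((64, 64), dtype=bool)
mask[::2, :] = True                      # keep every other phase-encode line
zero_filled = np.fft.ifft2(kspace * mask).real

# 2x undersampling replicates the object at half the field of view,
# at half amplitude: the phantom now also appears near row 0.
print(zero_filled[32, 32], zero_filled[0, 32])
```

k-space methods such as GRAPPA instead interpolate the missing lines before the inverse FFT, using kernels calibrated from fully sampled center lines; the learned approaches surveyed here replace those kernels or regularizers with neural networks.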
Deep Learning Techniques for Inverse Problems in Imaging
Recent work in machine learning shows that deep neural networks can be used
to solve a wide variety of inverse problems arising in computational imaging.
We explore the central prevailing themes of this emerging area and present a
taxonomy that can be used to categorize different problems and reconstruction
methods. Our taxonomy is organized along two central axes: (1) whether or not a
forward model is known and to what extent it is used in training and testing,
and (2) whether or not the learning is supervised or unsupervised, i.e.,
whether or not the training relies on access to matched ground truth image and
measurement pairs. We also discuss the trade-offs associated with these
different reconstruction approaches, caveats and common failure modes, plus
open problems and avenues for future work.
Towards Image Understanding from Deep Compression without Decoding
Motivated by recent work on deep neural network (DNN)-based image compression
methods showing potential improvements in image quality, savings in storage,
and bandwidth reduction, we propose to perform image understanding tasks such
as classification and segmentation directly on the compressed representations
produced by these compression methods. Since the encoders and decoders in
DNN-based compression methods are neural networks with feature-maps as internal
representations of the images, we directly integrate these with architectures
for image understanding. This bypasses decoding of the compressed
representation into RGB space and reduces computational cost. Our study shows
that accuracies comparable to networks that operate on compressed RGB images
can be achieved while reducing the computational complexity.
Furthermore, we show that synergies are obtained by jointly training
compression networks with classification networks on the compressed
representations, improving image quality, classification accuracy, and
segmentation performance. We find that inference from compressed
representations is particularly advantageous compared to inference from
compressed RGB images at aggressive compression rates.
Comment: ICLR 2018 conference paper
Brain-inspired reverse adversarial examples
A human does not have to see all elephants to recognize an animal as an
elephant. In contrast, current state-of-the-art deep learning approaches
heavily depend on the variety of training samples and the capacity of the
network. In practice, the size of a network is always limited, and it is
impossible to access all the data samples. Under these circumstances, deep
learning models are extremely fragile to human-imperceptible adversarial
examples, which pose threats to all safety-critical systems. Inspired by the
association and attention mechanisms of the human brain, we propose a reverse
adversarial examples method that can greatly improve models' robustness on
unseen data. Experiments show that our reverse adversarial method can improve
accuracy by 19.02% on average on ResNet18, MobileNet, and VGG16 under unseen data
transformations. Besides, the proposed method is also applicable to compressed
models and shows potential to compensate for the robustness drop brought by model
quantization - an absolute 30.78% accuracy improvement.
Comment: Preprint
Sparse DNNs with Improved Adversarial Robustness
Deep neural networks (DNNs) are computationally/memory-intensive and
vulnerable to adversarial attacks, making them prohibitive in some real-world
applications. By converting dense models into sparse ones, pruning appears to
be a promising solution to reducing the computation/memory cost. This paper
studies classification models, especially DNN-based ones, to demonstrate that
there exist intrinsic relationships between their sparsity and adversarial
robustness. Our analyses reveal, both theoretically and empirically, that
nonlinear DNN-based classifiers behave differently from some linear ones
under attacks. We further demonstrate that an appropriately higher model
sparsity implies better robustness of nonlinear DNNs, whereas over-sparsified
models can find it more difficult to resist adversarial examples.
Comment: l1 regularization on weights --> l1 regularization on activations
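The comment's distinction, penalizing activations rather than weights, can be made concrete with a toy one-layer regressor. This is an illustrative sketch of the general regularization pattern, not the paper's training objective:

```python
import numpy as np

def loss_with_l1_activations(x, W, y, lam=0.01):
    """Toy one-layer regressor whose loss adds an l1 penalty on the ReLU
    activations (not on the weights), encouraging sparse activity."""
    a = np.maximum(0.0, x @ W)               # ReLU activations
    pred = a.sum(axis=1)
    mse = np.mean((pred - y) ** 2)
    return mse + lam * np.abs(a).sum()       # l1 on activations

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 10))
W = rng.standard_normal((10, 5))
y = rng.standard_normal(32)
# The penalty term is nonnegative, so the regularized loss never decreases.
print(loss_with_l1_activations(x, W, y) >= loss_with_l1_activations(x, W, y, lam=0.0))
```

Minimizing such a loss drives many activations to exactly zero, which is the activation-sparsity knob whose relationship to adversarial robustness the paper analyzes.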