Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers
We present a systematic weight pruning framework for deep neural networks
(DNNs) using the alternating direction method of multipliers (ADMM). We first
formulate the weight pruning problem of DNNs as a constrained nonconvex
optimization problem, and then adopt the ADMM framework for systematic weight
pruning. We show that ADMM is highly suitable for weight pruning due to the
computational efficiency it offers. We achieve a much higher compression ratio
compared with prior work while maintaining the same test accuracy, together
with a faster convergence rate. Our models are released at
https://github.com/KaiqiZhang/admm-prunin
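As a rough sketch of the constrained formulation referred to above (the per-layer cardinality bound k_i below is an assumed concrete form of the sparsity constraint, not a formula quoted from the abstract), the pruning problem can be written as:

    % Weight pruning as constrained nonconvex optimization (sketch; k_i is an
    % assumed bound on the number of nonzero weights in layer i).
    \begin{aligned}
      \min_{\{W_i\}} \quad & f\bigl(\{W_i\}\bigr) && \text{(training loss of the DNN)}\\
      \text{s.t.} \quad & W_i \in S_i = \{\, W : \|W\|_0 \le k_i \,\}, && i = 1, \dots, N.
    \end{aligned}

ADMM then alternates between minimizing the loss plus a quadratic penalty and projecting onto each S_i, the latter of which simply keeps the k_i largest-magnitude weights of a layer.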
Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM
Weight quantization is one of the most important techniques for Deep Neural
Network (DNN) model compression. A recent work, using a systematic framework for
DNN weight quantization based on the advanced optimization algorithm ADMM
(Alternating Direction Method of Multipliers), achieves one of the
state-of-the-art results in weight quantization. In this work, we first extend
this ADMM-based framework to guarantee solution feasibility, and we further
develop a multi-step, progressive DNN weight quantization framework, with
dual benefits of (i) achieving further weight quantization thanks to the
special property of ADMM regularization, and (ii) reducing the search space
within each step. Extensive experimental results demonstrate the superior
performance compared with prior work. Some highlights: we derive the first
lossless and fully binarized (for all layers) LeNet-5 for MNIST; and we derive
the first fully binarized (for all layers) VGG-16 for CIFAR-10 and ResNet for
ImageNet with reasonable accuracy loss.
Comment: Accepted by ICML workshop (ODML-CDNNR2019). arXiv admin note:
substantial text overlap with arXiv:1903.0976
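As an illustration of the projection that such quantization frameworks revolve around (the uniform level set, the scaling factor, and the progressive 4-bit to binary schedule below are assumptions, not the paper's exact procedure), each weight can be snapped to the nearest admissible level, with the level set shrinking step by step:

    import numpy as np

    def project_to_levels(weights, levels):
        """Map every weight to the nearest value in `levels` (assumed projection step)."""
        levels = np.asarray(levels)
        nearest = np.abs(weights[..., None] - levels).argmin(axis=-1)
        return levels[nearest]

    w = np.random.randn(256, 128).astype(np.float32)   # stand-in for one layer's weights
    alpha = float(np.mean(np.abs(w)))                  # simple per-layer scale (assumption)
    for bits in (4, 2, 1):                             # progressive: 4-bit -> 2-bit -> binary
        levels = alpha * np.linspace(-1.0, 1.0, 2 ** bits)
        w = project_to_levels(w, levels)               # a full ADMM loop would retrain between steps

Shrinking the level set gradually, rather than jumping straight to binary weights, is what reduces the search space within each step.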
Communication over Continuous Quantum Secure Dialogue using Einstein-Podolsky-Rosen States
With the emergence of quantum computing and quantum networks, many
communication protocols that take advantage of the unique properties of quantum
mechanics to achieve a secure bidirectional exchange of information have been
proposed. In this study, we propose a new quantum communication protocol,
called Continuous Quantum Secure Dialogue (CQSD), that allows two parties to
continuously exchange messages without halting while ensuring the privacy of
the conversation. Compared to existing protocols, CQSD improves the efficiency
of quantum communication. In addition, we offer an implementation of the CQSD
protocol using the Qiskit framework. Finally, we conduct a security analysis of
the CQSD protocol in the context of several common forms of attack.
Comment: Accepted for presentation in a poster session at QIP 202
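The abstract does not reproduce the protocol steps, so the snippet below is only a minimal, hedged illustration of the EPR-pair (Bell-state) preparation that such protocols build on, written with the Qiskit framework mentioned above; it is not the CQSD protocol itself:

    from qiskit import QuantumCircuit

    # Prepare the EPR/Bell state (|00> + |11>)/sqrt(2) shared by the two parties.
    qc = QuantumCircuit(2, 2)
    qc.h(0)                      # put qubit 0 into an equal superposition
    qc.cx(0, 1)                  # entangle qubit 1 with qubit 0
    qc.measure([0, 1], [0, 1])   # the two measurement outcomes are perfectly correlated
    print(qc.draw())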
Brain-inspired reverse adversarial examples
A human does not have to see every elephant to recognize an animal as an
elephant. In contrast, current state-of-the-art deep learning approaches
heavily depend on the variety of training samples and the capacity of the
network. In practice, the size of the network is always limited, and it is
impossible to access all data samples. Under these circumstances, deep
learning models are extremely fragile to human-imperceptible adversarial
examples, which pose threats to safety-critical systems. Inspired by the
association and attention mechanisms of the human brain, we propose a reverse
adversarial examples method that can greatly improve models' robustness on
unseen data. Experiments show that our reverse adversarial method can improve
accuracy by 19.02% on average for ResNet18, MobileNet, and VGG16 under unseen
data transformations. Moreover, the proposed method is also applicable to
compressed models and shows potential to compensate for the robustness drop
brought by model quantization, giving an absolute 30.78% accuracy improvement.
Comment: Preprint
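The abstract does not give the construction, so the sketch below is only one hedged reading of a "reverse" perturbation: stepping an unseen input against the loss gradient, i.e. the opposite of an FGSM attack. The function name, step size, and PyTorch details are all assumptions for illustration:

    import torch
    import torch.nn.functional as F

    def reverse_perturb(model, x, y, eps=0.03):
        """Illustrative sketch: nudge inputs in the direction that *decreases* the loss,
        the reverse of an FGSM attack. All details here are assumptions."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        return (x - eps * x.grad.sign()).clamp(0.0, 1.0).detach()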
A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM
Many model compression techniques for Deep Neural Networks (DNNs) have been
investigated, including weight pruning, weight clustering, and quantization.
Weight pruning leverages the redundancy in the number of weights in DNNs,
while weight clustering/quantization leverages the redundancy in the number of
bit representations of weights. They can be effectively combined in order to
exploit the maximum degree of redundancy. However, the literature lacks a
systematic investigation in this direction.
In this paper, we fill this void and develop a unified, systematic framework
of DNN weight pruning and clustering/quantization using Alternating Direction
Method of Multipliers (ADMM), a powerful technique in optimization theory to
deal with non-convex optimization problems. Both DNN weight pruning and
clustering/quantization, as well as their combinations, can be solved in a
unified manner. For further performance improvement in this framework, we adopt
multiple techniques including iterative weight quantization and retraining,
joint weight clustering training and centroid updating, weight clustering
retraining, etc. The proposed framework achieves significant improvements both
in individual weight pruning and clustering/quantization problems, as well as
their combinations. For weight pruning alone, we achieve 167x weight reduction
in LeNet-5, 24.7x in AlexNet, and 23.4x in VGGNet, without any accuracy loss.
For the combination of DNN weight pruning and clustering/quantization, we
achieve 1,910x and 210x storage reduction of weight data on LeNet-5 and
AlexNet, respectively, without accuracy loss. Our codes and models are released
at the link http://bit.ly/2D3F0n
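As a rough sketch of the clustering side of this combination (the 1-D k-means procedure and the handling of the pruning mask below are assumptions about what the analytical ADMM step can look like, not the paper's exact algorithm), the surviving weights of a layer can be snapped to a small set of shared centroids:

    import numpy as np

    def cluster_weights(w, num_centroids=16, iters=20):
        """Assumed clustering step: 1-D k-means over the nonzero weights, then snap
        each surviving weight to its nearest centroid (pruned weights stay zero)."""
        mask = w != 0
        vals = w[mask]
        centroids = np.linspace(vals.min(), vals.max(), num_centroids)
        for _ in range(iters):
            assign = np.abs(vals[:, None] - centroids[None, :]).argmin(axis=1)
            for k in range(num_centroids):
                if np.any(assign == k):
                    centroids[k] = vals[assign == k].mean()
        clustered = np.zeros_like(w)
        clustered[mask] = centroids[np.abs(vals[:, None] - centroids[None, :]).argmin(axis=1)]
        return clustered, centroids

Storing only short centroid indices for the nonzero positions, rather than full-precision values, is what makes the large storage reductions reported above possible.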
Localized polarons and conductive charge carriers: understanding CaCu3Ti4O12 over a broad temperature range
CaCu3Ti4O12 (CCTO) has a large dielectric permittivity that is
independent of the probing frequency near room temperature, a behavior that is
complicated by the existence of several dynamic processes. Here, we
consider the combined effects of localized charge carriers (polarons) and
thermally activated charge carriers using a recently proposed statistical model
to fit and understand the permittivity of CCTO measured at different
frequencies over the whole temperature range accessible by our experiments. We
found that the small permittivity at the lowest temperature is related to
polaron freezing, while at higher temperatures the rapid increase is associated
with the thermal excitation of polarons inducing the Maxwell-Wagner effect, and
the final increase of the permittivity is attributed to the thermally activated
conductivity. Such analysis enables us to separate the contributions from
localized polarons and conductive charge carriers and quantify their activation
energies.
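For context on the activation energies mentioned above (the specific functional form is a standard assumption for thermally activated processes, not a formula quoted from the abstract), such fits typically use an Arrhenius dependence on temperature:

    % Standard Arrhenius forms (assumed fitting relations): relaxation time of the
    % localized polarons and conductivity of the thermally activated carriers,
    % each with its own activation energy E_a.
    \tau(T) = \tau_0 \exp\!\left(\frac{E_a^{\mathrm{pol}}}{k_B T}\right),
    \qquad
    \sigma(T) = \sigma_0 \exp\!\left(-\frac{E_a^{\mathrm{cond}}}{k_B T}\right)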
Adversarial Robustness vs Model Compression, or Both?
It is well known that deep neural networks (DNNs) are vulnerable to
adversarial attacks, which are implemented by adding crafted perturbations onto
benign examples. Min-max robust optimization based adversarial training can
provide a notion of security against adversarial attacks. However, adversarial
robustness requires a significantly larger network capacity than natural
training with only benign examples. This paper proposes a
framework of concurrent adversarial training and weight pruning that enables
model compression while still preserving the adversarial robustness and
essentially tackles the dilemma of adversarial training. Furthermore, this work
studies two hypotheses about weight pruning in the conventional setting and
finds that weight pruning is essential for reducing the network model size in
the adversarial setting; training a small model from scratch, even with
initialization inherited from the large model, cannot achieve both adversarial
robustness and high standard accuracy. Code is available at
https://github.com/yeshaokai/Robustness-Aware-Pruning-ADMM.
Comment: Accepted by ICCV 201
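As a minimal sketch of what concurrent adversarial training and weight pruning can look like in code (the PGD attack settings, the fixed binary masks, and all hyperparameters below are assumptions for illustration, not the paper's exact algorithm), each update trains on adversarial examples and keeps pruned weights at zero:

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
        """Assumed inner maximization: L-infinity PGD around the benign inputs."""
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = (x_adv + alpha * grad.sign()).clamp(x - eps, x + eps).clamp(0, 1).detach()
        return x_adv

    def robust_pruned_step(model, masks, optimizer, x, y):
        """One concurrent step: train on adversarial examples, then re-apply the
        (assumed) fixed binary pruning masks so pruned weights stay zero."""
        optimizer.zero_grad()
        F.cross_entropy(model(pgd_attack(model, x, y)), y).backward()
        optimizer.step()
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name in masks:
                    p.mul_(masks[name])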
StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs
Weight pruning methods for DNNs have been demonstrated to achieve a good model
pruning rate without loss of accuracy, thereby alleviating the significant
computation/storage requirements of large-scale DNNs. Structured weight pruning
methods have been proposed to overcome the limitation of irregular network
structure and demonstrated actual GPU acceleration. However, in prior work the
pruning rate (degree of sparsity) and GPU acceleration are limited (to less
than 50%) when accuracy needs to be maintained. In this work, we overcome these
limitations by proposing a unified, systematic framework of structured weight
pruning for DNNs. It is a framework that can be used to induce different types
of structured sparsity, such as filter-wise, channel-wise, and shape-wise
sparsity, as well as non-structured sparsity. The proposed framework incorporates
stochastic gradient descent with ADMM, and can be understood as a dynamic
regularization method in which the regularization target is analytically
updated in each iteration. Without loss of accuracy on the AlexNet model, we
achieve 2.58X and 3.65X average measured speedup on two GPUs, clearly
outperforming the prior work. The average speedups reach 3.15X and 8.52X when
allowing a moderate accuracy loss of 2%. In this case, the model compression
for convolutional layers is 15.0X, corresponding to 11.93X measured CPU
speedup. Our experiments on the ResNet model and on other data sets such as
UCF101 and CIFAR-10 demonstrate the consistently higher performance of our
framework.
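As an illustration of one of the structured-sparsity types listed above (filter-wise sparsity; the L2-norm ranking criterion and the keep ratio below are assumptions about what the projection step can look like), whole convolutional filters can be zeroed out so that entire output channels disappear:

    import numpy as np

    def prune_filters(conv_w, keep_ratio=0.5):
        """Filter-wise structured pruning (sketch): conv_w has shape
        (out_channels, in_channels, kH, kW); keep the filters with the largest
        L2 norms and zero out the rest."""
        norms = np.linalg.norm(conv_w.reshape(conv_w.shape[0], -1), axis=1)
        n_keep = max(1, int(round(keep_ratio * conv_w.shape[0])))
        keep = np.argsort(norms)[-n_keep:]
        pruned = np.zeros_like(conv_w)
        pruned[keep] = conv_w[keep]
        return pruned

Because whole filters are removed rather than scattered weights, the remaining computation maps onto dense GPU kernels, which is what enables the measured speedups reported above.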
A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
Weight pruning methods for deep neural networks (DNNs) have been investigated
recently, but prior work in this area is mainly based on heuristic, iterative
pruning and thereby lacks guarantees on the weight reduction ratio and
convergence time.
To mitigate these limitations, we present a systematic weight pruning framework
for DNNs using the alternating direction method of multipliers (ADMM). We first
formulate the weight pruning problem of DNNs as a nonconvex optimization
problem with combinatorial constraints specifying the sparsity requirements,
and then adopt the ADMM framework for systematic weight pruning. By using ADMM,
the original nonconvex optimization problem is decomposed into two subproblems
that are solved iteratively. One of these subproblems can be solved using
stochastic gradient descent, while the other can be solved analytically.
Moreover, our method achieves a fast convergence rate.
The weight pruning results are very promising and consistently outperform the
prior work. On the LeNet-5 model for the MNIST data set, we achieve 71.2 times
weight reduction without accuracy loss. On the AlexNet model for the ImageNet
data set, we achieve 21 times weight reduction without accuracy loss. When we
focus on the convolutional layer pruning for computation reductions, we can
reduce the total computation by five times compared with the prior work
(achieving a total of 13.4 times weight reduction in convolutional layers). Our
models and codes are released at https://github.com/KaiqiZhang/admm-prunin
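As a hedged sketch of the two-subproblem alternation described above (the penalty parameter rho, the per-layer sparsity budgets, and the training-loop details below are assumptions; only the split into an SGD-solvable subproblem and an analytically solvable projection follows the abstract), one outer ADMM iteration can be organized as follows:

    import torch

    def project_sparse(w, k):
        """Analytical subproblem: Euclidean projection onto {W : ||W||_0 <= k},
        i.e. keep the k largest-magnitude entries of w and zero the rest."""
        flat = w.flatten()
        out = torch.zeros_like(flat)
        idx = flat.abs().topk(k).indices
        out[idx] = flat[idx]
        return out.view_as(w)

    def admm_outer_step(model, loss_fn, loader, Z, U, k, rho=1e-3, lr=1e-2):
        """One outer ADMM iteration (sketch): (1) SGD on loss + quadratic penalty,
        (2) analytical projection for the auxiliary variables Z, (3) dual update for U."""
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            for name, p in model.named_parameters():
                loss = loss + (rho / 2) * torch.sum((p - Z[name] + U[name]) ** 2)
            loss.backward()
            opt.step()
        with torch.no_grad():
            for name, p in model.named_parameters():
                Z[name] = project_sparse(p + U[name], k[name])   # solved in closed form
                U[name] = U[name] + p - Z[name]                  # dual variable update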
Progressive Weight Pruning of Deep Neural Networks using ADMM
Deep neural networks (DNNs), although achieving human-level performance in
many domains, have very large model sizes that hinder their broader
application on edge computing devices. Extensive research has been
conducted on DNN model compression or pruning. However, most of the previous
work took heuristic approaches. This work proposes a progressive weight pruning
approach based on ADMM (Alternating Direction Method of Multipliers), a
powerful technique to deal with non-convex optimization problems with
potentially combinatorial constraints. Motivated by dynamic programming, the
proposed method reaches an extremely high pruning rate by using partial prunings
with moderate pruning rates. Therefore, it resolves the accuracy degradation
and long convergence time problems when pursuing extremely high pruning ratios.
It achieves a pruning rate of up to 34 times for the ImageNet dataset and 167
times for the MNIST dataset, significantly higher than those reported in the
literature. Under the same number of epochs, the proposed method also
achieves faster convergence and higher compression rates. The codes and pruned
DNN models are released at bit.ly/2zxdls
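As a small worked example of reaching a high overall rate through partial prunings with moderate pruning rates (the geometric spacing of the intermediate targets is an assumption; only the idea of splitting one aggressive target into several moderate steps comes from the abstract):

    # Split an overall pruning rate into several moderate per-step rates whose
    # product recovers the target (geometric spacing is an assumption).
    def progressive_rates(overall_rate, num_steps):
        per_step = overall_rate ** (1.0 / num_steps)
        return [per_step] * num_steps

    rates = progressive_rates(34.0, 3)      # the 34x ImageNet target via three partial prunings
    print(rates)                            # ~3.24x per step
    print(rates[0] * rates[1] * rates[2])   # ~34.0 overall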