A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks
Deep neural networks (DNNs) have achieved significant success in a variety of
real-world applications, e.g., image classification. However, the very large
number of parameters in these networks limits their efficiency because of
the large model size and the intensive computation. To address this issue,
various approximation techniques have been investigated, which seek a
lightweight network that trades a little performance degradation for a smaller
model size or faster inference. Both low-rankness and sparsity are appealing
properties for the network approximation. In this paper we propose a unified
framework to compress the convolutional neural networks (CNNs) by combining
these two properties, while taking the nonlinear activation into consideration.
Each layer in the network is approximated by the sum of a structured sparse
component and a low-rank component, which is formulated as an optimization
problem. Then, an extended version of alternating direction method of
multipliers (ADMM) with guaranteed convergence is presented to solve the
relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet
and GoogLeNet with large image classification datasets. The results outperform
previous work in terms of accuracy degradation, compression rate and speedup
ratio. The proposed method is able to remarkably compress the model (with up to
4.9x reduction of parameters) with little or no loss of
accuracy.
Comment: 8 pages, 5 figures, 6 tables
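The idea of approximating a layer by a sparse component plus a low-rank component can be illustrated with a toy sketch. This is not the paper's ADMM-based method: it assumes a plain weight matrix `W`, uses a rank-1 term found by power iteration, and keeps the largest residual entries as an unstructured sparse term (the abstract describes a structured sparse one). All function names here are hypothetical.

```python
import math

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def rank1_approx(W, iters=100):
    """Rank-1 approximation of W via power iteration on W^T W.

    Assumes W is nonzero and the all-ones start vector is not
    orthogonal to the leading right singular vector.
    """
    Wt = [list(col) for col in zip(*W)]
    v = [1.0] * len(W[0])
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = matvec(W, v)  # singular value times left singular vector
    return [[ui * vj for vj in v] for ui in u]

def sparse_plus_low_rank(W, keep):
    """Split W into L (rank 1) and S (largest `keep` residual entries)."""
    L = rank1_approx(W)
    R = [[w - l for w, l in zip(rw, rl)] for rw, rl in zip(W, L)]
    magnitudes = sorted((abs(x) for row in R for x in row), reverse=True)
    # Ties at the threshold may keep slightly more than `keep` entries.
    thresh = magnitudes[keep - 1] if keep > 0 else float("inf")
    S = [[x if abs(x) >= thresh else 0.0 for x in row] for row in R]
    return L, S

def frobenius_error(W, parts):
    """Frobenius norm of W minus the element-wise sum of `parts`."""
    total = 0.0
    for i, row in enumerate(W):
        for j, w in enumerate(row):
            total += (w - sum(p[i][j] for p in parts)) ** 2
    return math.sqrt(total)
```

Calling `sparse_plus_low_rank(W, keep=2)` on a small matrix and comparing `frobenius_error(W, [L])` with `frobenius_error(W, [L, S])` shows how the sparse term absorbs what the low-rank term misses.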
Exponential Family Matrix Completion under Structural Constraints
We consider the matrix completion problem of recovering a structured matrix
from noisy and partial measurements. Recent works have proposed tractable
estimators with strong statistical guarantees for the case where the underlying
matrix is low-rank, and the measurements consist of a subset, either of the
exact individual entries, or of the entries perturbed by additive Gaussian
noise, which is thus implicitly suited for thin-tailed continuous data.
Arguably, common applications of matrix completion require estimators for (a)
heterogeneous data-types, such as skewed-continuous, count, binary, etc., (b)
for heterogeneous noise models (beyond Gaussian), which capture varied
uncertainty in the measurements, and (c) heterogeneous structural constraints
beyond low-rank, such as block-sparsity, or a superposition structure of
low-rank plus elementwise sparseness, among others. In this paper, we provide
a vastly unified framework for generalized matrix completion by considering a
matrix completion setting wherein the matrix entries are sampled from any
member of the rich family of exponential family distributions; and impose
general structural constraints on the underlying matrix, as captured by a
general regularizer. We propose a simple convex regularized
M-estimator for the generalized framework, and provide a unified and novel
statistical analysis for this general class of estimators. We finally
corroborate our theoretical results on simulated datasets.
Comment: 20 pages, 9 figures
FFT-Based Deep Learning Deployment in Embedded Systems
Deep learning has demonstrated its power in many application domains,
especially in image and speech recognition. As the backbone of deep learning,
deep neural networks (DNNs) consist of multiple layers of various types with
hundreds to thousands of neurons. Embedded platforms are now becoming essential
for deep learning deployment due to their portability, versatility, and energy
efficiency. The large model size of DNNs, while providing excellent accuracy,
also burdens the embedded platforms with intensive computation and storage.
Researchers have investigated reducing DNN model size with negligible
accuracy loss. This work proposes a Fast Fourier Transform (FFT)-based DNN
training and inference model suitable for embedded platforms with reduced
asymptotic complexity of both computation and storage, which distinguishes our
approach from existing approaches. We develop the training and inference
algorithms based on FFT as the computing kernel and deploy the FFT-based
inference model on embedded platforms achieving extraordinary processing speed.
Comment: Design, Automation, and Test in Europe (DATE). For source code, please
contact Mahdi Nazemi at <[email protected]
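The abstract does not say how the FFT enters the layers. One common way an FFT can lower both computation and storage, assumed here purely for illustration, is to constrain a weight matrix to be circulant: it is then stored as a single vector of length n and applied in O(n log n) instead of O(n^2). The sketch below uses a textbook radix-2 FFT (n must be a power of two); the function names are hypothetical and this is not the authors' code.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; the inverse is left unnormalized."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def circulant_matvec_fft(c, x):
    """Compute y = C x, where C is circulant with first column c."""
    n = len(c)
    # Circular convolution becomes an element-wise product in frequency space.
    prod = [a * b for a, b in zip(fft(c), fft(x))]
    return [v.real / n for v in fft(prod, inverse=True)]

def circulant_matvec_direct(c, x):
    """Reference O(n^2) product using C[i][j] = c[(i - j) mod n]."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]
```

Comparing `circulant_matvec_fft(c, x)` with `circulant_matvec_direct(c, x)` on a short vector is a quick way to check that the two agree up to floating-point error.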