Deep Log-Likelihood Ratio Quantization
In this work, a deep learning-based method for log-likelihood ratio (LLR)
lossy compression and quantization is proposed, with emphasis on a single-input
single-output uncorrelated fading communication setting. A deep autoencoder
network is trained to compress, quantize and reconstruct the bit log-likelihood
ratios corresponding to a single transmitted symbol. Specifically, the encoder
maps to a latent space whose dimension equals the number of sufficient
statistics required to recover the inputs (three in this case), while the
decoder reconstructs the inputs from a noisy version of the latent
representation, with the added noise modeling quantization effects in a
differentiable way.
Simulation results show that, when applied to a standard rate-1/2 low-density
parity-check (LDPC) code, a finite-precision compression factor of nearly three
is achieved when storing an entire codeword, with a performance loss below
0.1 dB compared to straightforward scalar quantization of the log-likelihood
ratios.
Comment: Accepted for publication at EUSIPCO 2019. Camera-ready version.
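
A minimal sketch of this idea, assuming PyTorch; the layer widths, number of
LLRs per symbol, and noise level are illustrative choices, not the authors'
exact architecture:

    import torch
    import torch.nn as nn

    class LLRAutoencoder(nn.Module):
        """Compress per-symbol bit LLRs to a 3-dim latent; additive noise
        in the bottleneck models quantization differentiably."""
        def __init__(self, n_llrs=6, latent_dim=3, noise_std=0.1):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(n_llrs, 32), nn.ReLU(), nn.Linear(32, latent_dim))
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, n_llrs))
            self.noise_std = noise_std

        def forward(self, llrs):
            z = self.encoder(llrs)
            if self.training:                  # noise stands in for quantization
                z = z + self.noise_std * torch.randn_like(z)
            return self.decoder(z)

    model = LLRAutoencoder()
    llrs = torch.randn(128, 6)                 # e.g. 64-QAM: 6 LLRs per symbol
    loss = nn.functional.mse_loss(model(llrs), llrs)
    loss.backward()
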
Deep Learning-Based Quantization of L-Values for Gray-Coded Modulation
In this work, a deep learning-based quantization scheme for log-likelihood
ratio (L-value) storage is introduced. We analyze the dependency between the
average magnitude of different L-values from the same quadrature amplitude
modulation (QAM) symbol and show that they follow a consistent ordering. Based
on this, we design a deep autoencoder that jointly compresses and separately
reconstructs each L-value, allowing the use of a weighted loss function that
aims to reconstruct low-magnitude inputs more accurately. Our method is shown
to be competitive with state-of-the-art maximum mutual information quantization
schemes, reducing the required memory footprint by a factor of up to two, with
a performance loss smaller than 0.1 dB at less than two effective bits per
L-value, or smaller than 0.04 dB at 2.25 effective bits. We experimentally
show that our proposed method is a universal compression scheme, in the sense
that after training on an LDPC-coded Rayleigh fading scenario we can reuse the
same network, without further training, on other channel models and codes while
preserving the same performance benefits.
Comment: Submitted to IEEE Globecom 2019.
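
One plausible form of such a weighted reconstruction loss, sketched in
PyTorch; the inverse-magnitude weighting and the eps constant are assumptions
for illustration, not the paper's exact function:

    import torch

    def weighted_llr_loss(recon, target, eps=1.0):
        """Penalize errors on low-magnitude L-values more heavily, since
        values near zero are the least reliable and the most sensitive to
        quantization error (illustrative weighting)."""
        weights = 1.0 / (target.abs() + eps)   # larger weight near |L| = 0
        return (weights * (recon - target) ** 2).mean()

    recon = torch.randn(128, 6, requires_grad=True)
    target = torch.randn(128, 6) * 5.0
    weighted_llr_loss(recon, target).backward()
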
Reducing the Model Order of Deep Neural Networks Using Information Theory
Deep neural networks are typically represented by a much larger number of
parameters than shallow models, making them prohibitive for small-footprint
devices. Recent research shows that there is considerable redundancy in the
parameter space of deep neural networks. In this paper, we propose a method to
compress deep neural networks by using the Fisher Information metric, which we
estimate through a stochastic optimization method that keeps track of
second-order information in the network. We first remove unimportant parameters
and then use non-uniform fixed point quantization to assign more bits to
parameters with higher Fisher Information estimates. We evaluate our method on
a classification task with a convolutional neural network trained on the MNIST
data set. Experimental results show that our method outperforms existing
methods for both network pruning and quantization.
Comment: To appear in ISVLSI 2016 special session.
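
A rough sketch of the pipeline under stated assumptions: the diagonal Fisher
Information is approximated by averaged squared per-sample gradients, and the
bit-allocation tiers are illustrative:

    import numpy as np

    def fisher_prune_quantize(weights, grads, prune_frac=0.5, bit_levels=(2, 4, 8)):
        """Prune low-Fisher weights, then give high-Fisher weights more
        quantization bits (non-uniform fixed point, illustrative)."""
        fisher = np.mean(grads ** 2, axis=0)         # per-parameter estimate
        keep = fisher >= np.quantile(fisher, prune_frac)
        w = np.where(keep, weights, 0.0)
        # split surviving weights into Fisher terciles, low -> fewer bits
        edges = np.quantile(fisher[keep], [1 / 3, 2 / 3])
        bits = np.select([fisher < edges[0], fisher < edges[1]],
                         bit_levels[:2], default=bit_levels[2])
        scale = np.abs(w).max() or 1.0
        levels = 2.0 ** (bits - 1)
        return np.round(w / scale * levels) / levels * scale

    w = np.random.randn(1000)
    g = np.random.randn(64, 1000)                    # per-sample gradients
    w_q = fisher_prune_quantize(w, g)
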
BitNet: Bit-Regularized Deep Neural Networks
We present a novel optimization strategy for training neural networks which
we call "BitNet". The parameters of neural networks are usually unconstrained
and have a dynamic range dispersed over all real values. Our key idea is to
limit the expressive power of the network by dynamically controlling the range
and set of values that the parameters can take. We formulate this idea using a
novel end-to-end approach that circumvents the discrete parameter space by
optimizing a relaxed continuous and differentiable upper bound of the typical
classification loss function. The approach can be interpreted as a
regularization inspired by the Minimum Description Length (MDL) principle. For
each layer of the network, our approach optimizes real-valued translation and
scaling factors and arbitrary precision integer-valued parameters (weights). We
empirically compare BitNet to an equivalent unregularized model on the MNIST
and CIFAR-10 datasets. We show that BitNet converges faster to a superior
quality solution. Additionally, the resulting model has significant savings in
memory due to the use of integer-valued parameters.
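
A hedged sketch of bit-constrained parameters in PyTorch; the straight-through
estimator used here is a common stand-in, not the paper's continuous
upper-bound relaxation:

    import torch

    def bit_constrain(w, n_bits=4):
        """Project real weights onto an integer grid via an affine map
        w_hat = t + s * q, with q an n_bits integer; a straight-through
        estimator keeps the op differentiable (illustrative, not the
        paper's exact formulation)."""
        t, s = w.mean(), w.std() + 1e-8
        qmax = 2 ** (n_bits - 1) - 1
        q = torch.clamp(torch.round((w - t) / s), -qmax - 1, qmax)
        w_hat = t + s * q
        return w + (w_hat - w).detach()        # straight-through gradient

    w = torch.randn(256, 256, requires_grad=True)
    loss = bit_constrain(w).pow(2).sum()
    loss.backward()
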
Learning Hash Codes via Hamming Distance Targets
We present a powerful new loss function and training scheme for learning
binary hash codes with any differentiable model and similarity function. Our
loss function improves over prior methods by using log likelihood loss on top
of an accurate approximation for the probability that two inputs fall within a
Hamming distance target. Our novel training scheme obtains a good estimate of
the true gradient by better sampling inputs and evaluating loss terms between
all pairs of inputs in each minibatch. To fully leverage the resulting hashes,
we use multi-indexing. We demonstrate that these techniques provide large
improvements on similarity search tasks. We report the best results to date
on competitive information retrieval tasks for ImageNet and SIFT 1M, improving
MAP from 73% to 84% and reducing query cost by a factor of 2-8, respectively.
Comment: 8 pages; overhaul of our previous submission, Convolutional Hashing
for Automated Scene Matching.
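
A sketch of one way such a probability approximation and log-likelihood loss
could look; the Gaussian approximation to the Poisson-binomial Hamming
distance is an assumption for illustration, not necessarily the paper's exact
approximation:

    import torch

    def hamming_target_loss(u, v, match, t=8):
        """u, v: sigmoid outputs in (0,1), one row per input; bit i
        disagrees with prob p_i = u_i(1-v_i) + (1-u_i)v_i. Approximate
        the resulting Hamming distance with a Gaussian and apply a log
        likelihood loss on P(distance <= t)."""
        p = u * (1 - v) + (1 - u) * v
        mean, var = p.sum(-1), (p * (1 - p)).sum(-1) + 1e-6
        normal = torch.distributions.Normal(mean, var.sqrt())
        p_within = normal.cdf(torch.tensor(t + 0.5)).clamp(1e-6, 1 - 1e-6)
        return -(match * p_within.log()
                 + (1 - match) * (1 - p_within).log()).mean()

    u, v = torch.rand(32, 64), torch.rand(32, 64)    # 64-bit codes
    match = torch.randint(0, 2, (32,)).float()       # 1 = similar pair
    loss = hamming_target_loss(u, v, match)
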
2PFPCE: Two-Phase Filter Pruning Based on Conditional Entropy
Deep Convolutional Neural Networks (CNNs) offer remarkable performance on
classification and regression in many high-dimensional problems and have been
widely utilized in real-world cognitive applications. However, the high
computational cost of CNNs greatly hinders their deployment in
resource-constrained applications, real-time systems, and edge computing
platforms. To overcome this challenge, we propose a novel filter-pruning
framework, two-phase filter pruning based on conditional entropy, namely
2PFPCE, to compress CNN models and reduce inference time with marginal
performance degradation. In our proposed method, we formulate the filter
pruning process as an optimization problem and propose a novel filter
selection criterion measured by conditional entropy. Based on the assumption
that the representation of neurons should be evenly distributed, we also
develop a maximum-entropy filter-freeze technique that can reduce overfitting.
Two filter pruning strategies, global and layer-wise, are compared. Our
experimental results show that combining the two strategies achieves a higher
neural network compression ratio than applying either alone under the same
accuracy-drop threshold. Two-phase pruning, that is, combining both
strategies, achieves a 10x FLOPs reduction and a 46% inference time reduction
on VGG-16, with a 2% accuracy drop.
Comment: 8 pages, 6 figures.
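
An illustrative estimator of the conditional entropy of labels given one
filter's pooled activations; the quantile binning is an assumption for
illustration, not the paper's exact criterion:

    import numpy as np

    def conditional_entropy(acts, labels, n_bins=16):
        """H(Y | A) for one filter: bin its pooled activations A and
        average the label entropy within each bin. Lower values mean the
        filter is more informative about the class."""
        edges = np.quantile(acts, np.linspace(0, 1, n_bins)[1:-1])
        bins = np.digitize(acts, edges)
        h, n = 0.0, len(labels)
        for b in np.unique(bins):
            ys = labels[bins == b]
            p = np.bincount(ys) / len(ys)
            p = p[p > 0]
            h += len(ys) / n * -(p * np.log2(p)).sum()
        return h

    acts = np.random.randn(10000)              # pooled activations of a filter
    labels = np.random.randint(0, 10, 10000)
    # prune filters whose conditional entropy is highest (least informative)
    print(conditional_entropy(acts, labels))
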
Discretely Relaxing Continuous Variables for tractable Variational Inference
We explore a new research direction in Bayesian variational inference with
discrete latent variable priors where we exploit Kronecker matrix algebra for
efficient and exact computations of the evidence lower bound (ELBO). The
proposed "DIRECT" approach has several advantages over its predecessors; (i) it
can exactly compute ELBO gradients (i.e. unbiased, zero-variance gradient
estimates), eliminating the need for high-variance stochastic gradient
estimators and enabling the use of quasi-Newton optimization methods; (ii) its
training complexity is independent of the number of training points, permitting
inference on large datasets; and (iii) its posterior samples consist of sparse
and low-precision quantized integers which permit fast inference on hardware
limited devices. In addition, our DIRECT models can exactly compute statistical
moments of the parameterized predictive posterior without relying on Monte
Carlo sampling. While the DIRECT approach is not practical for all
likelihoods, we identify a popular model structure for which it is, and
demonstrate accurate inference using latent variables discretized as extremely
low-precision 4-bit quantized integers. While the ELBO computations considered
in the numerical studies require over 10^2352 log-likelihood evaluations, we
train on datasets with over two million points in just seconds.
Comment: Appears in the proceedings of Advances in Neural Information
Processing Systems (NeurIPS), 2018. Full code is available at
https://github.com/treforevans/direc
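
The Kronecker algebra that makes this scale is beyond a short sketch, but the
core property, an ELBO that is a finite sum over a discrete support and
therefore has exact, zero-variance gradients, can be shown for a single latent
variable (toy model, not the paper's implementation):

    import torch

    # One discrete latent z on a 4-bit grid; q(z) is a learned categorical.
    support = torch.linspace(-3, 3, 16)              # 4-bit quantized values
    logits = torch.zeros(16, requires_grad=True)

    def log_joint(z):                                # toy model: log p(x, z)
        x = torch.tensor(1.5)
        return -0.5 * z**2 - 0.5 * (x - z)**2        # log p(z) + log p(x|z)

    q = torch.softmax(logits, dim=0)
    elbo = (q * (log_joint(support) - q.log())).sum()  # exact finite sum
    elbo.backward()                                  # exact, zero-variance gradient
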
Entropy-Constrained Training of Deep Neural Networks
We propose a general framework for neural network compression that is
motivated by the Minimum Description Length (MDL) principle. For that we first
derive an expression for the entropy of a neural network, which measures its
complexity explicitly in terms of its bit-size. Then, we formalize the problem
of neural network compression as an entropy-constrained optimization objective.
This objective generalizes many of the compression techniques proposed in the
literature, in that pruning or reducing the cardinality of the weight elements
of the network can be seen as special cases of entropy-minimization
techniques. Furthermore, we derive a continuous relaxation of the objective,
which allows us to minimize it using gradient-based optimization techniques.
Finally, we show that we can reach state-of-the-art compression results on
different network architectures and data sets, e.g. achieving 71x compression
gains on a VGG-like architecture.
Comment: 8 pages, 6 figures.
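
A minimal sketch of an entropy term of this flavor; the soft-histogram
estimate, codebook, and penalty weight are illustrative assumptions, not the
paper's exact formulation:

    import torch

    def weight_entropy(w, centers, temp=0.1):
        """Differentiable estimate of the entropy of the empirical weight
        distribution: soft-assign each weight to codebook centers, then
        take the entropy of the average assignment."""
        d = -(w.reshape(-1, 1) - centers) ** 2 / temp   # soft assignments
        p = torch.softmax(d, dim=1).mean(dim=0)         # center probabilities
        return -(p * (p + 1e-12).log()).sum()

    w = torch.randn(4096, requires_grad=True)
    centers = torch.linspace(-1, 1, 8)                  # 3-bit codebook
    task_loss = w.pow(2).mean()                         # stand-in for the real loss
    loss = task_loss + 0.01 * weight_entropy(w, centers)
    loss.backward()
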
Structured Probabilistic Pruning for Convolutional Neural Network Acceleration
In this paper, we propose a novel progressive parameter pruning method for
Convolutional Neural Network acceleration, named Structured Probabilistic
Pruning (SPP), which effectively prunes weights of convolutional layers in a
probabilistic manner. Unlike existing deterministic pruning approaches, where
unimportant weights are permanently eliminated, SPP introduces a pruning
probability for each weight, and pruning is guided by sampling from the pruning
probabilities. A mechanism is designed to increase and decrease pruning
probabilities based on importance criteria in the training process. Experiments
show that, with 4x speedup, SPP can accelerate AlexNet with only 0.3% loss of
top-5 accuracy and VGG-16 with 0.8% loss of top-5 accuracy in ImageNet
classification. Moreover, SPP can be directly applied to accelerate
multi-branch CNNs, such as ResNet, without specific adaptations. Our
2x-speedup ResNet-50 suffers only a 0.8% loss of top-5 accuracy on ImageNet.
We further show the effectiveness of SPP on transfer learning tasks.
Comment: CNN model acceleration, 13 pages, 6 figures, accepted by Proceedings
of the British Machine Vision Conference (BMVC), 2018, oral.
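
A toy sketch of one SPP-style step under stated assumptions; the
magnitude-based importance criterion and the probability update rule are
placeholders for the paper's criteria:

    import numpy as np

    def spp_step(weights, prune_prob, target_sparsity=0.5, delta=0.05):
        """Sample a pruning mask from per-weight probabilities, then nudge
        probabilities up for unimportant weights (small |w|) and down for
        important ones (illustrative update rule)."""
        mask = np.random.rand(*weights.shape) >= prune_prob  # True = keep
        rank = np.argsort(np.argsort(np.abs(weights)))       # 0 = smallest |w|
        unimportant = rank < target_sparsity * weights.size
        prune_prob = np.clip(
            prune_prob + np.where(unimportant, delta, -delta), 0, 1)
        return weights * mask, prune_prob

    w = np.random.randn(64, 3, 3, 3).ravel()          # conv filter weights
    p = np.full(w.shape, 0.5)
    w_masked, p = spp_step(w, p)
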
Local Feature Detectors, Descriptors, and Image Representations: A Survey
With the advances in both stable interest region detectors and robust and
distinctive descriptors, local feature-based image or object retrieval has
become a popular research topic. Every local feature-based image retrieval
system involves two important processes: local feature extraction and image
representation. The other key technology for image retrieval systems is image
representation, such as the bag-of-visual-words (BoVW), Fisher vector, or
Vector of Locally Aggregated Descriptors (VLAD) framework. In this paper, we
review local features and image representations for image retrieval. Because
a great many methods have been proposed in this area, we group them into
several classes and summarize them. In addition, recent deep learning-based
approaches for image retrieval are briefly reviewed.
Comment: 20 pages.
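
For reference, a compact VLAD encoder, one of the image representations the
survey covers; the descriptor and codebook sizes here are illustrative:

    import numpy as np

    def vlad(descriptors, centroids):
        """Vector of Locally Aggregated Descriptors: sum residuals of each
        local descriptor to its nearest visual word, then normalize."""
        d2 = ((descriptors[:, None, :] - centroids[None]) ** 2).sum(-1)
        nearest = d2.argmin(axis=1)
        v = np.zeros_like(centroids)
        for k in range(len(centroids)):
            if (nearest == k).any():
                v[k] = (descriptors[nearest == k] - centroids[k]).sum(axis=0)
        v = np.sign(v) * np.sqrt(np.abs(v))           # power normalization
        return (v / (np.linalg.norm(v) + 1e-12)).ravel()

    descs = np.random.randn(500, 128)                 # e.g. SIFT descriptors
    words = np.random.randn(16, 128)                  # k-means visual words
    print(vlad(descs, words).shape)                   # (16 * 128,)
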