A spectral multi-resolution image encoding network
After a short introduction to traditional image transform coding, multirate systems, and multiscale signal coding, the paper focuses on image encoding by a neural network. Taking noise into account as well, a network model is proposed that not only learns the optimal localized basis functions for the transform but also learns to implement a whitening filter by multi-resolution encoding. A simulation showing the multi-resolution capabilities concludes the contribution.
SparseNN: An Energy-Efficient Neural Network Accelerator Exploiting Input and Output Sparsity
Contemporary deep neural networks (DNNs) contain millions of synaptic
connections with tens to hundreds of layers. The large computation and memory
requirements pose a challenge to the hardware design. In this work, we leverage
the intrinsic activation sparsity of DNNs to substantially reduce the execution
cycles and the energy consumption. An end-to-end training algorithm is proposed
to develop a lightweight run-time predictor for the output activation sparsity
on the fly. From our experimental results, the computation overhead of the
prediction phase can be reduced to less than 5% of the original feedforward
phase with negligible accuracy loss. Furthermore, an energy-efficient hardware
architecture, SparseNN, is proposed to exploit both the input and output
sparsity. SparseNN is a scalable architecture with distributed memories and
processing elements connected through a dedicated on-chip network. Compared
with the state-of-the-art accelerators which only exploit the input sparsity,
SparseNN can achieve a 10%-70% improvement in throughput and a power reduction
of around 50%.
Model compression as constrained optimization, with application to neural nets. Part II: quantization
We consider the problem of deep neural net compression by quantization: given
a large, reference net, we want to quantize its real-valued weights using a
codebook with K entries so that the training loss of the quantized net is
minimal. The codebook can be optimally learned jointly with the net, or fixed,
as for binarization or ternarization approaches. Previous work has quantized
the weights of the reference net, or incorporated rounding operations in the
backpropagation algorithm, but this has no guarantee of converging to a
loss-optimal, quantized net. We describe a new approach based on the recently
proposed framework of model compression as constrained optimization
\citep{Carreir17a}. This results in a simple iterative "learning-compression"
algorithm, which alternates a step that learns a net of continuous weights with
a step that quantizes (or binarizes/ternarizes) the weights, and is guaranteed
to converge to a local optimum of the loss for quantized nets. We develop
algorithms for an adaptive codebook or a (partially) fixed codebook. The latter
includes binarization, ternarization, powers-of-two and other important
particular cases. We show experimentally that we can achieve much higher
compression rates than previous quantization work (even using just 1 bit per
weight) with negligible loss degradation.
Comment: 33 pages, 15 figures, 3 tables
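The fixed-codebook case mentioned in the abstract (binarization with a scale) can be illustrated with a small sketch. This assumes the compression step is a least-squares projection of the weights onto the set {-a, +a}, for which the standard closed-form solution is the sign of each weight and a scale equal to the mean absolute value. The function name `binarize_with_scale` is hypothetical, and the paper's own procedure may differ in detail.

```python
import numpy as np

def binarize_with_scale(w):
    """Least-squares projection of w onto {-a, +a}^n (illustrative only).

    Minimizing ||w - a*s||^2 over a > 0 and s in {-1, +1}^n gives
    s = sign(w) and a = mean(|w|).
    """
    a = float(np.mean(np.abs(w)))            # optimal scale
    signs = np.where(w >= 0, 1.0, -1.0)      # optimal sign pattern (ties go to +1)
    return a, a * signs

# Example: four real-valued weights stored as one scale plus 1 bit each.
w = np.array([0.9, -0.2, 0.4, -1.1])
a, w_binarized = binarize_with_scale(w)
print(a, w_binarized)
```

In a learning-compression loop, a projection of this kind would be applied to the continuous weights after each learning step.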
Automated Pruning for Deep Neural Network Compression
In this work we present a method to improve the pruning step of the current
state-of-the-art methodology to compress neural networks. The novelty of the
proposed pruning technique is in its differentiability, which allows pruning to
be performed during the backpropagation phase of the network training. This
enables end-to-end learning and strongly reduces the training time. The
technique is based on a family of differentiable pruning functions and a new
regularizer specifically designed to enforce pruning. The experimental results
show that the joint optimization of both the thresholds and the network weights
makes it possible to reach a higher compression rate, reducing the number of weights of
the pruned network by a further 14% to 33% compared to the current
state-of-the-art. Furthermore, we believe that this is the first study where
the generalization capabilities in transfer learning tasks of the features
extracted by a pruned network are analyzed. To achieve this goal, we show that
the representations learned using the proposed pruning methodology maintain the
same effectiveness and generality as those learned by the corresponding
non-compressed network on a set of different recognition tasks.
Comment: 8 pages, 5 figures. Published as a conference paper at ICPR 201
FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks
In this paper, a neural network based real-time speech recognition (SR)
system is developed using an FPGA for very low-power operation. The implemented
system employs two recurrent neural networks (RNNs); one is a
speech-to-character RNN for acoustic modeling (AM) and the other is for
character-level language modeling (LM). The system also employs a statistical
word-level LM to improve the recognition accuracy. The results of the AM, the
character-level LM, and the word-level LM are combined using a fairly simple
N-best search algorithm instead of the hidden Markov model (HMM) based network.
The RNNs are implemented using massively parallel processing elements (PEs) for
low latency and high throughput. The weights are quantized to 6 bits to store
all of them in the on-chip memory of an FPGA. The proposed algorithm is
implemented on a Xilinx XC7Z045, and the system can operate much faster than
real-time.
Comment: Accepted to SiPS 201
Bitwise Neural Networks
Based on the assumption that there exists a neural network that efficiently
represents a set of Boolean functions between all binary inputs and outputs, we
propose a process for developing and deploying neural networks whose weight
parameters, bias terms, input, and intermediate hidden layer output signals,
are all binary-valued, and require only basic bit logic for the feedforward
pass. The proposed Bitwise Neural Network (BNN) is especially suitable for
resource-constrained environments, since it replaces either floating or
fixed-point arithmetic with significantly more efficient bitwise operations.
Hence, the BNN requires far less spatial complexity, less memory bandwidth, and
less power consumption in hardware. In order to design such networks, we
propose to add a few training schemes, such as weight compression and noisy
backpropagation, which result in a bitwise network that performs almost as well
as its corresponding real-valued network. We test the proposed network on the
MNIST dataset, represented using binary features, and show that BNNs result in
competitive performance while offering dramatic computational savings.
Comment: This paper was presented at the International Conference on Machine
Learning (ICML) Workshop on Resource-Efficient Machine Learning, Lille,
France, Jul. 6-11, 201
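To illustrate why binary-valued weights and signals need only basic bit logic, the sketch below computes the dot product of two ±1 vectors from packed bits with an XOR and a population count. It assumes +1 is encoded as bit 1 and -1 as bit 0. The helper names (`pack_bits`, `bitwise_dot`, `bitwise_neuron`) are hypothetical and not taken from the paper.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def bitwise_dot(x_bits, w_bits, n):
    """Dot product of two length-n +1/-1 vectors given as packed bits.

    Each position where the bits differ contributes -1, each position where
    they agree contributes +1, so dot = n - 2 * popcount(x XOR w).
    """
    disagreements = bin(x_bits ^ w_bits).count("1")
    return n - 2 * disagreements

def bitwise_neuron(x_bits, w_bits, n, bias=0):
    """Sign activation on the bitwise dot product; returns +1 or -1."""
    return 1 if bitwise_dot(x_bits, w_bits, n) + bias >= 0 else -1

x = [1, -1, 1, 1]
w = [1, 1, -1, 1]
print(bitwise_dot(pack_bits(x), pack_bits(w), len(x)))
```

The result matches the ordinary sum of elementwise products, but uses no multiplications.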
Winner-Relaxing Self-Organizing Maps
A new family of self-organizing maps, the Winner-Relaxing Kohonen Algorithm,
is introduced as a generalization of a variant given by Kohonen in 1991. The
magnification behaviour is calculated analytically. For the original variant a
magnification exponent of 4/7 is derived; the generalized version allows the
magnification to be steered over the wide range from exponent 1/2 to 1 in the
one-dimensional case, and thus provides optimal mapping in the sense of information
theory. The Winner Relaxing Algorithm requires minimal extra computations per
learning step and is conveniently easy to implement.
Comment: 14 pages (6 figs included). To appear in Neural Computation
Rough Neural Networks Architecture For Improving Generalization In Pattern Recognition
Neural networks are attractive trainable machines for pattern recognition.
Their capability to accommodate a wide variety and variability of
conditions, and their ability to imitate brain functions, make them a popular research
area.
This research focuses on developing hybrid rough neural networks. These novel
approaches are expected to provide superior performance in detection
and automatic target recognition.
In this thesis, hybrid architectures of rough set theory and neural networks have been
investigated, developed, and implemented. The first hybrid approach provides a novel
neural network referred to as the Rough Shared weight Neural Network (RSNN). It applies
the concept of approximation based on rough neurons to feature extraction, and
employs weight sharing. The network has two stages: a feature
extraction network and a classification network. The extraction network is
composed of rough neurons that account for the upper and lower approximations
and embed a membership function to replace ordinary activation functions. The
neural network learns the rough set’s upper and lower approximations as feature
extractors simultaneously with classification. The RSNN implements a novel
approximation transform. The basic design of the network is provided together with
the learning rules. The architecture provides a novel approach to pattern recognition
and is expected to be robust to any pattern recognition problem.
The second hybrid approach consists of two stand-alone subsystems, referred to as Rough
Neural Networks (RNN). The extraction network extracts detectors that represent
the pattern classes to be supplied to the classification network. It works as a filter for
the original distilled features based on equivalence relations and rough set reduction,
while the second subsystem is responsible for classifying the outputs of the first.
The two approaches were applied to image pattern recognition problems. The RSNN
was applied to an automatic target recognition problem. The data are Synthetic Aperture
Radar (SAR) image scenes of tanks and background. The RSNN provides a novel
methodology for designing nonlinear filters without prior knowledge of the problem domain.
The RNN was used to detect patterns present in satellite images. A novel
feature extraction algorithm was developed to extract the feature vectors. The
algorithm enhances the recognition ability of the system compared to manual
extraction and labeling of pattern classes. The performance of the rough
backpropagation network is improved compared to backpropagation with the same
architecture. The network has been designed to produce a detection plane for the
desired pattern.
The hybrid approaches developed in this thesis provide novel techniques for
recognizing static and dynamic representations of patterns. In both domains, rough
set theory improved the generalization of the neural network paradigms. The
methodologies are theoretically robust to any pattern recognition problem, and are
demonstrated in practice for image environments.
A network of spiking neurons for computing sparse representations in an energy efficient way
Computing sparse redundant representations is an important problem both in
applied mathematics and neuroscience. In many applications, this problem must
be solved in an energy efficient way. Here, we propose a hybrid distributed
algorithm (HDA), which solves this problem on a network of simple nodes
communicating via low-bandwidth channels. HDA nodes perform both
gradient-descent-like steps on analog internal variables and
coordinate-descent-like steps via quantized external variables communicated to
each other. Interestingly, such operation is equivalent to a network of
integrate-and-fire neurons, suggesting that HDA may serve as a model of neural
computation. We show that the numerical performance of HDA is on par with
existing algorithms. In the asymptotic regime the representation error of HDA
decays with time, t, as 1/t. HDA is stable against time-varying noise,
specifically, the representation error decays as 1/sqrt(t) for Gaussian white
noise.
Comment: 5 figures. Early Access:
http://www.mitpressjournals.org/doi/abs/10.1162/NECO_a_0035
A flexible, extensible software framework for model compression based on the LC algorithm
We propose a software framework based on the ideas of the
Learning-Compression (LC) algorithm, that allows a user to compress a neural
network or other machine learning model using different compression schemes
with minimal effort. Currently, the supported compressions include pruning,
quantization, low-rank methods (including automatically learning the layer
ranks), and combinations of those, and the user can choose different
compression types for different parts of a neural network.
The LC algorithm alternates two types of steps until convergence: a learning
(L) step, which trains a model on a dataset (using an algorithm such as SGD);
and a compression (C) step, which compresses the model parameters (using a
compression scheme such as low-rank or quantization). This decoupling of the
"machine learning" aspect from the "signal compression" aspect means that
changing the model or the compression type amounts to calling the corresponding
subroutine in the L or C step, respectively. The library fully supports this by
design, which makes it flexible and extensible. This does not come at the
expense of performance: the runtime needed to compress a model is comparable to
that of training the model in the first place; and the compressed model is
competitive in terms of prediction accuracy and compression ratio with other
algorithms (which are often specialized for specific models or compression
schemes). The library is written in Python and PyTorch and is available on GitHub.
Comment: 15 pages, 4 figures, 2 tables
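The alternation of L and C steps described above can be sketched on a toy problem. The example below is not the library's API. It is a minimal illustration that assumes a least-squares model (so the L step has a closed form), a quadratic-penalty coupling between the weights and their compressed version, and a hypothetical schedule of increasing penalty values `mus`. The names `lc_compress` and `quantize_to_codebook` are invented for this sketch.

```python
import numpy as np

def quantize_to_codebook(w, codebook):
    """C step (one possible choice): map each weight to its nearest codebook entry."""
    idx = np.abs(w[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook[idx]

def lc_compress(X, y, compress, mus=(1e-2, 1e-1, 1.0, 10.0, 100.0)):
    """Alternate learning and compression steps for a least-squares model.

    L step: minimize 0.5*||Xw - y||^2 + 0.5*mu*||w - w_c||^2 over w (closed form).
    C step: w_c = compress(w).
    """
    d = X.shape[1]
    A, b = X.T @ X, X.T @ y
    w = np.linalg.lstsq(X, y, rcond=None)[0]   # uncompressed reference model
    w_c = compress(w)                          # initial compression
    for mu in mus:                             # penalty grows, pulling w toward w_c
        w = np.linalg.solve(A + mu * np.eye(d), b + mu * w_c)  # L step
        w_c = compress(w)                                      # C step
    return w_c

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -1.0, 0.5])
codebook = np.array([-1.0, 0.0, 1.0])          # ternary codebook, fixed in advance
print(lc_compress(X, y, lambda w: quantize_to_codebook(w, codebook)))
```

Because the compression scheme enters only through the `compress` argument, swapping quantization for another scheme changes a single call, which mirrors the decoupling the abstract describes.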