18 research outputs found
Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks
Deep neural networks (DNNs) have become a widely deployed model for numerous
machine learning applications. However, their fixed architecture, substantial
training cost, and significant model redundancy make it difficult to
efficiently update them to accommodate previously unseen data. To solve these
problems, we propose an incremental learning framework based on a
grow-and-prune neural network synthesis paradigm. When new data arrive, the
neural network first grows new connections based on the gradients to increase
the network capacity to accommodate new data. Then, the framework iteratively
prunes away connections based on the magnitude of weights to enhance network
compactness, and hence recover efficiency. Finally, the model rests at a
lightweight DNN that is both ready for inference and suitable for future
grow-and-prune updates. The proposed framework improves accuracy, shrinks
network size, and significantly reduces the additional training cost for
incoming data compared to conventional approaches, such as training from
scratch and network fine-tuning. For the LeNet-300-100 and LeNet-5 neural
network architectures derived for the MNIST dataset, the framework reduces
training cost by up to 64% (63%) and 67% (63%) compared to training from
scratch (network fine-tuning), respectively. For the ResNet-18 architecture
derived for the ImageNet dataset and DeepSpeech2 for the AN4 dataset, the
corresponding training cost reductions against training from scratch (network
fine-tuning) are 64% (60%) and 67% (62%), respectively. Our derived models
contain fewer network parameters but achieve higher accuracy relative to
conventional baselines.
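The grow-and-prune update described in this abstract reduces to two mask operations: activate dormant connections whose gradients are largest, then deactivate active connections whose weights are smallest. The PyTorch sketch below illustrates the idea under our own assumptions (growth and pruning fractions, function names, toy loss); it is not the authors' implementation.

```python
import torch

def grow_connections(mask, grad, grow_frac=0.05):
    # Activate dormant connections with the largest gradient magnitude
    # (grow_frac is a hypothetical hyperparameter, not from the paper).
    dormant = (mask == 0)
    n_grow = min(int(grow_frac * mask.numel()), int(dormant.sum()))
    if n_grow > 0:
        scores = grad.abs() * dormant
        idx = torch.topk(scores.view(-1), n_grow).indices
        mask.view(-1)[idx] = 1.0
    return mask

def prune_connections(weight, mask, prune_frac=0.05):
    # Deactivate active connections with the smallest weight magnitude.
    active = (mask == 1)
    n_prune = int(prune_frac * int(active.sum()))
    if n_prune > 0:
        scores = weight.abs().masked_fill(~active, float("inf"))
        idx = torch.topk(scores.view(-1), n_prune, largest=False).indices
        mask.view(-1)[idx] = 0.0
    return mask

# Usage sketch: when new data arrive, compute gradients, grow, retrain,
# then iteratively prune while retraining (retraining omitted here).
w = torch.randn(300, 100, requires_grad=True)
mask = (torch.rand_like(w) < 0.3).float()   # start from a compact model
with torch.no_grad():
    w.mul_(mask)                            # dormant weights held at zero
loss = w.sum() ** 2                         # stand-in for the task loss
loss.backward()                             # gradients exist even at zeros
mask = grow_connections(mask, w.grad)
mask = prune_connections(w.detach(), mask)
```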
SCANN: Synthesis of Compact and Accurate Neural Networks
Deep neural networks (DNNs) have become the driving force behind recent
artificial intelligence (AI) research. An important problem with implementing a
neural network is the design of its architecture. Typically, such an
architecture is obtained manually by exploring its hyperparameter space and
kept fixed during training. This approach is time-consuming and inefficient.
Another issue is that modern neural networks often contain millions of
parameters, whereas many applications and devices require small inference
models. However, efforts to migrate DNNs to such devices typically entail a
significant loss of classification accuracy. To address these challenges, we
propose a two-step neural network synthesis methodology, called DR+SCANN, that
combines two complementary approaches to design compact and accurate DNNs. At
the core of our framework is the SCANN methodology that uses three basic
architecture-changing operations, namely connection growth, neuron growth, and
connection pruning, to synthesize feed-forward architectures with arbitrary
structure. SCANN encapsulates three synthesis methodologies that apply a
repeated grow-and-prune paradigm to three architectural starting points.
DR+SCANN combines the SCANN methodology with dataset dimensionality reduction
to alleviate the curse of dimensionality. We demonstrate the efficacy of SCANN
and DR+SCANN on various image and non-image datasets. We evaluate SCANN on the
MNIST and ImageNet benchmarks, and we evaluate DR+SCANN on nine small- to
medium-size datasets. We also show that our synthesis methodology yields neural
networks that are much better at navigating the accuracy vs. energy efficiency
space. This would enable neural network-based inference even on
Internet-of-Things sensors. Comment: 13 pages, 8 figures
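As a concrete illustration of the DR front-end, the sketch below projects a small image dataset to a low dimension before training a compact feed-forward classifier. The choice of PCA, the target dimension, and the layer sizes are our assumptions for illustration only; DR+SCANN explores several reduction methods and synthesizes the downstream architecture automatically with the grow-and-prune operations named above.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Project the data to a low dimension, then train a compact feed-forward
# classifier on the reduced features (alleviating the curse of dimensionality).
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

dr = PCA(n_components=16).fit(X_tr)                      # dimensionality reduction
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(dr.transform(X_tr), y_tr)
print("test accuracy:", clf.score(dr.transform(X_te), y_te))
```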
Synthesis and Pruning as a Dynamic Compression Strategy for Efficient Deep Neural Networks
The brain is a highly reconfigurable machine capable of task-specific
adaptations, continually rewiring itself toward a more optimal
configuration to solve problems. We propose a novel strategic synthesis
algorithm for feedforward networks that draws directly from the brain's
behaviours when learning. The proposed approach analyses the network and ranks
weights based on their magnitude. Unlike existing approaches that advocate
random selection, we select highly performing nodes as starting points for new
edges and exploit the Gaussian distribution over the weights to select
corresponding endpoints. The strategy aims to produce only useful connections,
resulting in a smaller residual network structure. The approach is complemented
with pruning to further the compression. We demonstrate the techniques on deep
feedforward networks. The residual sub-networks that are
formed from the synthesis approaches in this work form common sub-networks with
similarities up to ~90%. Using pruning as a complement to the strategic
synthesis approach, we observe improvements in compression. Comment: 29th ACM International Conference on Information and Knowledge Management, 9th International Symposium DataMod 2020: From Data to Models and Back, 16 pages, 7 figures, 3 tables, 2 equations
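A minimal NumPy sketch of the edge-growth strategy described above follows. The ranking proxy (summed outgoing weight magnitude), the Gaussian fit over incoming-weight magnitudes, and the function name are our assumptions; the paper's exact selection rules may differ.

```python
import numpy as np

def grow_edges(W, n_new, seed=0):
    # W[i, j] is the weight of the edge from source node i to target node j.
    rng = np.random.default_rng(seed)

    # Rank source nodes by the magnitude of their existing outgoing weights
    # (a proxy for "highly performing" nodes; assumption for illustration).
    out_strength = np.abs(W).sum(axis=1)
    sources = np.argsort(out_strength)[::-1][:n_new]

    # Fit a Gaussian over incoming-weight magnitudes and sample endpoints
    # with probability proportional to its density.
    in_strength = np.abs(W).sum(axis=0)
    mu, sigma = in_strength.mean(), in_strength.std() + 1e-8
    density = np.exp(-0.5 * ((in_strength - mu) / sigma) ** 2)
    probs = density / density.sum()

    new_edges = np.zeros_like(W)
    for s in sources:
        t = rng.choice(W.shape[1], p=probs)
        new_edges[s, t] = 1.0                  # candidate connection to add
    return new_edges

W = np.random.randn(8, 8) * (np.random.rand(8, 8) < 0.4)
print(grow_edges(W, n_new=3))
```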
Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference
Artificial neural networks open up unprecedented machine learning
capabilities at the cost of ever-growing computational requirements.
Sparsifying the parameters, often achieved through weight pruning, has been
identified as a powerful technique to reduce the number of model parameters
and the computational operations of neural networks. Yet, sparse
activations, while omnipresent in both biological neural networks and deep
learning systems, have not been fully utilized as a compression technique in
deep learning. Moreover, the interaction between sparse activations and weight
pruning is not fully understood. In this work, we demonstrate that activity
sparsity can compose multiplicatively with parameter sparsity in a recurrent
neural network model based on the GRU that is designed to be activity sparse.
We achieve a substantial reduction of computation while maintaining competitive
perplexity on the Penn Treebank language modeling task. This
magnitude of reduction has not been achieved previously with solely sparsely
connected LSTMs, and the language modeling performance of our model has not
been achieved previously with any sparsely activated recurrent neural networks
or spiking neural networks. Neuromorphic computing devices are especially good
at taking advantage of the dynamic activity sparsity, and our results provide
strong evidence that making deep learning models activity sparse and porting
them to neuromorphic devices can be a viable strategy that does not compromise
on task performance. Our results also drive further convergence of methods from
deep learning and neuromorphic computing for efficient machine learning. Comment: Accepted to the First MLNCP Workshop @ NeurIPS 2023
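To make the multiplicative composition concrete, here is a back-of-the-envelope sketch of the per-step operation count for a recurrent layer when weight sparsity and activity sparsity are combined; the densities and layer sizes are illustrative placeholders, not numbers from the paper.

```python
# Illustrative operation count for one recurrent update step.
hidden, inputs = 1024, 256
dense_macs = hidden * (hidden + inputs)        # dense multiply-accumulates

weight_density = 0.10                          # 90% of weights pruned away
activity_density = 0.20                        # only 20% of units emit an update

# Only the columns of active units are touched, and within those columns only
# the surviving (unpruned) weights contribute, so the two savings multiply.
effective_macs = dense_macs * weight_density * activity_density
print(f"compute reduction: {dense_macs / effective_macs:.0f}x")  # 50x here
```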