5 research outputs found
Efficient Convolutional Neural Network Training with Direct Feedback Alignment
Many algorithms have been proposed as substitutes for back-propagation (BP) in
deep neural network (DNN) training. However, none has become popular because
their training accuracy and computational efficiency are worse than those of
BP. One of them is direct feedback alignment (DFA), but it shows low training
performance, especially for convolutional neural networks (CNNs). In this
paper, we overcome this limitation of the DFA algorithm by combining it with
conventional BP during CNN training. To improve training stability, we also
suggest a feedback weight initialization method based on an analysis of the
patterns of the fixed random matrices in DFA. Finally, we propose a new
training algorithm, binary direct feedback alignment (BDFA), to minimize the
computational cost while maintaining training accuracy relative to DFA. In our
experiments, we use the CIFAR-10 and CIFAR-100 datasets to simulate CNN
learning from scratch, and we apply BDFA to an online-learning-based object
tracking application to examine training in a small-dataset environment. Our
proposed algorithms show better performance than conventional BP in both
training tasks, especially when the dataset is small.
Comment: The paper was submitted to ICLR 201
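The DFA idea summarized above can be illustrated with a minimal NumPy sketch.
It is not the authors' implementation: the two-hidden-layer MLP, the tanh
activations, the scaling of the feedback matrices, and the helper names
(make_feedback, dfa_step) are all assumptions made for illustration. In
particular, the sketch assumes that "binary" feedback means fixed random
matrices whose entries are restricted to +/-1, which is a reading of the
abstract rather than a confirmed detail of BDFA.

```python
import numpy as np

def make_feedback(rng, n_out, n_hidden, binary=False):
    """Fixed random feedback matrix of shape (n_out, n_hidden); never trained."""
    B = rng.standard_normal((n_out, n_hidden))
    if binary:
        B = np.sign(B)  # assumption: "binary" means entries restricted to +/-1
    return B / np.sqrt(n_hidden)  # hypothetical scaling, chosen for this sketch

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

def dfa_step(params, feedback, x, y_onehot, lr=0.1):
    """One DFA update for a 2-hidden-layer MLP; returns (new_params, loss)."""
    W1, W2, W3 = params
    B1, B2 = feedback
    # Forward pass.
    h1 = np.tanh(x @ W1)
    h2 = np.tanh(h1 @ W2)
    p = softmax(h2 @ W3)
    loss = -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=1))
    # Output error (gradient of mean cross-entropy w.r.t. the logits).
    e = (p - y_onehot) / x.shape[0]
    # DFA: send the output error straight to each hidden layer through a
    # fixed random matrix instead of the transposed forward weights.
    d2 = (e @ B2) * (1.0 - h2 ** 2)
    d1 = (e @ B1) * (1.0 - h1 ** 2)
    new_params = (W1 - lr * x.T @ d1,
                  W2 - lr * h1.T @ d2,
                  W3 - lr * h2.T @ e)  # last layer uses its true gradient
    return new_params, loss

# Toy data: 64 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8))
y = np.eye(3)[rng.integers(0, 3, size=64)]
params = tuple(0.1 * rng.standard_normal(s) for s in [(8, 16), (16, 16), (16, 3)])
feedback = (make_feedback(rng, 3, 16, binary=True),
            make_feedback(rng, 3, 16, binary=True))
new_params, loss = dfa_step(params, feedback, x, y)
```

Passing binary=False to make_feedback gives plain DFA with Gaussian feedback
matrices. Because d1 and d2 depend only on the output error, the two hidden
layer updates do not have to wait for each other, which is the source of the
parallelism usually attributed to DFA.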
Accelerated CNN Training Through Gradient Approximation
Training deep convolutional neural networks such as VGG and ResNet by
gradient descent is an expensive exercise requiring specialized hardware such
as GPUs. Recent works have examined the possibility of approximating the
gradient computation while maintaining the same convergence properties. While
promising, the approximations only work on relatively small datasets such as
MNIST. They also fail to achieve real wall-clock speedups due to lack of
efficient GPU implementations of the proposed approximation methods. In this
work, we explore three alternative methods to approximate gradients, with an
efficient GPU kernel implementation for one of them. We achieve wall-clock
speedup with ResNet-20 and VGG-19 on the CIFAR-10 dataset upwards of 7%, with a
minimal loss in validation accuracy.
Comment: An abridged version was presented at EMC^2 : Workshop On Energy
Efficient Machine Learning And Cognitive Computing For Embedded Applications
at ISCA 201
Extension of Direct Feedback Alignment to Convolutional and Recurrent Neural Network for Bio-plausible Deep Learning
Throughout this paper, we focus on the improvement of the direct feedback
alignment (DFA) algorithm and extend the usage of the DFA to convolutional and
recurrent neural networks (CNNs and RNNs). Even though the DFA algorithm is
biologically plausible and has the potential for high-speed training, it has
not been considered a substitute for back-propagation (BP) because of its low
accuracy in CNN and RNN training. In this work, we propose a new DFA
algorithm for BP-level accurate CNN and RNN training. First, we divide the
network into several modules and apply the DFA algorithm within each module.
Second, DFA with sparse backward weights is applied. It takes the form of a
dilated convolution in the CNN case and of a sparse matrix
multiplication in the RNN case. Additionally, the error propagation method of
CNN becomes simpler through the group convolution. Finally, hybrid DFA
increases the accuracy of the CNN and RNN training to the BP-level while taking
advantage of the parallelism and hardware efficiency of the DFA algorithm.
Comment: Submitted to WACV202
Low-memory stochastic backpropagation with multi-channel randomized trace estimation
Thanks to the combination of state-of-the-art accelerators and highly
optimized open software frameworks, there has been tremendous progress in the
performance of deep neural networks. While these developments have been
responsible for many breakthroughs, progress towards solving large-scale
problems, such as video encoding and semantic segmentation in 3D, is hampered
because access to on-premise memory is often limited. Instead of relying on
(optimal) checkpointing or invertibility of the network layers -- to recover
the activations during backpropagation -- we propose to approximate the
gradient of convolutional layers in neural networks with a multi-channel
randomized trace estimation technique. Compared to other methods, this approach
is simple, amenable to analyses, and leads to a greatly reduced memory
footprint. Even though the randomized trace estimation introduces stochasticity
during training, we argue that this is of little consequence as long as the
induced errors are of the same order as errors in the gradient due to the use
of stochastic gradient descent. We discuss the performance of networks trained
with stochastic backpropagation and how the error can be controlled while
maximizing memory usage and minimizing computational overhead.
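The randomized trace estimation mentioned in this abstract can be sketched in
NumPy as follows. This is a generic illustration and not the authors'
multi-channel scheme: it uses a plain linear layer rather than a convolution,
Rademacher (+/-1) probe vectors, and hypothetical function names. The first
function is the standard Hutchinson estimator tr(A) ~ (1/r) sum_k z_k^T A z_k.
The second applies the same probing identity, E[Z Z^T] / r = I, to the weight
gradient G = X^T D of a linear layer, so that only the small sketch X^T Z has
to be kept from the forward pass instead of the full activations X.

```python
import numpy as np

def hutchinson_trace(A, num_probes, rng):
    """Estimate tr(A) with Rademacher probes: mean of z^T A z over probes."""
    n = A.shape[0]
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))
    return np.sum(Z * (A @ Z)) / num_probes

def sketched_weight_grad(X, D, num_probes, rng):
    """Unbiased estimate of G = X^T D using a random sketch of X.

    X: (N, d_in) layer inputs, D: (N, d_out) output gradients.
    Returns the estimate and the sketch X^T Z of shape (d_in, num_probes),
    which stands in for the stored activations in this illustration.
    """
    N = X.shape[0]
    Z = rng.choice([-1.0, 1.0], size=(N, num_probes))
    XZ = X.T @ Z                       # what the forward pass would store
    G_est = XZ @ (Z.T @ D) / num_probes
    return G_est, XZ

rng = np.random.default_rng(0)
# For a diagonal matrix, Rademacher probes give the trace exactly.
trace_est = hutchinson_trace(np.diag([1.0, 2.0, 3.0]), 4, rng)

X = rng.standard_normal((256, 10))     # 256 samples, 10 input features
D = rng.standard_normal((256, 5))      # matching output gradients
G_est, XZ = sketched_weight_grad(X, D, num_probes=32, rng=rng)
```

In this toy setting the sketch holds 10 x 32 numbers instead of 256 x 10. The
number of probes trades memory against the variance of the gradient estimate,
which is the kind of error the abstract compares to the noise already present
in stochastic gradient descent.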
Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning
In order to perform a large variety of tasks and to achieve human-level
performance in complex real-world environments, Artificial Intelligence (AI)
Agents must be able to learn from their past experiences and gain both
knowledge and an accurate representation of their environment from raw sensory
inputs. Traditionally, AI agents have suffered from difficulties in using only
sensory inputs to obtain a good representation of their environment and then
mapping this representation to an efficient control policy. Deep reinforcement
learning algorithms have provided a solution to this issue. In this study, the
performance of different conventional and novel deep reinforcement learning
algorithms was analysed. The proposed method utilises two types of algorithms,
one trained with a variant of Q-learning (DQN) and another trained with SARSA
learning (DSN) to assess the feasibility of using direct feedback alignment, a
novel biologically plausible method for back-propagating the error. These novel
agents, alongside two similar agents trained with the conventional
backpropagation algorithm, were tested by using the OpenAI Gym toolkit on
several classic control theory problems and Atari 2600 video games. The results
of this investigation open the way into new, biologically-inspired deep
reinforcement learning algorithms, and their implementation on neuromorphic
hardware.
Comment: Dissertation submitted to the University of Sheffield in partial
fulfilment of the requirements for the degree of Master of Engineering. 98
pages, 21 Tables, 58 Figure
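The two agent types named in this abstract differ in the bootstrap target they
regress towards. The sketch below shows only that textbook difference between
Q-learning and SARSA. It is not taken from the dissertation: the function
names and the toy numbers are hypothetical, and it does not show how the
error is propagated through the network with direct feedback alignment.

```python
import numpy as np

def q_learning_target(reward, q_next, gamma, done):
    """Off-policy target: bootstrap from the best next action."""
    return reward + (0.0 if done else gamma * np.max(q_next))

def sarsa_target(reward, q_next, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + (0.0 if done else gamma * q_next[next_action])

q_next = np.array([1.0, 3.0, 2.0])     # toy Q-values for the next state
t_q = q_learning_target(1.0, q_next, gamma=0.5, done=False)                # 2.5
t_s = sarsa_target(1.0, q_next, next_action=2, gamma=0.5, done=False)      # 2.0
```

In either case the difference between the target and the current estimate
Q(s, a) is the error signal, which can then be passed back through the network
by back-propagation or by a feedback-alignment variant.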