Local Critic Training of Deep Neural Networks
This paper proposes a novel approach to train deep neural networks by relaxing
the layer-wise dependency of backpropagation training. The approach employs
additional modules, called local critic networks, alongside the main network
model to be trained; these modules are used to obtain error gradients without
completing the full feedforward and backward propagation. We propose a cascaded
learning strategy for these local networks. The approach is also useful from a
multi-model perspective, enabling structural optimization of neural networks,
computationally efficient progressive inference, and ensemble classification
for improved performance. Experimental results show the effectiveness of the
proposed approach and suggest guidelines for determining appropriate algorithm
parameters.
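
As a rough illustration of how a local critic can supply error gradients for one
part of the network, the following PyTorch sketch splits a small classifier into
two layer groups and attaches a critic head to the first. All names (group1,
critic1, train_step, the layer sizes) are illustrative assumptions; this is not
the authors' implementation, and the cascaded strategy is reduced to its
simplest two-group case, in which the single critic is trained against the true
loss.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Main network split into two layer groups; the first group gets a small
    # local critic that maps the group's hidden output directly to class logits.
    group1 = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
    group2 = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))
    critic1 = nn.Linear(256, 10)  # stands in for the downstream layers

    opt_g1 = torch.optim.SGD(group1.parameters(), lr=0.1)
    opt_g2 = torch.optim.SGD(group2.parameters(), lr=0.1)
    opt_c1 = torch.optim.SGD(critic1.parameters(), lr=0.1)

    def train_step(x, y):
        # Layer group 1 is updated from the critic's estimated loss, so it does
        # not wait for the rest of the forward and backward pass.
        h1 = group1(x)
        est_loss = F.cross_entropy(critic1(h1), y)
        opt_g1.zero_grad()
        est_loss.backward()
        opt_g1.step()

        # Layer group 2 is updated from the true loss on a detached input.
        h1_detached = h1.detach()
        true_loss = F.cross_entropy(group2(h1_detached), y)
        opt_g2.zero_grad()
        true_loss.backward()
        opt_g2.step()

        # The critic is trained so that its estimated loss tracks the true loss
        # of the downstream part (the terminal case of the cascaded strategy).
        opt_c1.zero_grad()
        critic_loss = (F.cross_entropy(critic1(h1_detached), y)
                       - true_loss.detach()).pow(2)
        critic_loss.backward()
        opt_c1.step()
        return true_loss.item()

    # Example usage with random MNIST-shaped data.
    x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
    print(train_step(x, y))
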
Local Critic Training for Model-Parallel Learning of Deep Neural Networks
In this paper, we propose a novel model-parallel learning method, called
local critic training, which trains neural networks using additional modules
called local critic networks. The main network is divided into several layer
groups and each layer group is updated through error gradients estimated by the
corresponding local critic network. We show that the proposed approach
successfully decouples the update process of the layer groups for both
convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In
addition, we demonstrate that the proposed method is guaranteed to converge to
a critical point. We also show that networks trained by the proposed method can
be used for structural optimization. Experimental results show that our method
achieves satisfactory performance, greatly reduces training time, and decreases
memory consumption per machine. Code is available at
https://github.com/hjdw2/Local-critic-training
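
Both abstracts also mention structural optimization and computationally
efficient progressive inference. The sketch below reuses the hypothetical
group1/critic1/group2 modules from the sketch above (it is not code from the
linked repository) to show how a trained local critic could act as an
early-exit output head, so that easy inputs can be classified by a truncated
sub-network.

    import torch

    @torch.no_grad()
    def early_exit_predict(x, confidence_threshold=0.9):
        # Run only the first layer group and let its local critic act as an
        # output head; fall back to the full network when the critic is unsure.
        h1 = group1(x)
        probs = torch.softmax(critic1(h1), dim=1)
        conf, pred = probs.max(dim=1)
        if conf.mean() >= confidence_threshold:
            return pred                      # cheap prediction from the truncated net
        return group2(h1).argmax(dim=1)      # full-depth prediction
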