    Pipelined Training with Stale Weights of Deep Convolutional Neural Networks

    Existing approaches that partition a convolutional neural network (CNN) onto multiple accelerators and pipeline the training computations avoid or limit the use of stale weights, and they either underutilize accelerators or increase memory footprint. We explore the impact of stale weights on statistical efficiency and performance in a pipelined backpropagation scheme that maximizes accelerator utilization and keeps memory overhead modest. Using LeNet-5, AlexNet, VGG, and ResNet, we show that when pipelining is limited to early layers in a network, training with stale weights converges and produces models with inference accuracies comparable to those of non-pipelined training. However, when pipelining extends deeper into the network, inference accuracies drop significantly. We propose a hybrid training scheme to address this drop. We demonstrate the performance of our pipelined backpropagation on 2 GPUs, achieving speedups of up to 1.8X over a 1-GPU baseline with a small drop in inference accuracy.
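
    To make the stale-weight idea concrete, the following is a minimal sketch, not the authors' implementation: a toy 2-layer MLP is split into two "stages" standing in for the early and later layers that would sit on separate accelerators, and stage 2 lags stage 1 by a fixed pipeline delay, so the gradients reaching stage 1 are computed from activations produced by weights that have since been updated. The network sizes, learning rate, delay, and synthetic data are all illustrative assumptions.

    ```python
    # Minimal sketch (assumptions, not the paper's code) of pipelined
    # backpropagation with stale weights on a toy 2-layer MLP.
    import numpy as np
    from collections import deque

    rng = np.random.default_rng(0)
    IN, HID, OUT, BATCH = 32, 64, 10, 16
    LR, DELAY, STEPS = 0.05, 2, 200      # DELAY = pipeline depth in micro-batches

    # Stage 1 (early layers) and stage 2 (later layers) parameters.
    W1 = rng.normal(0, 0.1, (IN, HID)); b1 = np.zeros(HID)
    W2 = rng.normal(0, 0.1, (HID, OUT)); b2 = np.zeros(OUT)

    TRUE_W = rng.normal(size=(IN, OUT))  # fixed rule for synthetic labels

    def synthetic_batch():
        """Random inputs with labels derived from a fixed linear rule."""
        x = rng.normal(size=(BATCH, IN))
        y = (x @ TRUE_W).argmax(axis=1)
        return x, y

    pipeline = deque()                   # micro-batches in flight between stages
    for step in range(STEPS):
        # Stage 1 forward on the newest micro-batch (uses the *current* W1).
        x, y = synthetic_batch()
        z1 = x @ W1 + b1
        h = np.maximum(z1, 0.0)
        pipeline.append((x, z1, h, y))

        if len(pipeline) <= DELAY:       # pipeline still filling up
            continue

        # Stage 2 forward + backward on a micro-batch that entered DELAY steps ago.
        x_old, z1_old, h_old, y_old = pipeline.popleft()
        logits = h_old @ W2 + b2
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        loss = -np.log(p[np.arange(BATCH), y_old] + 1e-12).mean()

        dlogits = p.copy(); dlogits[np.arange(BATCH), y_old] -= 1.0; dlogits /= BATCH
        dW2 = h_old.T @ dlogits; db2 = dlogits.sum(0)
        # Backprop into stage 1 through stored activations that were produced by
        # a W1 updated DELAY times since -- the weight staleness studied above.
        dh = dlogits @ W2.T
        dz1 = dh * (z1_old > 0)
        dW1 = x_old.T @ dz1; db1 = dz1.sum(0)

        # Plain SGD updates on both stages.
        W2 -= LR * dW2; b2 -= LR * db2
        W1 -= LR * dW1; b1 -= LR * db1

        if step % 50 == 0:
            print(f"step {step:4d}  loss {loss:.3f}")
    ```

    In this sketch the staleness grows with the pipeline depth DELAY, which mirrors the abstract's observation that confining pipelining to early layers (small effective staleness) preserves accuracy, while deeper pipelining degrades it.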