We propose a new method for creating computationally efficient and compact
convolutional neural networks (CNNs) using a novel sparse connection structure
that resembles a tree root. This allows a significant reduction in
computational cost and number of parameters compared to state-of-the-art deep
CNNs, without compromising accuracy, by exploiting the sparsity of inter-layer
filter dependencies. We validate our approach by using it to train more
efficient variants of state-of-the-art CNN architectures, evaluated on the
CIFAR10 and ILSVRC datasets. Our results show similar or higher accuracy than
the baseline architectures with much less computation, as measured by CPU and
GPU timings. For example, for ResNet 50, our model has 40% fewer parameters,
45% fewer floating point operations, and is 31% (12%) faster on a CPU (GPU).
For the deeper ResNet 200, our model has 25% fewer floating point operations and
44% fewer parameters, while maintaining state-of-the-art accuracy. For
GoogLeNet, our model has 7% fewer parameters and is 21% (16%) faster on a CPU
(GPU).
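
To illustrate why sparse inter-layer filter dependencies reduce parameter count, the sketch below compares a dense convolutional layer with one split into filter groups, where each group of filters connects to only a fraction of the input channels. This is a minimal, hypothetical calculation with illustrative layer sizes; it is not the paper's exact root-like connection structure or results.

```python
# Sketch (illustrative, not the paper's implementation): parameter count of a
# standard 3x3 convolution versus one split into g filter groups, where each
# group's filters see only 1/g of the input channels.

def conv_params(c_in, c_out, k=3, groups=1):
    # Each output filter connects to c_in / groups input channels.
    return c_out * (c_in // groups) * k * k

c_in, c_out = 256, 256  # assumed channel counts, not taken from the paper
dense = conv_params(c_in, c_out)
for g in (1, 2, 4, 8):
    p = conv_params(c_in, c_out, groups=g)
    print(f"groups={g}: {p:,} parameters ({p / dense:.0%} of dense)")
```

With g groups the layer's parameters and multiply-accumulate operations both scale by roughly 1/g, which is the kind of saving that, applied across a network's layers, yields the reductions reported above.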