The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity
of hardware to store and train them. Research over the past few decades has
explored the prospect of sparsifying DNNs before, during, and after training by
pruning edges from the underlying topology. The resulting neural network is
known as a sparse neural network. More recent work has demonstrated the
remarkable result that certain sparse DNNs can be trained to the same accuracy as
dense DNNs at lower runtime and storage cost. An intriguing class of these
sparse DNNs is the X-Nets, which are initialized and trained on a sparse
topology with neither reference to a parent dense DNN nor subsequent pruning.
We present an algorithm that deterministically generates RadiX-Nets: sparse DNN
topologies that, as a whole, are much more diverse than X-Net topologies, while
preserving X-Nets' desired characteristics. We further present a
functional-analytic conjecture based on the longstanding observation that
sparse neural network topologies can attain the same expressive power as dense
counterparts.
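
The abstract only names the deterministic generator, so the following sketch is purely illustrative: it builds a butterfly-style sparse connection mask for a single layer using base-radix digit structure, which is one simple way to obtain a sparse topology with no dense parent network and no pruning. The function radix_butterfly_mask, its parameters (width, radix, stage), and the butterfly pattern itself are assumptions made for illustration, not the RadiX-Net construction described in the paper.

import numpy as np

def radix_butterfly_mask(width, radix, stage):
    """Illustrative sketch (not the paper's algorithm): 0/1 mask for one layer
    of a butterfly-style sparse topology.  Neuron i connects to neuron j iff
    their base-`radix` digit expansions agree in every digit except possibly
    digit `stage`, so every neuron has exactly `radix` incoming and `radix`
    outgoing edges by construction."""
    assert width % radix ** (stage + 1) == 0, "width must be divisible by radix**(stage+1)"
    mask = np.zeros((width, width), dtype=np.int8)
    block = radix ** stage                 # stride between neurons differing in digit `stage`
    for i in range(width):
        digit = (i // block) % radix       # value of digit `stage` in neuron i's index
        base = i - digit * block           # index i with digit `stage` zeroed out
        for d in range(radix):
            mask[i, base + d * block] = 1  # connect to every setting of digit `stage`
    return mask

# Example: 8 neurons per layer, radix 2 -> each neuron keeps 2 of 8 possible edges (75% sparse).
m = radix_butterfly_mask(width=8, radix=2, stage=1)
print(m.sum(axis=1))  # [2 2 2 2 2 2 2 2]: constant fan-out, fixed before any training

Stacking one such stage per digit of the index gives a multi-layer topology in which every input unit has a path to every output unit while each layer keeps a fixed fan-out of radix; that kind of connectivity-despite-sparsity property is what deterministic constructions in this family aim to preserve.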