Sparsely Aggregated Convolutional Networks
We explore a key architectural aspect of deep convolutional neural networks:
the pattern of internal skip connections used to aggregate outputs of earlier
layers for consumption by deeper layers. Such aggregation is critical to
facilitate training of very deep networks in an end-to-end manner. This is a
primary reason for the widespread adoption of residual networks, which
aggregate outputs via cumulative summation. While subsequent works investigate
alternative aggregation operations (e.g. concatenation), we focus on an
orthogonal question: which outputs to aggregate at a particular point in the
network. We propose a new internal connection structure which aggregates only a
sparse set of previous outputs at any given depth. Our experiments demonstrate that
this simple design change offers superior performance with fewer parameters and
lower computational requirements. Moreover, we show that sparse aggregation
allows networks to scale more robustly to 1000+ layers, thereby opening future
avenues for training long-running visual processes.
Comment: Accepted to ECCV 201
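To make the contrast between dense and sparse aggregation concrete, here is a minimal Python sketch. The abstract does not say which previous outputs are selected, so the exponentially spaced offsets used below, along with the names `dense_inputs`, `sparse_inputs`, and `base`, are assumptions chosen purely for illustration and should not be read as the authors' exact connection pattern.

```python
def dense_inputs(depth):
    """Indices of every earlier layer output (dense aggregation)."""
    return list(range(depth))


def sparse_inputs(depth, base=2):
    """Indices aggregated under a HYPOTHETICAL sparse pattern.

    Assumption: the layer at `depth` reads the outputs that sit
    1, base, base**2, ... positions behind it. The abstract only says
    the aggregated set is sparse; it does not give this rule.
    """
    indices = []
    offset = 1
    while offset <= depth:
        indices.append(depth - offset)
        offset *= base
    return indices


if __name__ == "__main__":
    for depth in (8, 100, 1000):
        print(depth, len(dense_inputs(depth)), len(sparse_inputs(depth)))
```

Under this assumed pattern the number of aggregated outputs grows logarithmically with depth rather than linearly, which is one way a sparse scheme could keep the cost of aggregation manageable in very deep networks.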
CondenseNet: An Efficient DenseNet using Learned Group Convolutions
Deep neural networks are increasingly used on mobile devices, where
computational resources are limited. In this paper we develop CondenseNet, a
novel network architecture with unprecedented efficiency. It combines dense
connectivity with a novel module called learned group convolution. The dense
connectivity facilitates feature re-use in the network, whereas learned group
convolutions remove connections between layers for which this feature re-use is
superfluous. At test time, our model can be implemented using standard group
convolutions, allowing for efficient computation in practice. Our experiments
show that CondenseNets are far more efficient than state-of-the-art compact
convolutional networks such as MobileNets and ShuffleNets.
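The saving from running a layer as a standard group convolution can be sketched with a short weight-count calculation. This is generic arithmetic for grouped convolutions, not CondenseNet's learned grouping procedure; the function name `conv_weight_count` and the channel sizes in the example are assumptions made for illustration.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights (bias excluded) in a 2D convolution.

    With `groups` > 1, each output channel only connects to
    in_channels / groups input channels, so the count shrinks by
    a factor of `groups`.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the paper.
    full = conv_weight_count(64, 128, kernel_size=1, groups=1)
    grouped = conv_weight_count(64, 128, kernel_size=1, groups=4)
    print(full, grouped, full // grouped)
```

For the same input and output widths, four groups use a quarter of the weights and multiply-accumulates of the ungrouped layer, which is the kind of reduction the abstract refers to when it says superfluous connections are removed.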
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern
computing applications. Accelerating their training is a major challenge and
techniques range from distributed algorithms to low-level circuit design. In
this survey, we describe the problem from a theoretical perspective, followed
by approaches for its parallelization. We present trends in DNN architectures
and the resulting implications on parallelization strategies. We then review
and model the different types of concurrency in DNNs: from the single operator,
through parallelism in network inference and training, to distributed deep
learning. We discuss asynchronous stochastic optimization, distributed system
architectures, communication schemes, and neural architecture search. Based on
those approaches, we extrapolate potential directions for parallelism in deep
learning.