Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning
Deep convolutional neural networks (CNNs) are indispensable to state-of-the-art computer vision algorithms. However, they are still rarely deployed on battery-powered mobile devices, such as smartphones and wearable gadgets, where vision algorithms can enable many revolutionary real-world applications. The key limiting factor is the high energy consumption of CNN processing due to its high computational complexity. While there are many previous efforts that try to reduce the CNN model size or the amount of computation, we find that they do not necessarily result in lower energy consumption. Therefore, these targets do not serve as a good metric for energy cost estimation. To close the gap between CNN design and energy consumption optimization, we propose an energy-aware pruning algorithm for CNNs that directly uses the energy consumption of a CNN to guide the pruning process. The energy estimation methodology uses parameters extrapolated from actual hardware measurements. The proposed layer-by-layer pruning algorithm also prunes more aggressively than previously proposed pruning methods by minimizing the error in the output feature maps instead of the filter weights. For each layer, the weights are first pruned and then locally fine-tuned with a closed-form least-squares solution to quickly restore the accuracy. After all layers are pruned, the entire network is globally fine-tuned using back-propagation. With the proposed pruning method, the energy consumption of AlexNet and GoogLeNet is reduced by 3.7X and 1.6X, respectively, with less than 1% top-5 accuracy loss. We also show that reducing the number of target classes in AlexNet greatly decreases the number of weights, but has a limited impact on energy consumption.
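The "prune, then locally fine-tune with a closed-form least-squares solution" step can be illustrated with a minimal sketch on a single linear layer. Everything specific here is an assumption for illustration only: the function name `prune_and_refit`, the magnitude-based choice of which weights to drop, the layer shape, and the random data. The paper's energy-guided selection of weights is not reproduced. The sketch only shows why refitting the surviving weights against the layer's original outputs cannot increase the output error compared with pruning alone.

```python
import numpy as np

def prune_and_refit(W, X, keep_ratio=0.5):
    """Prune each output row of W, then refit the kept weights by least squares.

    W: (n_out, n_in) weights of a linear layer (hypothetical example layer).
    X: (n_in, n_samples) inputs observed at that layer.
    Returns (W_pruned, W_refit), both with the same sparsity pattern.
    """
    Y = W @ X  # reference outputs of the unpruned layer
    k = max(1, int(round(keep_ratio * W.shape[1])))
    W_pruned = np.zeros_like(W)
    W_refit = np.zeros_like(W)
    for i in range(W.shape[0]):
        # Assumption: keep the k largest-magnitude weights of this row.
        keep = np.argsort(-np.abs(W[i]))[:k]
        W_pruned[i, keep] = W[i, keep]
        # Closed-form refit: minimize || X[keep].T @ w - Y[i] ||_2 over w.
        w, *_ = np.linalg.lstsq(X[keep].T, Y[i], rcond=None)
        W_refit[i, keep] = w
    return W_pruned, W_refit

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))
X = rng.normal(size=(16, 200))

W_pruned, W_refit = prune_and_refit(W, X, keep_ratio=0.5)
err_pruned = np.linalg.norm(W_pruned @ X - W @ X)
err_refit = np.linalg.norm(W_refit @ X - W @ X)
print(f"output error after pruning only: {err_pruned:.3f}")
print(f"output error after refit:        {err_refit:.3f}")
```

Because the pruned-only weights are one feasible point of the least-squares problem, the refit error is never larger than the pruned-only error; how much it helps depends on how correlated the layer inputs are.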
SCANN: Synthesis of Compact and Accurate Neural Networks
Deep neural networks (DNNs) have become the driving force behind recent
artificial intelligence (AI) research. An important problem with implementing a
neural network is the design of its architecture. Typically, such an
architecture is obtained manually by exploring its hyperparameter space and
kept fixed during training. This approach is time-consuming and inefficient.
Another issue is that modern neural networks often contain millions of
parameters, whereas many applications and devices require small inference
models. However, efforts to migrate DNNs to such devices typically entail a
significant loss of classification accuracy. To address these challenges, we
propose a two-step neural network synthesis methodology, called DR+SCANN, that
combines two complementary approaches to design compact and accurate DNNs. At
the core of our framework is the SCANN methodology that uses three basic
architecture-changing operations, namely connection growth, neuron growth, and
connection pruning, to synthesize feed-forward architectures with arbitrary
structure. SCANN encapsulates three synthesis methodologies that apply a
repeated grow-and-prune paradigm to three architectural starting points.
DR+SCANN combines the SCANN methodology with dataset dimensionality reduction
to alleviate the curse of dimensionality. We demonstrate the efficacy of SCANN
and DR+SCANN on various image and non-image datasets. We evaluate SCANN on
MNIST and ImageNet benchmarks. In addition, we evaluate the efficacy of
using dimensionality reduction alongside SCANN (DR+SCANN) on nine small to
medium-size datasets. We also show that our synthesis methodology yields neural
networks that are much better at navigating the accuracy vs. energy efficiency
space. This would enable neural network-based inference even on
Internet-of-Things sensors. Comment: 13 pages, 8 figures
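One grow-and-prune cycle over a layer's connections, as named in the abstract above, might look like the following sketch. The abstract does not state the selection criteria, so both are assumptions made here for illustration: connections are grown where the loss gradient has the largest magnitude, and pruned where the weight has the smallest magnitude. The function names, matrix sizes, and random data are hypothetical, and neuron growth is not shown.

```python
import numpy as np

def grow_connections(grad, mask, n_grow):
    """Activate the n_grow inactive connections with the largest |gradient|."""
    inactive = np.flatnonzero(~mask)
    n_grow = min(n_grow, inactive.size)
    order = np.argsort(-np.abs(grad.ravel()[inactive]))
    new_mask = mask.ravel().copy()
    new_mask[inactive[order[:n_grow]]] = True
    return new_mask.reshape(mask.shape)

def prune_connections(W, mask, frac):
    """Deactivate the fraction `frac` of active connections with the smallest |weight|."""
    active = np.flatnonzero(mask)
    n_prune = int(frac * active.size)
    order = np.argsort(np.abs(W.ravel()[active]))
    new_mask = mask.ravel().copy()
    new_mask[active[order[:n_prune]]] = False
    return new_mask.reshape(mask.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))      # candidate weights for every possible connection
grad = rng.normal(size=(8, 8))   # stand-in for a loss gradient w.r.t. W

# Start from a sparse architecture: only the first two input columns are connected.
mask = np.zeros((8, 8), dtype=bool)
mask[:, :2] = True

grown = grow_connections(grad, mask, n_grow=8)    # 16 -> 24 active connections
pruned = prune_connections(W, grown, frac=0.25)   # 24 -> 18 active connections
print(int(mask.sum()), int(grown.sum()), int(pruned.sum()))
```

In a real training loop the network would be retrained between the grow and prune phases, and the effective weights would be `W * mask`; both steps are omitted to keep the sketch short.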
FastDepth: Fast Monocular Depth Estimation on Embedded Systems
Depth sensing is a critical function for robotic tasks such as localization,
mapping and obstacle detection. There has been a significant and growing
interest in depth estimation from a single RGB image, due to the relatively low
cost and size of monocular cameras. However, state-of-the-art single-view depth
estimation algorithms are based on fairly complex deep neural networks that are
too slow for real-time inference on an embedded platform, for instance, mounted
on a micro aerial vehicle. In this paper, we address the problem of fast depth
estimation on embedded systems. We propose an efficient and lightweight
encoder-decoder network architecture and apply network pruning to further
reduce computational complexity and latency. In particular, we focus on the
design of a low-latency decoder. Our methodology demonstrates that it is
possible to achieve accuracy similar to prior work on depth estimation, but at
inference speeds that are an order of magnitude faster. Our proposed network,
FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using
only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves
close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of
the authors' knowledge, this paper demonstrates real-time monocular depth
estimation using a deep neural network with the lowest latency and highest
throughput on an embedded platform that can be carried by a micro aerial
vehicle. Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables