2,028 research outputs found

    Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning

    Get PDF
    Deep convolutional neural networks (CNNs) are indispensable to state-of-the-art computer vision algorithms. However, they are still rarely deployed on battery-powered mobile devices, such as smartphones and wearable gadgets, where vision algorithms can enable many revolutionary real-world applications. The key limiting factor is the high energy consumption of CNN processing due to its high computational complexity. While there are many previous efforts that try to reduce the CNN model size or the amount of computation, we find that they do not necessarily result in lower energy consumption. Therefore, these targets do not serve as a good metric for energy cost estimation. To close the gap between CNN design and energy consumption optimization, we propose an energy-aware pruning algorithm for CNNs that directly uses the energy consumption of a CNN to guide the pruning process. The energy estimation methodology uses parameters extrapolated from actual hardware measurements. The proposed layer- by-layer pruning algorithm also prunes more aggressively than previously proposed pruning methods by minimizing the error in the output feature maps instead of the filter weights. For each layer, the weights are first pruned and then locally fine-tuned with aclosed-form least-square solution to quickly restore the accuracy. After all layers are pruned, the entire network is globally fine-tuned using back-propagation. With the proposed pruning method, the energy consumption of AlexNet and GoogLeNet is reduced by 3.7X and 1.6X, respectively, with less than 1% top-5 accuracy loss. We also show that reducing the number of target classes in AlexNet greatly decreases the number of weights, but has a limited impact on energy consumption

    SCANN: Synthesis of Compact and Accurate Neural Networks

    Full text link
    Deep neural networks (DNNs) have become the driving force behind recent artificial intelligence (AI) research. An important problem with implementing a neural network is the design of its architecture. Typically, such an architecture is obtained manually by exploring its hyperparameter space and kept fixed during training. This approach is time-consuming and inefficient. Another issue is that modern neural networks often contain millions of parameters, whereas many applications and devices require small inference models. However, efforts to migrate DNNs to such devices typically entail a significant loss of classification accuracy. To address these challenges, we propose a two-step neural network synthesis methodology, called DR+SCANN, that combines two complementary approaches to design compact and accurate DNNs. At the core of our framework is the SCANN methodology that uses three basic architecture-changing operations, namely connection growth, neuron growth, and connection pruning, to synthesize feed-forward architectures with arbitrary structure. SCANN encapsulates three synthesis methodologies that apply a repeated grow-and-prune paradigm to three architectural starting points. DR+SCANN combines the SCANN methodology with dataset dimensionality reduction to alleviate the curse of dimensionality. We demonstrate the efficacy of SCANN and DR+SCANN on various image and non-image datasets. We evaluate SCANN on MNIST and ImageNet benchmarks. In addition, we also evaluate the efficacy of using dimensionality reduction alongside SCANN (DR+SCANN) on nine small to medium-size datasets. We also show that our synthesis methodology yields neural networks that are much better at navigating the accuracy vs. energy efficiency space. This would enable neural network-based inference even on Internet-of-Things sensors.Comment: 13 pages, 8 figure

    FastDepth: Fast Monocular Depth Estimation on Embedded Systems

    Full text link
    Depth sensing is a critical function for robotic tasks such as localization, mapping and obstacle detection. There has been a significant and growing interest in depth estimation from a single RGB image, due to the relatively low cost and size of monocular cameras. However, state-of-the-art single-view depth estimation algorithms are based on fairly complex deep neural networks that are too slow for real-time inference on an embedded platform, for instance, mounted on a micro aerial vehicle. In this paper, we address the problem of fast depth estimation on embedded systems. We propose an efficient and lightweight encoder-decoder network architecture and apply network pruning to further reduce computational complexity and latency. In particular, we focus on the design of a low-latency decoder. Our methodology demonstrates that it is possible to achieve similar accuracy as prior work on depth estimation, but at inference speeds that are an order of magnitude faster. Our proposed network, FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of the authors' knowledge, this paper demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle.Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 table
    • …
    corecore