    FastDepth: Fast Monocular Depth Estimation on Embedded Systems

    Depth sensing is a critical function for robotic tasks such as localization, mapping and obstacle detection. There has been a significant and growing interest in depth estimation from a single RGB image, due to the relatively low cost and size of monocular cameras. However, state-of-the-art single-view depth estimation algorithms are based on fairly complex deep neural networks that are too slow for real-time inference on an embedded platform, for instance, one mounted on a micro aerial vehicle. In this paper, we address the problem of fast depth estimation on embedded systems. We propose an efficient and lightweight encoder-decoder network architecture and apply network pruning to further reduce computational complexity and latency. In particular, we focus on the design of a low-latency decoder. Our methodology demonstrates that it is possible to achieve accuracy similar to prior work on depth estimation, but at inference speeds that are an order of magnitude faster. Our proposed network, FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of the authors' knowledge, this paper demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle.
    Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables.

    Towards Fast-Convergence, Low-Delay and Low-Complexity Network Optimization

    Distributed network optimization has been studied for well over a decade. However, we still do not have a good idea of how to design schemes that can simultaneously provide good performance across the dimensions of utility optimality, convergence speed, and delay. To address these challenges, in this paper, we propose a new algorithmic framework with all these metrics approaching optimality. The salient features of our new algorithm are three-fold: (i) fast convergence: it converges within only $O(\log(1/\epsilon))$ iterations, which is the fastest among all the existing algorithms; (ii) low delay: it guarantees optimal utility with finite queue length; (iii) simple implementation: the control variables of this algorithm are based on virtual queues that do not require maintaining per-flow information. The new technique builds on a kind of inexact Uzawa method in the Alternating Direction Method of Multipliers (ADMM), and provides a new theoretical path to prove global, linear convergence of such a method without requiring the full rank assumption of the constraint matrix.
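The link between a linear convergence rate and the logarithmic iteration count can be written out in one step. This is a generic sketch, not the paper's proof: the error sequence $e_k$ and the contraction factor $\rho$ are illustrative symbols that do not appear in the abstract.

```latex
% Assumption: the error contracts by a fixed factor rho in each iteration.
\[
  e_k \le \rho^{k} e_0, \qquad 0 < \rho < 1
\]
% Requiring the relative error to fall below epsilon bounds the iteration count:
\[
  \rho^{k} \le \epsilon
  \;\Longleftrightarrow\;
  k \ge \frac{\log(1/\epsilon)}{\log(1/\rho)}
  = O\!\left(\log(1/\epsilon)\right).
\]
```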

    ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

    We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g., lower top-1 error (absolute 7.8%) than the recent MobileNet on the ImageNet classification task, under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves ~13x actual speedup over AlexNet while maintaining comparable accuracy.
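The abstract names channel shuffle without describing it. As commonly described, the operation views the channel axis as a (groups, channels per group) grid, transposes it, and flattens it again, so that each group in the next layer receives channels from every previous group. The sketch below shows only that index permutation in plain Python; the function name `channel_shuffle` and the list-of-labels input are illustrative assumptions, not the authors' code.

```python
def channel_shuffle(channels, groups):
    """Permute channel labels so each output group mixes all input groups.

    Hypothetical helper for illustration: `channels` is a flat list of
    channel labels laid out group by group, `groups` is the group count.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # View as (groups, per_group), transpose to (per_group, groups), flatten.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


if __name__ == "__main__":
    # Two groups of three channels: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
```

In a real network the same permutation would be applied along the channel axis of a feature tensor, typically with a reshape and transpose rather than an explicit loop.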

    Development of an automated aircraft subsystem architecture generation and analysis tool

    Purpose – The purpose of this paper is to present a new computational framework to address future preliminary design needs for aircraft subsystems. The ability to investigate multiple candidate technologies forming subsystem architectures is enabled with the provision of automated architecture generation, analysis and optimization. The main focus lies on a demonstration of the framework's workings, as well as the optimizer's performance on a typical form of application problem. Design/methodology/approach – The core aspects involve a functional decomposition, coupled with a synergistic mission performance analysis on the aircraft, architecture and component levels. This may be followed by a complete enumeration of architectures, combined with a user-defined technology filtering and concept ranking procedure. In addition, a hybrid heuristic optimizer, based on ant systems optimization and a genetic algorithm, is employed to produce optimal architectures in both component composition and design parameters. The optimizer is tested on a generic architecture design problem combined with modified Griewank and parabolic functions for the continuous space. Findings – Insights from the generalized application problem show consistent rediscovery of the optimal architectures with the optimizer, as compared to a full problem enumeration. In addition, multi-objective optimization reveals a Pareto front with differences in component composition as well as continuous parameters. Research limitations/implications – This paper demonstrates the framework's application on a generalized test problem only. Further publication will consider real engineering design problems. Originality/value – The paper addresses the need for future conceptual design methods of complex systems to consider a mixed concept space of both discrete and continuous nature via automated methods.
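For readers unfamiliar with the continuous test functions mentioned above, here is a minimal sketch of the standard Griewank function and a simple sum-of-squares parabolic function. The abstract does not say how the Griewank function was modified or which parabolic form was used, so both definitions below are textbook stand-ins and the function names are illustrative assumptions.

```python
import math


def griewank(x):
    """Standard (unmodified) Griewank function.

    f(x) = 1 + sum(x_i^2) / 4000 - prod(cos(x_i / sqrt(i))), i = 1..n.
    Its global minimum is 0 at the origin, surrounded by many local minima.
    """
    sum_term = sum(xi * xi for xi in x) / 4000.0
    prod_term = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + sum_term - prod_term


def parabolic(x):
    """Assumed parabolic test function: plain sum of squares."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    print(griewank([0.0, 0.0]))   # value at the global minimum
    print(parabolic([1.0, 2.0]))  # 1 + 4
```

The many local minima of the Griewank function make it a common stress test for heuristic optimizers, while the parabolic function provides a smooth single-minimum baseline.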