FastDepth: Fast Monocular Depth Estimation on Embedded Systems
Depth sensing is a critical function for robotic tasks such as localization,
mapping and obstacle detection. There has been a significant and growing
interest in depth estimation from a single RGB image, due to the relatively low
cost and size of monocular cameras. However, state-of-the-art single-view depth
estimation algorithms are based on fairly complex deep neural networks that are
too slow for real-time inference on an embedded platform, for instance, mounted
on a micro aerial vehicle. In this paper, we address the problem of fast depth
estimation on embedded systems. We propose an efficient and lightweight
encoder-decoder network architecture and apply network pruning to further
reduce computational complexity and latency. In particular, we focus on the
design of a low-latency decoder. Our methodology demonstrates that it is
possible to achieve similar accuracy as prior work on depth estimation, but at
inference speeds that are an order of magnitude faster. Our proposed network,
FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using
only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves
close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of
the authors' knowledge, this paper demonstrates real-time monocular depth
estimation using a deep neural network with the lowest latency and highest
throughput on an embedded platform that can be carried by a micro aerial
vehicle. Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables
Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots
Safety is paramount for mobile robotic platforms such as self-driving cars
and unmanned aerial vehicles. This work is devoted to a task that is
indispensable for safety yet was largely overlooked in the past -- detecting
obstacles that are of very thin structures, such as wires, cables and tree
branches. This is a challenging problem, as thin objects can be problematic for
active sensors such as lidar and sonar and even for stereo cameras. In this
work, we propose to use video sequences for thin obstacle detection. We
represent obstacles with edges in the video frames, and reconstruct them in 3D
using efficient edge-based visual odometry techniques. We provide both a
monocular camera solution and a stereo camera solution. The former incorporates
Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter
enjoys a novel, purely vision-based solution. Experiments demonstrated that the
proposed methods are fast and able to detect thin obstacles robustly and
accurately under various conditions. Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Vision
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
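The event stream described in this abstract (time, location and sign of each per-pixel brightness change) can be illustrated with a minimal sketch. It is not taken from the survey: the `Event` type, the `events_from_frames` function and the contrast `threshold` value are hypothetical names and parameters chosen for illustration, and the sketch approximates an asynchronous sensor by comparing successive frames, so it emits at most one event per pixel per frame.

```python
import math
from typing import NamedTuple


class Event(NamedTuple):
    t: float       # timestamp of the brightness change
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 if brightness increased, -1 if it decreased


def events_from_frames(frames, timestamps, threshold=0.2):
    """Emit an event when a pixel's log intensity moves by >= threshold.

    frames: list of 2D lists of strictly positive intensities.
    threshold: assumed contrast threshold (hypothetical value).
    """
    # Each pixel remembers the log intensity at its last event.
    ref = [[math.log(v) for v in row] for row in frames[0]]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                diff = math.log(v) - ref[y][x]
                if abs(diff) >= threshold:
                    events.append(Event(t, x, y, 1 if diff > 0 else -1))
                    ref[y][x] = math.log(v)
    return events


frames = [[[1.0, 1.0]], [[2.0, 1.0]], [[2.0, 0.5]]]
stream = events_from_frames(frames, timestamps=[0, 1, 2])
```

Pixels whose brightness does not change produce no output at all, which is why the data rate of such a stream depends on scene activity rather than on a fixed frame rate.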
Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video
Object detection is considered one of the most challenging problems in the
field of computer vision, as it involves the combination of object
classification and object localization within a scene. Recently, deep neural
networks (DNNs) have been demonstrated to achieve superior object detection
performance compared to other approaches, with YOLOv2 (an improved You Only
Look Once model) being one of the state-of-the-art DNN-based object
detection methods in terms of both speed and accuracy. Although YOLOv2 can
achieve real-time performance on a powerful GPU, it remains very
challenging to leverage this approach for real-time object detection in
video on embedded computing devices with limited computational power and
limited memory. In this paper, we propose a new framework called Fast YOLO, a
fast You Only Look Once framework which accelerates YOLOv2 to be able to
perform object detection in video on embedded devices in a real-time manner.
First, we leverage the evolutionary deep intelligence framework to evolve the
YOLOv2 network architecture and produce an optimized architecture (referred to
as O-YOLOv2 here) that has 2.8X fewer parameters with just a ~2% IOU drop. To
further reduce power consumption on embedded devices while maintaining
performance, a motion-adaptive inference method is introduced into the proposed
Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2
based on temporal motion characteristics. Experimental results show that the
proposed Fast YOLO framework can reduce the number of deep inferences by an
average of 38.13% and achieve an average speedup of ~3.3X for object detection in
video compared to the original YOLOv2, leading Fast YOLO to run at an average of
~18 FPS on an Nvidia Jetson TX1 embedded system.
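The idea of skipping deep inference when little has moved between frames can be sketched as follows. This is not the paper's motion-adaptive method: it substitutes a plain mean-absolute-difference gate as a stand-in, and `mean_abs_diff`, `motion_gated_detection`, the `detector` callable and the `threshold` value are all hypothetical names and parameters.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized 2D frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def motion_gated_detection(frames, detector, threshold=10.0):
    """Run `detector` only when a frame differs enough from the reference.

    The reference is the last frame on which the detector actually ran;
    frames with little change reuse the cached detections.
    """
    results = []
    reference = None
    cached = None
    inference_count = 0
    for frame in frames:
        if reference is None or mean_abs_diff(frame, reference) > threshold:
            cached = detector(frame)  # expensive deep inference
            reference = frame
            inference_count += 1
        results.append(cached)
    return results, inference_count


# Toy stand-in for a detector: returns the top-left pixel value.
frames = [[[0, 0]], [[1, 1]], [[50, 50]], [[51, 51]]]
results, runs = motion_gated_detection(frames, lambda f: f[0][0])
```

In this toy run the detector is invoked on two of the four frames. The fraction of skipped inferences in practice depends entirely on the video content and the chosen threshold.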
Pushbroom Stereo for High-Speed Navigation in Cluttered Environments
We present a novel stereo vision algorithm that is capable of obstacle
detection on a mobile-CPU processor at 120 frames per second. Our system
performs a subset of standard block-matching stereo processing, searching only
for obstacles at a single depth. By using an onboard IMU and state-estimator,
we can recover the position of obstacles at all other depths, building and
updating a full depth-map at framerate.
Here, we describe both the algorithm and our implementation on a high-speed,
small UAV, flying at over 20 MPH (9 m/s) close to obstacles. The system
requires no external sensing or computation and is, to the best of our
knowledge, the first high-framerate stereo detection system running onboard a
small UAV.
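Searching for obstacles at a single depth amounts to block matching at one fixed disparity instead of scanning a disparity range. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: `sad`, `single_disparity_hits`, `depth_from_disparity`, the block size, the `max_sad` threshold and the camera values are hypothetical, images are assumed rectified, and no texture check is included, so uniform regions would match trivially.

```python
def sad(left, right, x, y, disparity, block):
    """Sum of absolute differences between a left block and the right
    block shifted by `disparity` pixels."""
    total = 0
    for dy in range(block):
        for dx in range(block):
            total += abs(left[y + dy][x + dx]
                         - right[y + dy][x + dx - disparity])
    return total


def single_disparity_hits(left, right, disparity, block=2, max_sad=4):
    """Return top-left corners of left-image blocks that match the right
    image at exactly one disparity, i.e. at one depth."""
    hits = []
    height, width = len(left), len(left[0])
    for y in range(0, height - block + 1, block):
        for x in range(disparity, width - block + 1, block):
            if sad(left, right, x, y, disparity, block) <= max_sad:
                hits.append((x, y))
    return hits


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Standard pinhole stereo relation: depth = focal * baseline / disparity."""
    return focal_px * baseline_m / disparity_px


# Toy scene in which every pixel sits at a disparity of 2.
left = [[5, 1, 9, 3, 7, 2, 8, 4],
        [6, 2, 1, 8, 3, 9, 4, 7]]
right = [[9, 3, 7, 2, 8, 4, 0, 0],
         [1, 8, 3, 9, 4, 7, 0, 0]]
hits_at_2 = single_disparity_hits(left, right, disparity=2)
hits_at_1 = single_disparity_hits(left, right, disparity=1)
```

Checking only one disparity keeps the per-frame cost at a single comparison per block. As the abstract notes, obstacles at other depths are recovered from the IMU and state estimator rather than from additional disparity searches.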