23,657 research outputs found
Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots
Safety is paramount for mobile robotic platforms such as self-driving cars
and unmanned aerial vehicles. This work is devoted to a task that is
indispensable for safety yet was largely overlooked in the past -- detecting
obstacles that are of very thin structures, such as wires, cables and tree
branches. This is a challenging problem, as thin objects can be problematic for
active sensors such as lidar and sonar and even for stereo cameras. In this
work, we propose to use video sequences for thin obstacle detection. We
represent obstacles with edges in the video frames, and reconstruct them in 3D
using efficient edge-based visual odometry techniques. We provide both a
monocular camera solution and a stereo camera solution. The former incorporates
Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter
enjoys a novel, purely vision-based solution. Experiments demonstrated that the
proposed methods are fast and able to detect thin obstacles robustly and
accurately under various conditions.Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Visio
Plane-Based Optimization of Geometry and Texture for RGB-D Reconstruction of Indoor Scenes
We present a novel approach to reconstruct RGB-D indoor scene with plane
primitives. Our approach takes as input a RGB-D sequence and a dense coarse
mesh reconstructed by some 3D reconstruction method on the sequence, and
generate a lightweight, low-polygonal mesh with clear face textures and sharp
features without losing geometry details from the original scene. To achieve
this, we firstly partition the input mesh with plane primitives, simplify it
into a lightweight mesh next, then optimize plane parameters, camera poses and
texture colors to maximize the photometric consistency across frames, and
finally optimize mesh geometry to maximize consistency between geometry and
planes. Compared to existing planar reconstruction methods which only cover
large planar regions in the scene, our method builds the entire scene by
adaptive planes without losing geometry details and preserves sharp features in
the final mesh. We demonstrate the effectiveness of our approach by applying it
onto several RGB-D scans and comparing it to other state-of-the-art
reconstruction methods.Comment: in International Conference on 3D Vision 2018; Models and Code: see
https://github.com/chaowang15/plane-opt-rgbd. arXiv admin note: text overlap
with arXiv:1905.0885
Multimodal Deep Learning for Robust RGB-D Object Recognition
Robust object recognition is a crucial ingredient of many, if not all,
real-world robotics applications. This paper leverages recent progress on
Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture
for object recognition. Our architecture is composed of two separate CNN
processing streams - one for each modality - which are consecutively combined
with a late fusion network. We focus on learning with imperfect sensor data, a
typical problem in real-world robotics tasks. For accurate learning, we
introduce a multi-stage training methodology and two crucial ingredients for
handling depth data with CNNs. The first, an effective encoding of depth
information for CNNs that enables learning without the need for large depth
datasets. The second, a data augmentation scheme for robust learning with depth
images by corrupting them with realistic noise patterns. We present
state-of-the-art results on the RGB-D object dataset and show recognition in
challenging RGB-D real-world noisy settings.Comment: Final version submitted to IROS'2015, results unchanged,
reformulation of some text passages in abstract and introductio
Real-time deep hair matting on mobile devices
Augmented reality is an emerging technology in many application domains.
Among them is the beauty industry, where live virtual try-on of beauty products
is of great importance. In this paper, we address the problem of live hair
color augmentation. To achieve this goal, hair needs to be segmented quickly
and accurately. We show how a modified MobileNet CNN architecture can be used
to segment the hair in real-time. Instead of training this network using large
amounts of accurate segmentation data, which is difficult to obtain, we use
crowd sourced hair segmentation data. While such data is much simpler to
obtain, the segmentations there are noisy and coarse. Despite this, we show how
our system can produce accurate and fine-detailed hair mattes, while running at
over 30 fps on an iPad Pro tablet.Comment: 7 pages, 7 figures, submitted to CRV 201
- …