4,361 research outputs found
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction
Given the recent advances in depth prediction from Convolutional Neural
Networks (CNNs), this paper investigates how predicted depth maps from a deep
neural network can be deployed for accurate and dense monocular reconstruction.
We propose a method where CNN-predicted dense depth maps are naturally fused
together with depth measurements obtained from direct monocular SLAM. Our
fusion scheme privileges depth prediction in image locations where monocular
SLAM approaches tend to fail, e.g. along low-textured regions, and vice-versa.
We demonstrate the use of depth prediction for estimating the absolute scale of
the reconstruction, hence overcoming one of the major limitations of monocular
SLAM. Finally, we propose a framework to efficiently fuse semantic labels,
obtained from a single frame, with dense SLAM, yielding semantically coherent
scene reconstruction from a single view. Evaluation results on two benchmark
datasets show the robustness and accuracy of our approach.Comment: 10 pages, 6 figures, IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (CVPR), Hawaii, USA, June, 2017. The first two
authors contribute equally to this pape
High-Performance and Tunable Stereo Reconstruction
Traditional stereo algorithms have focused their efforts on reconstruction
quality and have largely avoided prioritizing for run time performance. Robots,
on the other hand, require quick maneuverability and effective computation to
observe its immediate environment and perform tasks within it. In this work, we
propose a high-performance and tunable stereo disparity estimation method, with
a peak frame-rate of 120Hz (VGA resolution, on a single CPU-thread), that can
potentially enable robots to quickly reconstruct their immediate surroundings
and maneuver at high-speeds. Our key contribution is a disparity estimation
algorithm that iteratively approximates the scene depth via a piece-wise planar
mesh from stereo imagery, with a fast depth validation step for semi-dense
reconstruction. The mesh is initially seeded with sparsely matched keypoints,
and is recursively tessellated and refined as needed (via a resampling stage),
to provide the desired stereo disparity accuracy. The inherent simplicity and
speed of our approach, with the ability to tune it to a desired reconstruction
quality and runtime performance makes it a compelling solution for applications
in high-speed vehicles.Comment: Accepted to International Conference on Robotics and Automation
(ICRA) 2016; 8 pages, 5 figure
Robust Dense Mapping for Large-Scale Dynamic Environments
We present a stereo-based dense mapping algorithm for large-scale dynamic
urban environments. In contrast to other existing methods, we simultaneously
reconstruct the static background, the moving objects, and the potentially
moving but currently stationary objects separately, which is desirable for
high-level mobile robotic tasks such as path planning in crowded environments.
We use both instance-aware semantic segmentation and sparse scene flow to
classify objects as either background, moving, or potentially moving, thereby
ensuring that the system is able to model objects with the potential to
transition from static to dynamic, such as parked cars. Given camera poses
estimated from visual odometry, both the background and the (potentially)
moving objects are reconstructed separately by fusing the depth maps computed
from the stereo input. In addition to visual odometry, sparse scene flow is
also used to estimate the 3D motions of the detected moving objects, in order
to reconstruct them accurately. A map pruning technique is further developed to
improve reconstruction accuracy and reduce memory consumption, leading to
increased scalability. We evaluate our system thoroughly on the well-known
KITTI dataset. Our system is capable of running on a PC at approximately 2.5Hz,
with the primary bottleneck being the instance-aware semantic segmentation,
which is a limitation we hope to address in future work. The source code is
available from the project website (http://andreibarsan.github.io/dynslam).Comment: Presented at IEEE International Conference on Robotics and Automation
(ICRA), 201
Co-Fusion: Real-time Segmentation, Tracking and Fusion of Multiple Objects
In this paper we introduce Co-Fusion, a dense SLAM system that takes a live
stream of RGB-D images as input and segments the scene into different objects
(using either motion or semantic cues) while simultaneously tracking and
reconstructing their 3D shape in real time. We use a multiple model fitting
approach where each object can move independently from the background and still
be effectively tracked and its shape fused over time using only the information
from pixels associated with that object label. Previous attempts to deal with
dynamic scenes have typically considered moving regions as outliers, and
consequently do not model their shape or track their motion over time. In
contrast, we enable the robot to maintain 3D models for each of the segmented
objects and to improve them over time through fusion. As a result, our system
can enable a robot to maintain a scene description at the object level which
has the potential to allow interactions with its working environment; even in
the case of dynamic scenes.Comment: International Conference on Robotics and Automation (ICRA) 2017,
http://visual.cs.ucl.ac.uk/pubs/cofusion,
https://github.com/martinruenz/co-fusio
- …