5,126 research outputs found
Driven to Distraction: Self-Supervised Distractor Learning for Robust Monocular Visual Odometry in Urban Environments
We present a self-supervised approach to ignoring "distractors" in camera
images for the purposes of robustly estimating vehicle motion in cluttered
urban environments. We leverage offline multi-session mapping approaches to
automatically generate a per-pixel ephemerality mask and depth map for each
input image, which we use to train a deep convolutional network. At run-time we
use the predicted ephemerality and depth as an input to a monocular visual
odometry (VO) pipeline, using either sparse features or dense photometric
matching. Our approach yields metric-scale VO using only a single camera and
can recover the correct egomotion even when 90% of the image is obscured by
dynamic, independently moving objects. We evaluate our robust VO methods on
more than 400km of driving from the Oxford RobotCar Dataset and demonstrate
reduced odometry drift and significantly improved egomotion estimation in the
presence of large moving vehicles in urban traffic.Comment: International Conference on Robotics and Automation (ICRA), 2018.
Video summary: http://youtu.be/ebIrBn_nc-
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-opera- tive morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilites by observ- ing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted in- struments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D opti- cal imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were common at some time, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies made in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM; namely, in the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery
Ego-motion and Surrounding Vehicle State Estimation Using a Monocular Camera
Understanding ego-motion and surrounding vehicle state is essential to enable
automated driving and advanced driving assistance technologies. Typical
approaches to solve this problem use fusion of multiple sensors such as LiDAR,
camera, and radar to recognize surrounding vehicle state, including position,
velocity, and orientation. Such sensing modalities are overly complex and
costly for production of personal use vehicles. In this paper, we propose a
novel machine learning method to estimate ego-motion and surrounding vehicle
state using a single monocular camera. Our approach is based on a combination
of three deep neural networks to estimate the 3D vehicle bounding box, depth,
and optical flow from a sequence of images. The main contribution of this paper
is a new framework and algorithm that integrates these three networks in order
to estimate the ego-motion and surrounding vehicle state. To realize more
accurate 3D position estimation, we address ground plane correction in
real-time. The efficacy of the proposed method is demonstrated through
experimental evaluations that compare our results to ground truth data
available from other sensors including Can-Bus and LiDAR
How do neural networks see depth in single images?
Deep neural networks have lead to a breakthrough in depth estimation from
single images. Recent work often focuses on the accuracy of the depth map,
where an evaluation on a publicly available test set such as the KITTI vision
benchmark is often the main result of the article. While such an evaluation
shows how well neural networks can estimate depth, it does not show how they do
this. To the best of our knowledge, no work currently exists that analyzes what
these networks have learned.
In this work we take the MonoDepth network by Godard et al. and investigate
what visual cues it exploits for depth estimation. We find that the network
ignores the apparent size of known obstacles in favor of their vertical
position in the image. Using the vertical position requires the camera pose to
be known; however we find that MonoDepth only partially corrects for changes in
camera pitch and roll and that these influence the estimated depth towards
obstacles. We further show that MonoDepth's use of the vertical image position
allows it to estimate the distance towards arbitrary obstacles, even those not
appearing in the training set, but that it requires a strong edge at the ground
contact point of the object to do so. In future work we will investigate
whether these observations also apply to other neural networks for monocular
depth estimation.Comment: Submitte
- …