EV-IMO: Motion Segmentation Dataset and Learning Pipeline for Event Cameras
We present the first event-based learning approach for motion segmentation in
indoor scenes and the first event-based dataset - EV-IMO - which includes
accurate pixel-wise motion masks, egomotion and ground truth depth. Our
approach is based on an efficient implementation of the SfM learning pipeline
using a low parameter neural network architecture on event data. In addition to
camera egomotion and a dense depth map, the network estimates pixel-wise
independently moving object segmentation and computes per-object 3D
translational velocities for moving objects. We also train a shallow network
with just 40k parameters, which is able to compute depth and egomotion.
Our EV-IMO dataset features 32 minutes of indoor recording with up to 3 fast
moving objects simultaneously in the camera field of view. The objects and the
camera are tracked by the VICON motion capture system. By 3D scanning the room
and the objects, accurate depth map ground truth and pixel-wise object masks
are obtained, which are reliable even in poor lighting conditions and during
fast motion. We then train and evaluate our learning pipeline on EV-IMO and
demonstrate that our approach far surpasses its rivals and is well suited for
scene constrained robotics applications.
Comment: 8 pages, 6 figures. Submitted to the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019).
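The core segmentation idea described above — labelling as independently moving any pixel whose observed flow disagrees with the flow predicted from depth and camera egomotion — can be sketched with classical pinhole-camera geometry. This is not the authors' learned pipeline; it is a minimal NumPy illustration of the underlying geometric test, and all function names here are hypothetical:

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Optical flow induced by camera egomotion (R, t) for a static
    scene with per-pixel depth, under a pinhole camera with intrinsics K."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    # Back-project pixels to 3D, apply the egomotion, re-project.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    pts2 = R @ pts + t.reshape(3, 1)
    proj = K @ pts2
    proj = proj[:2] / proj[2:3]
    return (proj - pix[:2]).reshape(2, h, w)

def moving_mask(measured_flow, depth, K, R, t, thresh=1.0):
    """Pixels whose measured flow deviates from the egomotion-predicted
    flow by more than `thresh` pixels are labelled independently moving."""
    residual = measured_flow - rigid_flow(depth, K, R, t)
    return np.linalg.norm(residual, axis=0) > thresh
```

In the paper this residual reasoning is absorbed into a network that predicts depth, egomotion, and the motion masks jointly; the sketch only shows why those three quantities constrain each other.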
Depth Estimation Using 2D RGB Images
Single image depth estimation is an ill-posed problem: it is not mathematically possible to uniquely recover the third dimension (depth) from a single 2D image, so additional constraints must be incorporated to regularize the solution space. The first part of this dissertation therefore explores constraining the model for more accurate depth estimation by exploiting the similarity between the RGB image and the corresponding depth map at the geometric edges of the 3D scene.
Although deep learning based methods are very successful in computer vision and handle noise well, they generalize poorly when the test and train distributions differ. Geometric methods, by contrast, do not suffer from this generalization problem, since they exploit temporal information in an unsupervised manner; they are, however, sensitive to noise. At the same time, explicitly modeling dynamic scenes and flexible objects is a major challenge for traditional computer vision methods. Weighing the advantages and disadvantages of each approach, a hybrid method that benefits from both is proposed here, extending traditional geometric models to handle flexible and dynamic objects in the scene. This is made possible by relaxing the geometric constraints from one motion model per region of the scene to one per pixel, which enables the model to detect even small, flexible, floating debris in a dynamic scene. However, it also makes the optimization under-constrained. To turn the optimization from under-constrained to over-constrained while maintaining the model's flexibility, a "moving object detection loss" and a "synchrony loss" are designed. The algorithm is trained in an unsupervised fashion. The preliminary results are not yet comparable to the current state of the art: the training process is slow, which makes a thorough comparison difficult, the algorithm lacks stability, and the optical flow model is extremely noisy and naive. Finally, some solutions are suggested to address these issues.
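The edge-based constraint from the first part — encouraging depth discontinuities to coincide with RGB image edges — is commonly expressed as an edge-aware smoothness penalty. A minimal sketch of that idea follows; the function name and exact weighting are assumptions for illustration, not the dissertation's formulation:

```python
import numpy as np

def edge_aware_smoothness(depth, image):
    """Penalise depth gradients, but down-weight the penalty where the
    RGB image itself has strong gradients (likely geometric edges)."""
    dzdx = np.abs(np.diff(depth, axis=1))
    dzdy = np.abs(np.diff(depth, axis=0))
    # Mean absolute image gradient across colour channels.
    didx = np.mean(np.abs(np.diff(image, axis=1)), axis=2)
    didy = np.mean(np.abs(np.diff(image, axis=0)), axis=2)
    return (dzdx * np.exp(-didx)).mean() + (dzdy * np.exp(-didy)).mean()
```

A depth map whose discontinuities line up with image edges incurs a much smaller penalty than one with discontinuities in textureless regions, which is exactly the regularization the abstract describes.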
DEUX: Active Exploration for Learning Unsupervised Depth Perception
Depth perception models are typically trained on non-interactive datasets
with predefined camera trajectories. However, this often introduces systematic
biases into the learning process that are correlated with the specific camera
paths chosen during data acquisition. In this paper, we investigate how data
collection affects learning depth completion, from a robot navigation perspective,
by leveraging 3D interactive environments. First, we evaluate four depth
completion models trained on data collected using conventional navigation
techniques. Our key insight is that existing exploration paradigms do not
necessarily provide task-specific data points to achieve competent unsupervised
depth completion learning. We then find that data collected with respect to
photometric reconstruction has a direct positive influence on model
performance. As a result, we develop an active, task-informed, depth
uncertainty-based motion planning approach for learning depth completion, which
we call DEpth Uncertainty-guided eXploration (DEUX). Training with data
collected by our approach improves depth completion by more than 18% on
average across four depth completion models, compared to existing exploration
methods, on the MP3D test set. We show that our approach further improves
zero-shot generalization, while offering new insights into integrating robot
learning with learning-based depth estimation.
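The planning idea behind DEUX — steer the robot toward views where the current depth model is least certain — reduces, in its simplest form, to a greedy viewpoint selection over an uncertainty estimate. The sketch below is a hypothetical simplification (the function name and the mask-based visibility model are assumptions), not the paper's planner:

```python
import numpy as np

def select_next_view(candidate_views, uncertainty_map):
    """Greedy uncertainty-guided exploration step: among candidate
    viewpoints (each given as a boolean visibility mask over the map),
    pick the one covering the highest mean predicted depth uncertainty."""
    scores = [uncertainty_map[mask].mean() for mask in candidate_views]
    return int(np.argmax(scores))
```

In the full method the uncertainty would come from the depth completion model itself and the candidates from a navigation planner; the greedy rule above only illustrates the task-informed selection criterion.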