3 research outputs found
Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry
In autonomous driving, monocular sequences carry rich information. Monocular
depth estimation, camera ego-motion estimation, and optical flow estimation
across consecutive frames have recently attracted considerable attention. By
analyzing these tasks, pixels in the middle frame are modeled as three parts:
the rigid region, the non-rigid region, and the occluded region. In joint
unsupervised training of depth and pose, we can segment the occluded region
explicitly. The occlusion information is used in unsupervised learning of
depth, pose and optical flow, as the image reconstructed by depth-pose and
optical flow will be invalid in occluded regions. A less-than-mean mask is
designed to further exclude the mismatched pixels interfered with by motion or
illumination change in the training of depth and pose networks. This method is
also used to exclude some trivial mismatched pixels in the training of the
optical flow network. Maximum normalization is proposed for the depth
smoothness term to restrain depth degradation in textureless regions. In
occluded regions, where depth and camera motion provide more reliable motion
estimates, they are used to guide the unsupervised learning of optical flow. Our
experiments on the KITTI dataset demonstrate that the three-region model, with
full and explicit segmentation into the occluded, rigid, and non-rigid regions
and corresponding unsupervised losses, significantly improves performance on
all three tasks. The source code is available at:
https://github.com/guangmingw/DOPlearning
Comment: Published in: IEEE Transactions on Intelligent Transportation
Systems. DOI: 10.1109/TITS.2020.301041
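The less-than-mean masking idea can be sketched as follows: keep only pixels whose photometric reconstruction error falls below the mean error of the currently valid pixels, so that pixels corrupted by motion or illumination change drop out of the loss. This is an illustrative NumPy sketch under assumed names; the exact per-image formulation in the paper may differ.

```python
import numpy as np

def less_than_mean_mask(photometric_error, valid_mask):
    """Keep only pixels whose reconstruction error is below the mean
    error of the currently valid pixels (a sketch of the paper's
    'less-than-mean' idea, not its exact implementation)."""
    mean_err = photometric_error[valid_mask].mean()
    return valid_mask & (photometric_error < mean_err)

# Toy example: per-pixel photometric errors for a 2x3 image.
err = np.array([[0.1, 0.9, 0.2],
                [0.8, 0.1, 0.3]])
valid = np.ones_like(err, dtype=bool)   # e.g. the non-occluded region
mask = less_than_mean_mask(err, valid)  # high-error pixels are excluded
```

In this toy case the mean error is 0.4, so the two pixels with errors 0.9 and 0.8 are masked out of the photometric loss.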
NccFlow: Unsupervised Learning of Optical Flow With Non-occlusion from Geometry
Optical flow estimation is a fundamental problem in computer vision with many
applications in robot learning and autonomous driving. This paper reveals
novel geometric laws of optical flow based on insight into, and a detailed
definition of, non-occlusion. Two novel loss functions are then proposed for
the unsupervised learning of optical flow based on these geometric laws.
Specifically, after the occluded parts of the images are masked, the motion of
pixels is carefully analyzed and geometric constraints are derived from the
geometric laws of optical flow. First,
neighboring pixels in the first frame do not cross each other during their
displacement to the second frame. Second, when a cluster of four adjacent
pixels in the first frame moves to the second frame, no other pixels flow into
the quadrilateral they form. Based on these two
geometrical constraints, the optical flow non-intersection loss and the optical
flow non-blocking loss in non-occlusion regions are proposed. The two loss
functions penalize irregular and inexact optical flow in non-occlusion
regions. Experiments demonstrate that the proposed
unsupervised losses of optical flow based on the geometric laws in
non-occlusion regions refine the estimated optical flow in detail and improve
the performance of unsupervised optical flow learning. In addition,
experiments that train on synthetic data and evaluate on real data show that
the proposed unsupervised approach improves the generalization ability of the
optical flow network.
Comment: 10 pages, 7 figures, under review
Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies
Since the DARPA Grand Challenges (rural) in 2004/05 and the Urban Challenge in
2007, autonomous driving has been the most active field of AI applications.
Almost at the same time, deep learning achieved breakthroughs through several
pioneers; three of them, Hinton, Bengio, and LeCun, often called the fathers of
deep learning, won the ACM Turing Award in 2019. This is a survey of autonomous
driving technologies
with deep learning methods. We investigate the major fields of self-driving
systems, such as perception, mapping and localization, prediction, planning and
control, simulation, V2X, and safety. Due to limited space, we focus the
analysis on several key areas: 2D and 3D object detection in perception, depth
estimation from cameras, multi-sensor fusion at the data, feature, and task
levels respectively, and behavior modeling and prediction of vehicle driving
and pedestrian trajectories.