Depth estimation from monocular images
This work studies different deep learning architectures for obtaining depth information from monocular RGB images. During this project, state-of-the-art deep learning models were used to estimate depth maps from a monocular RGB image with a teacher-student learning approach.
This paradigm distills the knowledge of high-capacity deep neural networks into shallower ones to make inference faster for real-time applications.
Successful applications of this technique can be found in both natural language processing and computer vision.
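The teacher-student idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the L1 imitation loss, the optional ground-truth term, and the blending weight `alpha` are all assumptions chosen for illustration, and depth maps are represented as plain nested lists rather than tensors.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equally sized depth maps (lists of rows)."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += abs(p - t)
            count += 1
    return total / count


def distillation_loss(student_depth, teacher_depth, gt_depth=None, alpha=0.5):
    """Hypothetical distillation objective for depth regression.

    The student imitates the teacher's depth map; if ground truth is
    available, the two terms are blended with weight alpha (an assumption).
    """
    imitation = l1_loss(student_depth, teacher_depth)
    if gt_depth is None:
        return imitation
    supervised = l1_loss(student_depth, gt_depth)
    return alpha * imitation + (1.0 - alpha) * supervised


# Toy 2x2 depth maps in metres (made-up values for illustration only).
teacher = [[2.0, 4.0], [6.0, 8.0]]
student = [[1.0, 4.0], [6.0, 10.0]]
ground_truth = [[2.0, 5.0], [6.0, 8.0]]

loss_teacher_only = distillation_loss(student, teacher)
loss_blended = distillation_loss(student, teacher, ground_truth, alpha=0.5)
```

In a real training loop the teacher would be a frozen high-capacity network, the student a shallower one, and this loss would be minimized by gradient descent in a deep learning framework.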
Self-Attention Dense Depth Estimation Network for Unrectified Video Sequences
The dense depth estimation of a 3D scene has numerous applications, mainly in
robotics and surveillance. LiDAR and radar sensors are the hardware solution
for real-time depth estimation, but these sensors produce sparse depth maps and
are sometimes unreliable. In recent years, research on depth estimation
from a single 2D image has received a lot of attention. Deep-learning-based
self-supervised depth estimation methods from rectified stereo and
monocular video frames have shown promising results. We propose a
self-attention based depth and ego-motion network for unrectified images. We
also introduce non-differentiable distortion of the camera into the training
pipeline. Our approach performs competitively when compared to other
established approaches that used rectified images for depth estimation.
Vehicle pose estimation using G-Net: multi-class localization and depth estimation
In this paper we present a new network architecture, called G-Net, for 3D pose estimation on RGB images, which is trained in a weakly supervised manner. We introduce a two-step pipeline based on region-based convolutional neural networks (CNNs) for feature localization, bounding box refinement based on non-maximum suppression, and depth estimation. G-Net estimates depth from single monocular images with a self-tuned loss function. Combining this predicted depth with the presented two-step localization allows the extraction of the 3D pose of the object. We show in experiments that our method achieves good results compared to other state-of-the-art approaches that are trained in a fully supervised manner. Peer Reviewed. Postprint (author's final draft)
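The bounding box refinement step mentioned above relies on non-maximum suppression. The following is a minimal sketch of the standard greedy variant, not the G-Net code: the box format `(x1, y1, x2, y2)`, the function names, and the default IoU threshold of 0.5 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    Visits boxes from highest to lowest score and keeps a box only if it
    does not overlap any already kept box by more than iou_threshold.
    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections and one separate detection (toy values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

A lower threshold suppresses more aggressively; in a multi-class setting such as vehicle part localization, suppression would typically be run separately per class.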