Parallax Motion Effect Generation Through Instance Segmentation And Depth Estimation
Stereo vision is a growing topic in computer vision because of the many
opportunities and applications this technology offers for modern solutions,
such as virtual and augmented reality. Motion parallax estimation is a
promising technique for enhancing the user's experience in three-dimensional
virtual environments.
In this paper, we propose an algorithm for generating parallax motion effects
from a single image, taking advantage of state-of-the-art instance segmentation
and depth estimation approaches. We also compare these approaches to
investigate the trade-off between the efficiency and the quality of the
parallax motion effects, taking into consideration a multi-task learning
network capable of performing instance segmentation and depth estimation at
once. Experimental results and a visual quality assessment indicate that the
PyD-Net network (depth estimation) combined with the Mask R-CNN or FBNet
networks (instance segmentation) can produce parallax motion effects with good
visual quality.

Comment: 2020 IEEE International Conference on Image Processing (ICIP), Abu
Dhabi, United Arab Emirates
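The core idea — using an instance mask and a depth map to shift near objects more than far ones — can be illustrated with a toy layered-shift sketch. This is a minimal illustration in NumPy, not the paper's actual pipeline: the `parallax_shift` function, its `max_shift` parameter, and the single-instance, no-inpainting assumptions are all illustrative choices.

```python
import numpy as np

def parallax_shift(image, depth, mask, max_shift=8):
    """Shift one segmented instance horizontally, with the shift inversely
    proportional to its mean depth, and composite it over the original image.
    Disocclusion holes are left as-is (real systems inpaint them)."""
    h, w = depth.shape
    # Nearer objects (smaller depth) receive a larger parallax shift.
    inst_depth = depth[mask].mean()
    shift = int(round(max_shift / max(inst_depth, 1e-6)))
    shift = min(shift, max_shift)

    out = image.copy()
    # Translate the masked foreground pixels by `shift` columns.
    ys, xs = np.nonzero(mask)
    new_xs = np.clip(xs + shift, 0, w - 1)
    out[ys, new_xs] = image[ys, xs]
    return out, shift
```

Animating `max_shift` over a few frames (e.g. a sinusoidal sweep) yields the parallax motion effect from a single still image.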
Fast Depth Estimation in a Single Image Using Lightweight Efficient Neural Network
Depth estimation is a crucial and fundamental problem in computer vision. Conventional methods reconstruct scenes using feature points extracted from multiple images; however, these approaches require multiple images and thus are not easily implemented in many real-time applications. Moreover, the special equipment required by hardware-based approaches using 3D sensors is expensive. Therefore, software-based methods that estimate depth from a single image using machine learning or deep learning are emerging as alternatives. In this paper, we propose an algorithm that generates a depth map in real time from a single image using an optimized lightweight efficient neural network (L-ENet) instead of physical equipment such as an infrared sensor or a multi-view camera. Because depth values are continuous and can produce locally ambiguous results, pixel-wise prediction with ordinal depth range classification was applied in this study. In addition, our method applies various convolution techniques to extract a dense feature map and greatly reduces the number of parameters by reducing the number of network layers. Using the proposed L-ENet, an accurate depth map can be generated quickly from a single image, producing depth values close to the ground truth with small errors. Experiments confirmed that the proposed L-ENet achieves significantly improved performance over state-of-the-art algorithms for depth estimation from a single image.
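Ordinal depth range classification turns continuous depth regression into classification over ordered bins. One common scheme is spacing-increasing discretization (bins widen with distance, as popularized by DORN); the sketch below assumes that scheme, which may differ from L-ENet's exact binning, and the function names are illustrative.

```python
import numpy as np

def sid_thresholds(d_min, d_max, K):
    """Spacing-increasing discretization: K ordinal bins whose widths grow
    with depth, so near-range errors matter more than far-range ones."""
    i = np.arange(K + 1)
    return np.exp(np.log(d_min) + i * (np.log(d_max) - np.log(d_min)) / K)

def depth_to_ordinal(depth, thresholds):
    """Map continuous depth values to ordinal bin indices 0..K-1."""
    return np.clip(np.searchsorted(thresholds, depth, side='right') - 1,
                   0, len(thresholds) - 2)

def ordinal_to_depth(bins, thresholds):
    """Decode a predicted bin index to the geometric mean of its bin edges."""
    return np.sqrt(thresholds[bins] * thresholds[bins + 1])
```

At inference time the network predicts a bin per pixel, and `ordinal_to_depth` converts the bin map back to a metric depth map; the geometric-mean decode keeps the reconstruction error bounded by the (log-uniform) bin width.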