Non-blind Image Restoration Based on Convolutional Neural Network
Blind image restoration processors based on convolutional neural networks
(CNNs) are being intensively researched because of their high performance.
However, they are highly sensitive to perturbations of the degradation model:
they easily fail to restore an image whose degradation model differs even
slightly from the one they were trained on. In this paper, we propose a
non-blind CNN-based image restoration processor that aims to be more robust
against perturbations of the degradation model than blind restoration
processors. Experimental comparisons demonstrate that the proposed non-blind
CNN-based image restoration processor restores images more robustly than
existing blind CNN-based image restoration processors.
Comment: Accepted by IEEE 7th Global Conference on Consumer Electronics, 201
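As an illustration of the non-blind setting (the restorer is *given* the degradation model rather than having to infer it), here is a classical frequency-domain sketch with a known Gaussian blur kernel. This is not the paper's CNN processor, only a minimal stand-in for the non-blind idea:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2D Gaussian blur kernel (the assumed degradation model)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def wiener_restore(blurred, kernel, reg=1e-3):
    """Non-blind restoration: the true blur kernel is handed to the restorer."""
    H = np.fft.fft2(kernel, s=blurred.shape)          # transfer function of the blur
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + reg)   # regularized inverse filter
    return np.real(np.fft.ifft2(F_hat))

# Toy demo: blur a smooth image with a *known* kernel, then restore it.
x = np.linspace(0.0, 1.0, 32, endpoint=False)
img = np.sin(2 * np.pi * 2 * x)[None, :] * np.cos(2 * np.pi * 3 * x)[:, None]
k = gaussian_kernel(7, 1.5)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k, s=img.shape)))
restored = wiener_restore(blurred, k)
```

Because the exact kernel is supplied, the inverse filter cancels the degradation almost perfectly; a blind method must instead estimate or implicitly memorize the kernel, which is why a small mismatch between the trained and actual degradation model hurts it.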
EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity
Self-supervised monocular scene flow estimation, aiming to understand both 3D
structures and 3D motions from two temporally consecutive monocular images, has
received increasing attention for its simple and economical sensor setup.
However, the accuracy of current methods suffers from less efficient network
architectures and from the lack of motion rigidity as a regularizer. In this
paper, we propose a superior model named EMR-MSF that borrows the strengths of
network architecture designs from the supervised-learning literature. We
further impose explicit and robust geometric
constraints with an elaborately constructed ego-motion aggregation module where
a rigidity soft mask is proposed to filter out dynamic regions for stable
ego-motion estimation using static regions. Moreover, we propose a motion
consistency loss along with a mask regularization loss to fully exploit static
regions. Several efficient training strategies are integrated including a
gradient detachment technique and an enhanced view synthesis process for better
performance. Our proposed method outperforms the previous self-supervised works
by a large margin and catches up to the performance of supervised methods. On
the KITTI scene flow benchmark, our approach improves the SF-all metric of the
state-of-the-art self-supervised monocular method by 44% and demonstrates
superior performance across sub-tasks, including depth and visual odometry,
compared with other self-supervised single-task and multi-task methods.
Comment: To appear at ICCV 202
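A toy sketch of the ego-motion-aggregation idea: per-pixel rigid-motion estimates are averaged under a soft mask that suppresses dynamic regions. In the paper both the module and the rigidity mask are learned; the exponential weighting of a reprojection residual below is a hypothetical stand-in:

```python
import numpy as np

def aggregate_ego_motion(motion_field, residual, tau=1.0):
    """Average per-pixel ego-motion estimates, downweighting dynamic pixels.

    motion_field: (H, W, 6) per-pixel ego-motion estimates
    residual:     (H, W) rigid-reprojection residual; large where the
                  static-scene assumption is violated (dynamic regions)
    """
    w = np.exp(-residual / tau)   # rigidity soft mask, near 1 on static pixels
    w = w / w.sum()               # normalize to a weighted average
    return np.tensordot(w, motion_field, axes=([0, 1], [0, 1]))

# Static background with one moving object corrupting a region.
H, W = 8, 8
true_motion = np.array([0.1, 0.0, 0.0, 0.0, 0.0, 0.5])
field = np.tile(true_motion, (H, W, 1))
res = np.zeros((H, W))
field[2:5, 2:5] += 1.0   # wrong estimate on the dynamic object
res[2:5, 2:5] = 10.0     # large rigid residual flags it as dynamic
agg = aggregate_ego_motion(field, res)
```

The soft mask lets the dynamic region contribute almost nothing, so the aggregate stays close to the true ego-motion, whereas an unmasked mean would be biased by the moving object.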
Automatic Labeled LiDAR Data Generation based on Precise Human Model
Following improvements in deep neural networks, state-of-the-art networks
have been proposed for human recognition using point clouds captured by LiDAR.
However, the performance of these networks strongly depends on the training
data. One issue in collecting training data is labeling: labeling by humans
is necessary to obtain ground-truth labels, but it requires huge costs.
Therefore, we propose an automatic labeled-data generation pipeline in which
any parameter or data generation environment can be changed. Our approach uses
a human model named Dhaiba and a background model of Miraikan, and consequently
generates realistic artificial data. We present 500k+ data samples generated by
the proposed pipeline. This paper also describes the specification of the
pipeline and details of the data, with evaluations of various approaches.
Comment: Accepted at ICRA201
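The key point is that in simulation the human model's geometry and placement are known exactly, so labels come for free from a geometric test instead of manual annotation. A minimal sketch, with a hypothetical axis-aligned box standing in for the posed Dhaiba mesh:

```python
import numpy as np

def auto_label(points, box_min, box_max):
    """Label each simulated LiDAR point: 1 = human, 0 = background.

    In simulation the human model's placement is known exactly, so a simple
    point-in-volume test replaces costly manual annotation. (An axis-aligned
    box is used here for brevity; a real pipeline would test against the
    posed human mesh.)
    """
    inside = np.all((points >= box_min) & (points <= box_max), axis=1)
    return inside.astype(np.int64)

# Toy scan: one point inside the human volume, one in the background.
pts = np.array([[0.0, 0.0, 1.0],
                [5.0, 5.0, 0.5]])
labels = auto_label(pts,
                    np.array([-0.5, -0.5, 0.0]),
                    np.array([0.5, 0.5, 1.8]))
```

Because every parameter of the scene is scriptable, the same test yields consistent ground truth across arbitrarily many generated scans.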
Polarimetric Multi-View Inverse Rendering
A polarization camera has great potential for 3D reconstruction since the
angle of polarization (AoP) and the degree of polarization (DoP) of reflected
light are related to an object's surface normal. In this paper, we propose a
novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering
(Polarimetric MVIR) that effectively exploits geometric, photometric, and
polarimetric cues extracted from input multi-view color-polarization images. We
first estimate camera poses and an initial 3D model by geometric reconstruction
with a standard structure-from-motion and multi-view stereo pipeline. We then
refine the initial model by optimizing photometric rendering errors and
polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose
a novel polarimetric cost function that enables an effective constraint on the
estimated surface normal of each vertex while accounting for the four
ambiguous azimuth angles arising from the AoP measurement. The weight for the
polarimetric cost is effectively determined based on the DoP measurement, which
is regarded as the reliability of polarimetric information. Experimental
results using both synthetic and real data demonstrate that our Polarimetric
MVIR can reconstruct a detailed 3D shape without assuming a specific surface
material or lighting condition.
Comment: Paper accepted in IEEE Transactions on Pattern Analysis and Machine
Intelligence (2022). arXiv admin note: substantial text overlap with
arXiv:2007.0883
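The four-fold ambiguity comes from the pi-ambiguity of the AoP combined with the 90-degree shift between diffuse- and specular-dominant polarization. A small sketch of such a candidate-based, DoP-weighted cost (an illustrative form, not the paper's exact function):

```python
import numpy as np

def azimuth_candidates(aop):
    """Four azimuth candidates consistent with one AoP measurement:
    the pi ambiguity of the polarization angle, combined with the
    90-degree shift between diffuse and specular reflection."""
    return (aop + np.array([0.0, 0.5, 1.0, 1.5]) * np.pi) % (2.0 * np.pi)

def polarimetric_cost(azimuth, aop, dop):
    """Squared angular distance to the nearest candidate, weighted by the
    DoP, which serves as the reliability of the polarization measurement."""
    diff = np.angle(np.exp(1j * (azimuth - azimuth_candidates(aop))))
    return dop * np.min(np.abs(diff)) ** 2
```

An estimated azimuth matching any of the four candidates incurs zero cost, so the constraint never forces a wrong disambiguation, while a low DoP (unreliable polarization) scales the whole term down.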