Mask-guided Style Transfer Network for Purifying Real Images
Recently, progress in learning-by-synthesis has made it possible to train models
on synthetic images, which can effectively reduce the cost of human and
material resources. However, because the distribution of synthetic images
differs from that of real images, the desired performance cannot be achieved.
To address this problem, previous methods learned a model to improve the
realism of synthetic images. In contrast to those methods, this paper tries to
purify real images by extracting discriminative and robust features, converting
outdoor real images into indoor synthetic images. In this paper, we first
introduce segmentation masks to construct RGB-mask pairs as inputs, then we
design a mask-guided style transfer network that learns style features
separately from the attention and background regions and learns content
features from the full image and the attention region. Moreover, we propose a
novel region-level task-guided loss to constrain the features learned for style
and content. Experiments combining qualitative and quantitative methods were
performed to demonstrate the possibility of purifying real images in complex
directions. We evaluate the proposed method on several public datasets,
including LPW, COCO and MPIIGaze. Experimental results show that the proposed
method is effective and achieves state-of-the-art results.
Comment: arXiv admin note: substantial text overlap with arXiv:1903.0582
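As a loose illustration of the RGB-mask pairing and region separation the abstract describes (the function names and shapes below are hypothetical, and the actual network is not reproduced here), the input construction can be sketched in numpy:

```python
import numpy as np

def make_rgb_mask_pair(rgb, mask):
    """Stack an RGB image (H, W, 3) with a binary segmentation mask (H, W)
    into a single 4-channel RGB-mask input (H, W, 4)."""
    return np.concatenate([rgb, mask[..., None]], axis=-1)

def split_regions(features, mask):
    """Split a feature map into attention (mask == 1) and background
    (mask == 0) regions, so style can be learned per region."""
    attention = features * mask[..., None]
    background = features * (1.0 - mask[..., None])
    return attention, background

# Toy example on a 4x4 image with a centered attention region.
rgb = np.random.rand(4, 4, 3)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
pair = make_rgb_mask_pair(rgb, mask)   # 4-channel RGB-mask input
att, bkg = split_regions(rgb, mask)    # disjoint region-wise views
```

The two region views partition the image, so any per-region style statistic (e.g. a feature mean) can be computed independently for the attention and background areas.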
Learning to Extract Motion from Videos in Convolutional Neural Networks
This paper shows how to extract dense optical flow from videos with a
convolutional neural network (CNN). The proposed model constitutes a potential
building block for deeper architectures to allow using motion without resorting
to an external algorithm, e.g. for recognition in videos. We derive our network
architecture from signal processing principles to provide desired invariances
to image contrast, phase and texture. We constrain weights within the network
to enforce strict rotation invariance and substantially reduce the number of
parameters to learn. We demonstrate end-to-end training on only 8 sequences of
the Middlebury dataset, orders of magnitude less than competing CNN-based
motion estimation methods, and obtain comparable performance to classical
methods on the Middlebury benchmark. Importantly, our method outputs a
distributed representation of motion that allows representing multiple,
transparent motions, and dynamic textures. Our contributions on network design
and rotation invariance offer insights that are not specific to motion estimation.
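One common way to constrain weights for strict rotation invariance, as the abstract mentions, is to derive a whole filter bank from a single learned kernel via rotations, which also divides the parameter count. A minimal numpy sketch of this idea (not the paper's actual architecture) using 90-degree rotations:

```python
import numpy as np

def rotated_filter_bank(base):
    """Expand one learned kernel into four filters by 90-degree rotations.
    Sharing a single base kernel cuts the parameter count by 4x, and the
    bank's set of responses is invariant to rotating the input patch."""
    return np.stack([np.rot90(base, k) for k in range(4)])

base = np.arange(9, dtype=float).reshape(3, 3)  # the only learned weights
bank = rotated_filter_bank(base)                # shape (4, 3, 3)

# Rotating the input permutes which filter fires, but the multiset of
# responses is unchanged.
patch = np.random.rand(3, 3)
responses = sorted(float(np.sum(f * patch)) for f in bank)
rotated = sorted(float(np.sum(f * np.rot90(patch))) for f in bank)
```

Taking, say, the maximum over the bank then yields a single rotation-invariant response from one set of learned parameters.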
Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data
In this work, we propose a robust network-in-the-loop control system for autonomous navigation and landing of an Unmanned Aerial Vehicle (UAV). To estimate the UAV's absolute pose, we develop a deep neural network (DNN) architecture for visual-inertial odometry, which provides a robust alternative to traditional methods. We first evaluate the accuracy of the estimation by comparing the predictions of our model to traditional visual-inertial approaches on the publicly available EuRoC MAV dataset. The results indicate a clear improvement in pose-estimation accuracy of up to 25% over the baseline. Finally, we integrate the data-driven estimator into the closed-loop flight control system of AirSim, a simulator available as a plugin for Unreal Engine, and we provide simulation results for autonomous navigation and landing.
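The fusion step of such a visual-inertial pose regressor can be sketched very roughly as follows. All dimensions, layer sizes, and the random weights are hypothetical, chosen only to show the input/output shapes of a DNN that regresses an absolute pose (translation plus unit quaternion) from fused visual and inertial inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 128-D visual embedding fused with a 6-D IMU
# reading (3-axis accelerometer + 3-axis gyroscope); real layouts differ.
VIS_DIM, IMU_DIM, HID, POSE_DIM = 128, 6, 64, 7  # pose = xyz + quaternion

W1 = rng.standard_normal((VIS_DIM + IMU_DIM, HID)) * 0.05
W2 = rng.standard_normal((HID, POSE_DIM)) * 0.05

def predict_pose(visual_feat, imu):
    """Fuse visual and inertial inputs and regress a 7-D absolute pose:
    a 3-D translation and a unit quaternion. Purely illustrative weights."""
    x = np.concatenate([visual_feat, imu])   # simple concatenation fusion
    h = np.tanh(x @ W1)                      # one hidden fusion layer
    out = h @ W2
    t, q = out[:3], out[3:]
    q = q / (np.linalg.norm(q) + 1e-8)       # normalize the quaternion
    return t, q

t, q = predict_pose(rng.standard_normal(VIS_DIM), rng.standard_normal(IMU_DIM))
```

Normalizing the quaternion output is a standard trick in pose regression so the rotation part always lies on the unit sphere regardless of the raw network output.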