22,799 research outputs found
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, however, this leads to
unnatural blurring and stretching of objects at further distance, due to the
resolution of the camera, limiting applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
End-to-end 3D face reconstruction with deep neural networks
Monocular 3D facial shape reconstruction from a single 2D facial image has
been an active research area due to its wide applications. Inspired by the
success of deep neural networks (DNN), we propose a DNN-based approach for
End-to-End 3D FAce Reconstruction (UH-E2FAR) from a single 2D image. Different
from recent works that reconstruct and refine the 3D face in an iterative
manner using both an RGB image and an initial 3D facial shape rendering, our
DNN model is end-to-end, and thus the complicated 3D rendering process can be
avoided. Moreover, we integrate in the DNN architecture two components, namely
a multi-task loss function and a fusion convolutional neural network (CNN) to
improve facial expression reconstruction. With the multi-task loss function, 3D
face reconstruction is divided into neutral 3D facial shape reconstruction and
expressive 3D facial shape reconstruction. The neutral 3D facial shape is
class-specific. Therefore, higher layer features are useful. In comparison, the
expressive 3D facial shape favors lower or intermediate layer features. With
the fusion-CNN, features from different intermediate layers are fused and
transformed for predicting the 3D expressive facial shape. Through extensive
experiments, we demonstrate the superiority of our end-to-end framework in
improving the accuracy of 3D face reconstruction.Comment: Accepted to CVPR1
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
- …