PoseConvGRU: A Monocular Approach for Visual Ego-motion Estimation by Learning
While many visual ego-motion algorithm variants have been proposed in the
past decade, learning-based ego-motion estimation methods have attracted
increasing attention because of their robustness to image noise and their
independence from camera calibration. In this work, we propose a fully
trainable, data-driven approach to visual ego-motion estimation for a
monocular camera. We take an end-to-end learning approach, allowing the model
to map directly from input image pairs to an estimate of ego-motion
(parameterized as 6-DoF transformation matrices). To achieve this, we
introduce PoseConvGRU, a novel two-module Long-term Recurrent Convolutional
Network trained with an explicit sequence pose estimation loss.
The feature-encoding module encodes the short-term motion features in an image
pair, while the memory-propagating module captures the long-term motion
features across consecutive image pairs. The visual memory is implemented with
convolutional gated recurrent units, which allow information to propagate over
time. At each time step, two consecutive RGB images are stacked together to
form a six-channel tensor, from which the first module learns to extract
motion information and estimate poses. The sequence of output feature maps is
then passed through a stacked ConvGRU module to generate the relative
transformation pose of each image pair. We also augment the training data by
randomly skipping frames to simulate velocity variation, which yields better
performance in turning and high-velocity situations. We evaluate the
performance of our proposed approach on the KITTI Visual Odometry benchmark.
The experiments show that the proposed method performs competitively with
geometric methods and encourage further exploration of learning-based methods
for estimating camera ego-motion, even though geometric methods still
demonstrate promising results.
Comment: 33 pages, 12 figures
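As a concrete illustration of the two-module design described above, here is a minimal PyTorch sketch: consecutive frames are stacked into a six-channel tensor, encoded by a small CNN, propagated through a convolutional GRU cell, and regressed to a pose vector per image pair. All layer sizes, channel counts, and class names are illustrative assumptions rather than the architecture from the paper, and a 6-vector (3 translation + 3 rotation parameters) stands in for the 6-DoF transformation matrices.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Minimal convolutional GRU cell: gates are computed with 2D
    convolutions so the hidden state keeps its spatial layout."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # update + reset
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)       # candidate state
        self.hid_ch = hid_ch

    def forward(self, x, h):
        if h is None:  # zero-initialize the visual memory on the first pair
            h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

class PoseConvGRUSketch(nn.Module):
    """Hypothetical two-module layout: CNN encoder over stacked pairs,
    ConvGRU memory, and a 6-DoF pose regression head."""
    def __init__(self):
        super().__init__()
        # Module 1: feature encoder over a stacked image pair (6 input channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Module 2: ConvGRU propagates long-term motion memory across pairs.
        self.gru = ConvGRUCell(128, 128)
        # Regress a 6-DoF pose (3 translation + 3 rotation) per pair.
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 6))

    def forward(self, frames):  # frames: (B, T, 3, H, W)
        h, poses = None, []
        for t in range(frames.size(1) - 1):
            pair = torch.cat([frames[:, t], frames[:, t + 1]], dim=1)  # six-channel tensor
            h = self.gru(self.encoder(pair), h)
            poses.append(self.head(h))
        return torch.stack(poses, dim=1)  # (B, T-1, 6) relative poses

# Example: a batch of 2 sequences of 5 frames yields 4 relative poses each.
poses = PoseConvGRUSketch()(torch.randn(2, 5, 3, 64, 64))
print(poses.shape)  # torch.Size([2, 4, 6])
```

The frame-skip augmentation described in the abstract would, in this setting, amount to subsampling the frame index before building pairs (e.g. frames[:, ::2]) to mimic higher vehicle velocity.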
Differential Viewpoints for Ground Terrain Material Recognition
Computational surface modeling that underlies material recognition has
transitioned from reflectance modeling using in-lab controlled radiometric
measurements to image-based representations built from internet-mined single-view
images captured in the scene. We take a middle-ground approach for material
recognition that takes advantage of both rich radiometric cues and flexible
image capture. A key concept is differential angular imaging, where small
angular variations in image capture enable angular-gradient features for an
enhanced appearance representation that improves recognition. We build a
large-scale material database, the Ground Terrain in Outdoor Scenes (GTOS)
database, to support ground terrain recognition for applications such as
autonomous driving and robot navigation. The database consists of over 30,000
images covering 40 classes of outdoor ground terrain under varying weather and
lighting conditions. To fully leverage this large dataset, we develop a novel
approach for material recognition, the texture-encoded angular network (TEAN),
which combines deep encoding pooling of RGB information with differential
angular images to produce angular-gradient features. With this novel network
architecture, we extract characteristics of materials encoded in the angular
and spatial gradients of their appearance. Our results show that TEAN achieves
recognition performance that surpasses both single-view performance and
standard (non-differential/large-angle sampling) multiview performance.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(TPAMI). arXiv admin note: substantial text overlap with arXiv:1612.0237
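To make the differential angular imaging idea concrete, here is a minimal PyTorch sketch under stated assumptions: the two views are already aligned, the angular gradient is approximated by a simple finite difference of the image pair, and plain global average pooling stands in for the deep encoding pooling that TEAN actually uses. All module and function names are hypothetical.

```python
import torch
import torch.nn as nn

def differential_angular_image(view_a, view_b):
    """Approximate the angular gradient of appearance by a finite
    difference of two aligned views taken at a small angular offset."""
    return view_b - view_a

class TEANSketch(nn.Module):
    """Two-branch sketch: one branch encodes the RGB view, the other the
    differential angular image; pooled features are fused for classification."""
    def __init__(self, n_classes=40):  # 40 outdoor ground terrain classes in GTOS
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # stand-in for encoding pooling
            )
        self.rgb_branch = branch()
        self.diff_branch = branch()
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, view_a, view_b):
        diff = differential_angular_image(view_a, view_b)
        feats = torch.cat([self.rgb_branch(view_a), self.diff_branch(diff)], dim=1)
        return self.classifier(feats)

# Example: classify a pair of 128x128 views into the 40 GTOS classes.
logits = TEANSketch()(torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128))
print(logits.shape)  # torch.Size([1, 40])
```

The two-branch fusion mirrors the idea of combining spatial appearance (the RGB branch) with angular-gradient cues (the difference branch); in the actual TEAN, an encoding-pooling layer captures texture statistics rather than a simple global average.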