Adversarial Training for Adverse Conditions: Robust Metric Localisation using Appearance Transfer
We present a method of improving visual place recognition and metric
localisation under very strong appearance change. We learn an invertible
generator that can transform the conditions of images, e.g. from day to
night or from summer to winter. This image-transforming filter is explicitly
designed to aid and abet feature-matching using a new loss based on SURF
detector and dense descriptor maps. A network is trained to output synthetic
images optimised for feature matching given only an input RGB image, and these
generated images are used to localise the robot against a previously built map
using traditional sparse matching approaches. We benchmark our results using
multiple traversals of the Oxford RobotCar Dataset over a year-long period,
using one traversal as a map and the other to localise. We show that this
method significantly improves place recognition and localisation under changing
and adverse conditions, while reducing the number of mapping runs needed to
successfully achieve reliable localisation.
Comment: Accepted at ICRA201
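The "traditional sparse matching" step used to localise against the map can be sketched as follows. This is a minimal numpy illustration of nearest-neighbour descriptor matching with Lowe's ratio test, not the paper's exact SURF-based pipeline; the function name and the ratio threshold are illustrative assumptions.

```python
import numpy as np

def match_descriptors(des_q, des_m, ratio=0.8):
    """Nearest-neighbour matching with a ratio test.

    des_q, des_m: (N, D) arrays of local feature descriptors, e.g. from
    the query image (here, the generator's condition-transformed output)
    and from the previously built map. A query descriptor is matched
    only if its nearest map descriptor is clearly closer than the
    second nearest, which rejects ambiguous correspondences.
    """
    matches = []
    for i, d in enumerate(des_q):
        dists = np.linalg.norm(des_m - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:  # unambiguous match only
            matches.append((i, j))
    return matches
```

In a full localisation pipeline, the surviving matches would feed a geometric verification step (e.g. RANSAC over a pose model) to produce the metric estimate.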
Methods for the automatic alignment of colour histograms
Colour provides important information in many image processing tasks such as object identification and
tracking. Different images of the same object frequently yield different colour values due to undesired
variations in lighting and the camera. In practice, controlling the source of these fluctuations is difficult,
uneconomical or even impossible in a particular imaging environment. This thesis is concerned with the
question of how to best align the corresponding clusters of colour histograms to reduce or remove the
effect of these undesired variations.
We introduce feature based histogram alignment (FBHA) algorithms that enable flexible alignment
transformations to be applied. The FBHA approach has three steps: 1) feature detection in the colour
histograms, 2) feature association, and 3) feature alignment. We investigate the choices for these three
steps on two colour databases: 1) a structured and labelled database of RGB imagery acquired under controlled
camera, lighting and object variation and 2) grey-level video streams from an industrial inspection
application. The design and acquisition of the RGB image and grey-level video databases are a key contribution
of the thesis. The databases are used to quantitatively compare the FBHA approach against
existing methodologies and show it to be effective. FBHA is intended to provide a generic method for
aligning colour histograms: it uses only information from the histograms and therefore ignores spatial
information in the image. Spatial information and other context-sensitive cues are deliberately avoided
to maintain the generic nature of the algorithm; by ignoring some of this important information we gain
useful insights into the performance limits of a colour alignment algorithm that works from the colour
histogram alone, and this helps us understand the limits of a generic approach to colour alignment.
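The three FBHA steps can be sketched with a deliberately simple choice of histogram feature. In this hedged illustration the features are local maxima (peaks), association is by rank order, and alignment is a piecewise-linear remapping of bin indices; the thesis evaluates richer choices for each step, and all names here are illustrative, not its API.

```python
import numpy as np

def fbha_align(src_hist, ref_hist):
    """Feature-based histogram alignment (FBHA) sketch.

    1) Detect features: interior bins that are strict local maxima.
    2) Associate features: pair source and reference peaks in rank order.
    3) Align: build a piecewise-linear mapping of source bin indices
       onto reference bin indices that sends each source peak to its
       associated reference peak, pinning the histogram endpoints.
    Returns, for each source bin, its mapped reference-bin position.
    """
    def peaks(h):
        return [i for i in range(1, len(h) - 1)
                if h[i] > h[i - 1] and h[i] > h[i + 1]]

    ps, pr = peaks(src_hist), peaks(ref_hist)
    n = min(len(ps), len(pr))            # associate in rank order
    xs = [0] + ps[:n] + [len(src_hist) - 1]
    ys = [0] + pr[:n] + [len(ref_hist) - 1]
    return np.interp(np.arange(len(src_hist)), xs, ys)
```

Applying the returned mapping to pixel values (rather than to the histogram alone) would then transfer the source image's colours toward the reference, which is the use case the thesis targets.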
FML: Face Model Learning from Videos
Monocular image-based 3D reconstruction of faces is a long-standing problem
in computer vision. Since image data is a 2D projection of a 3D face, the
resulting depth ambiguity makes the problem ill-posed. Most existing methods
rely on data-driven priors that are built from limited 3D face scans. In
contrast, we propose multi-frame video-based self-supervised training of a deep
network that (i) learns a face identity model both in shape and appearance
while (ii) jointly learning to reconstruct 3D faces. Our face model is learned
using only corpora of in-the-wild video clips collected from the Internet. This
virtually endless source of training data enables learning of a highly general
3D face model. In order to achieve this, we propose a novel multi-frame
consistency loss that ensures consistent shape and appearance across multiple
frames of a subject's face, thus minimizing depth ambiguity. At test time we
can use an arbitrary number of frames, so that we can perform both monocular as
well as multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ,
Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19
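The idea behind the multi-frame consistency loss can be sketched with a minimal stand-in: since all frames of a clip show the same subject, per-frame predictions of the identity's shape parameters should agree. The penalty below (variance of the predictions across frames) is an assumption-laden simplification, not the paper's exact formulation, which also enforces appearance and photometric consistency.

```python
import numpy as np

def multi_frame_consistency(shape_preds):
    """Penalise disagreement between per-frame shape predictions.

    shape_preds: (F, P) array of predicted shape parameters for the
    F frames of one subject's clip. Identical predictions give zero
    loss; spread across frames is penalised, discouraging the network
    from explaining one face with frame-dependent geometry (the depth
    ambiguity discussed in the abstract).
    """
    mean = shape_preds.mean(axis=0, keepdims=True)
    return float(np.mean((shape_preds - mean) ** 2))
```

In training, such a term would be summed with per-frame reconstruction losses, so the network is rewarded both for fitting each frame and for committing to a single identity.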