
    Adversarial Training for Adverse Conditions: Robust Metric Localisation using Appearance Transfer

    We present a method of improving visual place recognition and metric localisation under very strong appearance change. We learn an invertible generator that can transform the conditions of images, e.g. from day to night or summer to winter. This image-transforming filter is explicitly designed to aid and abet feature matching using a new loss based on SURF detector and dense descriptor maps. A network is trained to output synthetic images optimised for feature matching given only an input RGB image, and these generated images are used to localise the robot against a previously built map using traditional sparse matching approaches. We benchmark our results using multiple traversals of the Oxford RobotCar Dataset over a year-long period, using one traversal as a map and the other to localise. We show that this method significantly improves place recognition and localisation under changing and adverse conditions, while reducing the number of mapping runs needed to achieve reliable localisation.
    Comment: Accepted at ICRA201
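    The following is a hypothetical illustration, not the authors' implementation: a differentiable stand-in for a detector-response consistency loss, assuming PyTorch and a generator G that maps a source-condition image to a target-condition image. The paper's loss is built on SURF detector and dense descriptor maps; here a simple finite-difference determinant-of-Hessian response replaces the SURF box-filter approximation.

    import torch
    import torch.nn.functional as F

    def hessian_response(img):
        """Approximate blob-detector response (determinant of the Hessian).

        img: (B, 1, H, W) grayscale tensor in [0, 1].
        """
        # Second-derivative filters via simple finite differences,
        # created on the input's device/dtype.
        dxx = torch.tensor([[[[1., -2., 1.]]]], dtype=img.dtype, device=img.device)
        dyy = dxx.transpose(2, 3)
        dxy = torch.tensor([[[[0.25, 0., -0.25],
                              [0.,   0.,  0.  ],
                              [-0.25, 0., 0.25]]]], dtype=img.dtype, device=img.device)
        Ixx = F.conv2d(img, dxx, padding=(0, 1))
        Iyy = F.conv2d(img, dyy, padding=(1, 0))
        Ixy = F.conv2d(img, dxy, padding=1)
        return Ixx * Iyy - Ixy ** 2

    def feature_matching_loss(generated, target):
        """L1 gap between detector-response maps of generated and target images."""
        return (hessian_response(generated) - hessian_response(target)).abs().mean()

    # Hypothetical usage: fake_night = G(day_image)
    #                     loss = feature_matching_loss(fake_night, night_image)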

    Methods for the automatic alignment of colour histograms

    Colour provides important information in many image processing tasks such as object identification and tracking. Different images of the same object frequently yield different colour values due to undesired variations in lighting and the camera. In practice, controlling the source of these fluctuations is difficult, uneconomical or even impossible in a particular imaging environment. This thesis is concerned with the question of how best to align the corresponding clusters of colour histograms to reduce or remove the effect of these undesired variations. We introduce feature-based histogram alignment (FBHA) algorithms that enable flexible alignment transformations to be applied. The FBHA approach has three steps: 1) feature detection in the colour histograms, 2) feature association and 3) feature alignment. We investigate the choices for these three steps on two colour databases: 1) a structured and labelled database of RGB imagery acquired under controlled camera, lighting and object variation, and 2) grey-level video streams from an industrial inspection application. The design and acquisition of the RGB image and grey-level video databases are a key contribution of the thesis. The databases are used to quantitatively compare the FBHA approach against existing methodologies and show it to be effective. FBHA is intended to provide a generic method for aligning colour histograms; it only uses information from the histograms and therefore ignores spatial information in the image. Spatial information and other context-sensitive cues are deliberately avoided to maintain the generic nature of the algorithm. By ignoring some of this important information, we gain useful insights into the performance limits of a colour alignment algorithm that works from the colour histogram alone, which helps in understanding the limits of a generic approach to colour alignment.
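    As a rough sketch of the three FBHA steps on grey-level histograms (assuming NumPy/SciPy; the thesis evaluates several concrete choices for each step, so peak detection, nearest-neighbour association and piecewise-linear warping below are only one illustrative instantiation):

    import numpy as np
    from scipy.signal import find_peaks

    def align_histograms(source_img, reference_img, bins=256):
        """Map the grey levels of source_img towards reference_img (8-bit images assumed)."""
        # Step 1: feature detection -- peaks of each smoothed histogram.
        h_src, _ = np.histogram(source_img, bins=bins, range=(0, bins))
        h_ref, _ = np.histogram(reference_img, bins=bins, range=(0, bins))
        smooth = np.ones(5) / 5.0
        p_src, _ = find_peaks(np.convolve(h_src, smooth, mode="same"),
                              prominence=source_img.size * 1e-3)
        p_ref, _ = find_peaks(np.convolve(h_ref, smooth, mode="same"),
                              prominence=reference_img.size * 1e-3)
        if len(p_src) == 0 or len(p_ref) == 0:
            return source_img  # nothing to align against

        # Step 2: feature association -- pair each source peak with its
        # nearest reference peak (a deliberately simple rule).
        pairs = [(s, p_ref[np.argmin(np.abs(p_ref - s))]) for s in p_src]

        # Step 3: feature alignment -- piecewise-linear intensity map through
        # the associated peaks, anchored at the ends of the range.
        xs = [0] + [int(s) for s, _ in pairs] + [bins - 1]
        ys = [0] + [int(r) for _, r in pairs] + [bins - 1]
        lut = np.interp(np.arange(bins), xs, ys)
        return lut[source_img.astype(np.intp)]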

    FML: Face Model Learning from Videos

    Monocular image-based 3D reconstruction of faces is a long-standing problem in computer vision. Since image data is a 2D projection of a 3D face, the resulting depth ambiguity makes the problem ill-posed. Most existing methods rely on data-driven priors that are built from limited 3D face scans. In contrast, we propose multi-frame video-based self-supervised training of a deep network that (i) learns a face identity model both in shape and appearance while (ii) jointly learning to reconstruct 3D faces. Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model. In order to achieve this, we propose a novel multi-frame consistency loss that ensures consistent shape and appearance across multiple frames of a subject's face, thus minimizing depth ambiguity. At test time we can use an arbitrary number of frames, so that we can perform both monocular as well as multi-frame reconstruction.
    Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ, Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19
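    A rough sketch of the multi-frame consistency idea, assuming PyTorch: the per-frame identity predictions for one video clip should agree, so their spread around the clip mean is penalised. The names and weighting are illustrative, not the paper's formulation, which couples shape and appearance through a differentiable face model and rendering losses.

    import torch

    def multi_frame_consistency(identity_codes):
        """identity_codes: (F, D) identity parameters predicted for F frames of one clip."""
        clip_mean = identity_codes.mean(dim=0, keepdim=True)  # shared identity estimate
        return ((identity_codes - clip_mean) ** 2).mean()

    # Hypothetical usage inside a training step, with an assumed `encoder`
    # network and photometric reconstruction term:
    # codes = torch.stack([encoder(f)["identity"] for f in clip_frames])
    # loss = photometric_loss + 0.1 * multi_frame_consistency(codes)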