Comparator Networks
The objective of this work is set-based verification, e.g. to decide if two
sets of images of a face are of the same person or not. The traditional
approach to this problem is to learn to generate a feature vector per image,
aggregate them into one vector to represent the set, and then compute the
cosine similarity between sets. Instead, we design a neural network
architecture that can directly learn set-wise verification. Our contributions
are: (i) We propose a Deep Comparator Network (DCN) that can ingest a pair of
sets (each may contain a variable number of images) as inputs, and compute a
similarity between the pair--this involves attending to multiple discriminative
local regions (landmarks), and comparing local descriptors between pairs of
faces; (ii) To encourage high-quality representations for each set, internal
competition is introduced for recalibration based on the landmark score; (iii)
Inspired by image retrieval, a novel hard sample mining regime is proposed to
control the sampling process, such that the DCN is complementary to the
standard image classification models. Evaluations on the IARPA Janus face
recognition benchmarks show that the comparator networks outperform the
previous state-of-the-art results by a large margin. Comment: To appear in ECCV 201
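The traditional baseline that this abstract contrasts against (one feature vector per image, aggregation into a single set vector, cosine similarity between sets) can be sketched as follows. This is a minimal illustration and not the DCN itself. The abstract does not say how the per-image vectors are aggregated, so mean pooling of L2-normalized features is an assumption here, and the function names `l2_normalize`, `aggregate_set`, and `set_similarity` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def aggregate_set(features: Sequence[Vector]) -> List[float]:
    """Collapse per-image feature vectors into one unit-length set descriptor.

    Assumption: aggregation is a plain mean of L2-normalized features;
    the abstract only says the vectors are aggregated into one vector.
    """
    if not features:
        raise ValueError("a set must contain at least one feature vector")
    normalized = [l2_normalize(f) for f in features]
    dim = len(normalized[0])
    if any(len(f) != dim for f in normalized):
        raise ValueError("all feature vectors must have the same dimension")
    mean = [sum(f[i] for f in normalized) / len(normalized) for i in range(dim)]
    # Raises ValueError if the features cancel out exactly.
    return l2_normalize(mean)


def set_similarity(set_a: Sequence[Vector], set_b: Sequence[Vector]) -> float:
    """Cosine similarity between the aggregated descriptors of two sets."""
    a = aggregate_set(set_a)
    b = aggregate_set(set_b)
    # Both descriptors are unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))
```

The two sets may hold different numbers of vectors, since each is reduced to a single descriptor before comparison. A verification decision would then compare the returned score against a threshold chosen on held-out data.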
A comparative evaluation of interest point detectors and local descriptors for visual SLAM
In this paper we compare the behavior of different interest point detectors and descriptors under the
conditions needed to be used as landmarks in vision-based simultaneous localization and mapping (SLAM).
We evaluate the repeatability of the detectors, as well as the invariance and distinctiveness of the descriptors,
under different perceptual conditions using sequences of images representing planar objects as well as 3D scenes.
We believe that this information will be useful when selecting an appropriat
A Deep Pyramid Deformable Part Model for Face Detection
We present a face detection algorithm based on Deformable Part Models and
deep pyramidal features. The proposed method, called DP2MFD, is able to detect
faces of various sizes and poses in unconstrained conditions. It reduces the
gap in training and testing of DPM on deep features by adding a normalization
layer to the deep convolutional neural network (CNN). Extensive experiments on
four publicly available unconstrained face detection datasets show that our
method is able to capture the meaningful structure of faces and performs
significantly better than many competitive face detection algorithms.
Unsupervised learning of object landmarks by factorized spatial embeddings
Automatically learning the structure of object categories remains an
important open problem in computer vision. In this paper, we propose a novel
unsupervised approach that can discover and learn landmarks in object
categories, thus characterizing their structure. Our approach is based on
factorizing image deformations, as induced by a viewpoint change or an object
deformation, by learning a deep neural network that detects landmarks
consistently with such visual effects. Furthermore, we show that the learned
landmarks establish meaningful correspondences between different object
instances in a category without having to impose this requirement explicitly.
We assess the method qualitatively on a variety of object types, natural and
man-made. We also show that our unsupervised landmarks are highly predictive of
manually-annotated landmarks in face benchmark datasets, and can be used to
regress these with a high degree of accuracy. Comment: To be published in ICCV 201
Local descriptors for visual SLAM
We present a comparison of several local image descriptors in the context of visual
Simultaneous Localization and Mapping (SLAM). In visual SLAM, a set of points in the
environment is extracted from images and used as landmarks. The points are represented
by local descriptors used to resolve the association between landmarks. In this paper, we
study the class separability of several descriptors under changes in viewpoint and scale.
Several experiments were carried out using sequences of images in 2D and 3D scenes
- …