11,161 research outputs found
Leveraging Deep Visual Descriptors for Hierarchical Efficient Localization
Many robotics applications require precise pose estimates despite operating
in large and changing environments. This can be addressed by visual
localization, using a pre-computed 3D model of the surroundings. The pose
estimation then amounts to finding correspondences between 2D keypoints in a
query image and 3D points in the model using local descriptors. However,
computational power is often limited on robotic platforms, making this task
challenging in large-scale environments. Binary feature descriptors
significantly speed up this 2D-3D matching, and have become popular in the
robotics community, but also strongly impair the robustness to perceptual
aliasing and changes in viewpoint, illumination and scene structure. In this
work, we propose to leverage recent advances in deep learning to perform an
efficient hierarchical localization. We first localize at the map level using
learned image-wide global descriptors, and subsequently estimate a precise pose
from 2D-3D matches computed in the candidate places only. This restricts the
local search and thus allows to efficiently exploit powerful non-binary
descriptors usually dismissed on resource-constrained devices. Our approach
results in state-of-the-art localization performance while running in real-time
on a popular mobile platform, enabling new prospects for robotics research.Comment: CoRL 2018 Camera-ready (fix typos and update citations
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Synthesizing realistic images from human drawn sketches is a challenging
problem in computer graphics and vision. Existing approaches either need exact
edge maps, or rely on retrieval of existing photographs. In this work, we
propose a novel Generative Adversarial Network (GAN) approach that synthesizes
plausible images from 50 categories including motorcycles, horses and couches.
We demonstrate a data augmentation technique for sketches which is fully
automatic, and we show that the augmented data is helpful to our task. We
introduce a new network building block suitable for both the generator and
discriminator which improves the information flow by injecting the input image
at multiple scales. Compared to state-of-the-art image translation methods, our
approach generates more realistic images and achieves significantly higher
Inception Scores.Comment: Accepted to CVPR 201
KinshipGAN: Synthesizing of Kinship Faces From Family Photos by Regularizing a Deep Face Network
In this paper, we propose a kinship generator network that can synthesize a
possible child face by analyzing his/her parent's photo. For this purpose, we
focus on to handle the scarcity of kinship datasets throughout the paper by
proposing novel solutions in particular. To extract robust features, we
integrate a pre-trained face model to the kinship face generator. Moreover, the
generator network is regularized with an additional face dataset and
adversarial loss to decrease the overfitting of the limited samples. Lastly, we
adapt cycle-domain transformation to attain a more stable results. Experiments
are conducted on Families in the Wild (FIW) dataset. The experimental results
show that the contributions presented in the paper provide important
performance improvements compared to the baseline architecture and our proposed
method yields promising perceptual results.Comment: Accepted to IEEE ICIP 201
Unsupervised Object Discovery and Localization in the Wild: Part-based Matching with Bottom-up Region Proposals
This paper addresses unsupervised discovery and localization of dominant
objects from a noisy image collection with multiple object classes. The setting
of this problem is fully unsupervised, without even image-level annotations or
any assumption of a single dominant class. This is far more general than
typical colocalization, cosegmentation, or weakly-supervised localization
tasks. We tackle the discovery and localization problem using a part-based
region matching approach: We use off-the-shelf region proposals to form a set
of candidate bounding boxes for objects and object parts. These regions are
efficiently matched across images using a probabilistic Hough transform that
evaluates the confidence for each candidate correspondence considering both
appearance and spatial consistency. Dominant objects are discovered and
localized by comparing the scores of candidate regions and selecting those that
stand out over other regions containing them. Extensive experimental
evaluations on standard benchmarks demonstrate that the proposed approach
significantly outperforms the current state of the art in colocalization, and
achieves robust object discovery in challenging mixed-class datasets.Comment: CVPR 201
Synthetic-Neuroscore: Using A Neuro-AI Interface for Evaluating Generative Adversarial Networks
Generative adversarial networks (GANs) are increasingly attracting attention
in the computer vision, natural language processing, speech synthesis and
similar domains. Arguably the most striking results have been in the area of
image synthesis. However, evaluating the performance of GANs is still an open
and challenging problem. Existing evaluation metrics primarily measure the
dissimilarity between real and generated images using automated statistical
methods. They often require large sample sizes for evaluation and do not
directly reflect human perception of image quality. In this work, we describe
an evaluation metric we call Neuroscore, for evaluating the performance of
GANs, that more directly reflects psychoperceptual image quality through the
utilization of brain signals. Our results show that Neuroscore has superior
performance to the current evaluation metrics in that: (1) It is more
consistent with human judgment; (2) The evaluation process needs much smaller
numbers of samples; and (3) It is able to rank the quality of images on a per
GAN basis. A convolutional neural network (CNN) based neuro-AI interface is
proposed to predict Neuroscore from GAN-generated images directly without the
need for neural responses. Importantly, we show that including neural responses
during the training phase of the network can significantly improve the
prediction capability of the proposed model. Materials related to this work are
provided at https://github.com/villawang/Neuro-AI-Interface
- …