7,105 research outputs found
Learning to Navigate the Energy Landscape
In this paper, we present a novel and efficient architecture for addressing
computer vision problems that use `Analysis by Synthesis'. Analysis by
synthesis involves the minimization of the reconstruction error which is
typically a non-convex function of the latent target variables.
State-of-the-art methods adopt a hybrid scheme where discriminatively trained
predictors like Random Forests or Convolutional Neural Networks are used to
initialize local search algorithms. While these methods have been shown to
produce promising results, they often get stuck in local optima. Our method
goes beyond the conventional hybrid architecture by not only proposing multiple
accurate initial solutions but by also defining a navigational structure over
the solution space that can be used for extremely efficient gradient-free local
search. We demonstrate the efficacy of our approach on the challenging problem
of RGB Camera Relocalization. To make the RGB camera relocalization problem
particularly challenging, we introduce a new dataset of 3D environments which
are significantly larger than those found in other publicly-available datasets.
Our experiments reveal that the proposed method is able to achieve
state-of-the-art camera relocalization results. We also demonstrate the
generalizability of our approach on Hand Pose Estimation and Image Retrieval
tasks
Perceptually Motivated Shape Context Which Uses Shape Interiors
In this paper, we identify some of the limitations of current-day shape
matching techniques. We provide examples of how contour-based shape matching
techniques cannot provide a good match for certain visually similar shapes. To
overcome this limitation, we propose a perceptually motivated variant of the
well-known shape context descriptor. We identify that the interior properties
of the shape play an important role in object recognition and develop a
descriptor that captures these interior properties. We show that our method can
easily be augmented with any other shape matching algorithm. We also show from
our experiments that the use of our descriptor can significantly improve the
retrieval rates
End-to-End Photo-Sketch Generation via Fully Convolutional Representation Learning
Sketch-based face recognition is an interesting task in vision and multimedia
research, yet it is quite challenging due to the great difference between face
photos and sketches. In this paper, we propose a novel approach for
photo-sketch generation, aiming to automatically transform face photos into
detail-preserving personal sketches. Unlike the traditional models synthesizing
sketches based on a dictionary of exemplars, we develop a fully convolutional
network to learn the end-to-end photo-sketch mapping. Our approach takes whole
face photos as inputs and directly generates the corresponding sketch images
with efficient inference and learning, in which the architecture are stacked by
only convolutional kernels of very small sizes. To well capture the person
identity during the photo-sketch transformation, we define our optimization
objective in the form of joint generative-discriminative minimization. In
particular, a discriminative regularization term is incorporated into the
photo-sketch generation, enhancing the discriminability of the generated person
sketches against other individuals. Extensive experiments on several standard
benchmarks suggest that our approach outperforms other state-of-the-art methods
in both photo-sketch generation and face sketch verification.Comment: 8 pages, 6 figures. Proceeding in ACM International Conference on
Multimedia Retrieval (ICMR), 201
Regularity scalable image coding based on wavelet singularity detection
In this paper, we propose an adaptive algorithm for scalable wavelet image coding, which is based on the general feature, the regularity, of images. In pattern recognition or computer vision, regularity of images is estimated from the oriented wavelet coefficients and quantified by the Lipschitz exponents. To estimate the Lipschitz exponents, evaluating the interscale evolution of the wavelet transform modulus sum (WTMS) over the directional cone of influence was proven to be a better approach than tracing the wavelet transform modulus maxima (WTMM). This is because the irregular sampling nature of the WTMM complicates the reconstruction process. Moreover, examples were found to show that the WTMM representation cannot uniquely characterize a signal. It implies that the reconstruction of signal from its WTMM may not be consistently stable. Furthermore, the WTMM approach requires much more computational effort. Therefore, we use the WTMS approach to estimate the regularity of images from the separable wavelet transformed coefficients. Since we do not concern about the localization issue, we allow the decimation to occur when we evaluate the interscale evolution. After the regularity is estimated, this information is utilized in our proposed adaptive regularity scalable wavelet image coding algorithm. This algorithm can be simply embedded into any wavelet image coders, so it is compatible with the existing scalable coding techniques, such as the resolution scalable and signal-to-noise ratio (SNR) scalable coding techniques, without changing the bitstream format, but provides more scalable levels with higher peak signal-to-noise ratios (PSNRs) and lower bit rates. In comparison to the other feature-based wavelet scalable coding algorithms, the proposed algorithm outperforms them in terms of visual perception, computational complexity and coding efficienc
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significant for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, for the goal to advance the state-of-the-art in HRRS image
retrieval, we focus on the feature extraction issue and delve how to use
powerful deep representations to address this task. We conduct systematic
investigation on evaluating correlative factors that may affect the performance
of deep features. By optimizing each factor, we acquire remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomenon in detail and draw conclusions according to our
analysis. Our work can serve as a guiding role for the research of
content-based RS image retrieval
- ā¦