7,105 research outputs found

    Learning to Navigate the Energy Landscape

    In this paper, we present a novel and efficient architecture for addressing computer vision problems that use 'Analysis by Synthesis'. Analysis by synthesis involves the minimization of the reconstruction error, which is typically a non-convex function of the latent target variables. State-of-the-art methods adopt a hybrid scheme where discriminatively trained predictors like Random Forests or Convolutional Neural Networks are used to initialize local search algorithms. While these methods have been shown to produce promising results, they often get stuck in local optima. Our method goes beyond the conventional hybrid architecture not only by proposing multiple accurate initial solutions but also by defining a navigational structure over the solution space that can be used for extremely efficient gradient-free local search. We demonstrate the efficacy of our approach on the challenging problem of RGB Camera Relocalization. To make the RGB camera relocalization problem particularly challenging, we introduce a new dataset of 3D environments which are significantly larger than those found in other publicly available datasets. Our experiments reveal that the proposed method is able to achieve state-of-the-art camera relocalization results. We also demonstrate the generalizability of our approach on Hand Pose Estimation and Image Retrieval tasks.

    Perceptually Motivated Shape Context Which Uses Shape Interiors

    In this paper, we identify some of the limitations of current shape matching techniques. We provide examples of how contour-based shape matching techniques cannot provide a good match for certain visually similar shapes. To overcome this limitation, we propose a perceptually motivated variant of the well-known shape context descriptor. We observe that the interior properties of a shape play an important role in object recognition, and we develop a descriptor that captures these interior properties. We show that our method can easily be combined with any other shape matching algorithm. Our experiments also show that the use of our descriptor can significantly improve retrieval rates.

    End-to-End Photo-Sketch Generation via Fully Convolutional Representation Learning

    Sketch-based face recognition is an interesting task in vision and multimedia research, yet it is quite challenging due to the great difference between face photos and sketches. In this paper, we propose a novel approach for photo-sketch generation, aiming to automatically transform face photos into detail-preserving personal sketches. Unlike traditional models that synthesize sketches based on a dictionary of exemplars, we develop a fully convolutional network to learn the end-to-end photo-sketch mapping. Our approach takes whole face photos as inputs and directly generates the corresponding sketch images with efficient inference and learning; the architecture stacks only convolutional kernels of very small sizes. To capture the person's identity well during the photo-sketch transformation, we define our optimization objective in the form of joint generative-discriminative minimization. In particular, a discriminative regularization term is incorporated into the photo-sketch generation, enhancing the discriminability of the generated person sketches against other individuals. Extensive experiments on several standard benchmarks suggest that our approach outperforms other state-of-the-art methods in both photo-sketch generation and face sketch verification. Comment: 8 pages, 6 figures. Proceeding in ACM International Conference on Multimedia Retrieval (ICMR), 201

    Regularity scalable image coding based on wavelet singularity detection

    In this paper, we propose an adaptive algorithm for scalable wavelet image coding based on a general feature of images, their regularity. In pattern recognition and computer vision, the regularity of images is estimated from the oriented wavelet coefficients and quantified by the Lipschitz exponents. To estimate the Lipschitz exponents, evaluating the interscale evolution of the wavelet transform modulus sum (WTMS) over the directional cone of influence was proven to be a better approach than tracing the wavelet transform modulus maxima (WTMM). This is because the irregular sampling nature of the WTMM complicates the reconstruction process. Moreover, examples were found showing that the WTMM representation cannot uniquely characterize a signal. This implies that the reconstruction of a signal from its WTMM may not be consistently stable. Furthermore, the WTMM approach requires much more computational effort. Therefore, we use the WTMS approach to estimate the regularity of images from the separable wavelet transform coefficients. Since we are not concerned with the localization issue, we allow decimation to occur when we evaluate the interscale evolution. After the regularity is estimated, this information is used in our proposed adaptive regularity scalable wavelet image coding algorithm. This algorithm can simply be embedded into any wavelet image coder, so it is compatible with existing scalable coding techniques, such as resolution scalable and signal-to-noise ratio (SNR) scalable coding, without changing the bitstream format, while providing more scalable levels with higher peak signal-to-noise ratios (PSNRs) and lower bit rates. The proposed algorithm outperforms other feature-based wavelet scalable coding algorithms in terms of visual perception, computational complexity and coding efficiency.
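
    As a rough illustration of what "interscale evolution" means here, the sketch below fits a straight line to the base-2 logarithm of modulus sums taken at successive dyadic scales and reads the slope as a regularity estimate. This is not the paper's algorithm: the function name, the input format, and the assumption that the wavelet normalization makes the sums grow like 2^(alpha * j) at scale index j are all hypothetical choices made for the example.

```python
import math


def estimate_lipschitz_exponent(modulus_sums):
    """Return the least-squares slope of log2(modulus sum) against scale index.

    modulus_sums[k] is assumed (hypothetically) to be the wavelet transform
    modulus sum inside the cone of influence at dyadic scale 2**(k + 1).
    Under the assumed normalization the sums scale like 2**(alpha * j),
    so the fitted slope is an estimate of alpha.
    """
    if len(modulus_sums) < 2:
        raise ValueError("need modulus sums from at least two scales")
    if any(value <= 0 for value in modulus_sums):
        raise ValueError("modulus sums must be strictly positive")

    scale_indices = list(range(1, len(modulus_sums) + 1))
    log_sums = [math.log2(value) for value in modulus_sums]

    mean_index = sum(scale_indices) / len(scale_indices)
    mean_log = sum(log_sums) / len(log_sums)

    # Ordinary least-squares slope of log_sums versus scale_indices.
    numerator = sum(
        (j - mean_index) * (value - mean_log)
        for j, value in zip(scale_indices, log_sums)
    )
    denominator = sum((j - mean_index) ** 2 for j in scale_indices)
    return numerator / denominator


# Synthetic check: sums that grow exactly like 2**(0.5 * j).
synthetic_sums = [3.0 * 2 ** (0.5 * j) for j in range(1, 5)]
alpha = estimate_lipschitz_exponent(synthetic_sums)
```

    A larger slope would indicate a smoother region and a smaller or negative slope a sharper singularity; how such estimates drive the coder is not specified in the abstract and is not modelled here.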

    Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation

    Remote sensing (RS) image retrieval is of great significance for geological information mining. Over the past two decades, a large amount of research on this task has been carried out, focusing mainly on three core issues: feature extraction, similarity metric and relevance feedback. Due to the complexity and diversity of ground objects in high-resolution remote sensing (HRRS) images, there is still room for improvement in current retrieval approaches. In this paper, we analyze the three core issues of RS image retrieval and provide a comprehensive review of existing methods. Furthermore, to advance the state of the art in HRRS image retrieval, we focus on the feature extraction issue and delve into how to use powerful deep representations to address this task. We conduct a systematic investigation of the factors that may affect the performance of deep features. By optimizing each factor, we obtain remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental phenomena in detail and draw conclusions from our analysis. Our work can serve as a guide for research on content-based RS image retrieval.