12,071 research outputs found

    See the Difference: Direct Pre-Image Reconstruction and Pose Estimation by Differentiating HOG

    Full text link
    The Histogram of Oriented Gradient (HOG) descriptor has led to many advances in computer vision over the last decade and is still part of many state of the art approaches. We realize that the associated feature computation is piecewise differentiable and therefore many pipelines which build on HOG can be made differentiable. This lends to advanced introspection as well as opportunities for end-to-end optimization. We present our implementation of \nablaHOG based on the auto-differentiation toolbox Chumpy and show applications to pre-image visualization and pose estimation which extends the existing differentiable renderer OpenDR pipeline. Both applications improve on the respective state-of-the-art HOG approaches

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table

    Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

    Full text link
    This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are three-folds. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism of combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving image inverse problem is formulated using JSM under regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split-Bregman based algorithm is developed to efficiently solve the above severely underdetermined inverse problem associated with theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring and mixed Gaussian plus salt-and-pepper noise removal applications verify the effectiveness of the proposed algorithm.Comment: 14 pages, 18 figures, 7 Tables, to be published in IEEE Transactions on Circuits System and Video Technology (TCSVT). High resolution pdf version and Code can be found at: http://idm.pku.edu.cn/staff/zhangjian/IRJSM

    From 3D Point Clouds to Pose-Normalised Depth Maps

    Get PDF
    We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in a relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, (iv) generation of either a pose aligned or a pose normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface and this is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data)

    Variational Downscaling, Fusion and Assimilation of Hydrometeorological States via Regularized Estimation

    Full text link
    Improved estimation of hydrometeorological states from down-sampled observations and background model forecasts in a noisy environment, has been a subject of growing research in the past decades. Here, we introduce a unified framework that ties together the problems of downscaling, data fusion and data assimilation as ill-posed inverse problems. This framework seeks solutions beyond the classic least squares estimation paradigms by imposing proper regularization, which are constraints consistent with the degree of smoothness and probabilistic structure of the underlying state. We review relevant regularization methods in derivative space and extend classic formulations of the aforementioned problems with particular emphasis on hydrologic and atmospheric applications. Informed by the statistical characteristics of the state variable of interest, the central results of the paper suggest that proper regularization can lead to a more accurate and stable recovery of the true state and hence more skillful forecasts. In particular, using the Tikhonov and Huber regularization in the derivative space, the promise of the proposed framework is demonstrated in static downscaling and fusion of synthetic multi-sensor precipitation data, while a data assimilation numerical experiment is presented using the heat equation in a variational setting
    corecore