316 research outputs found

    Implicit Filter Sparsification In Convolutional Neural Networks

    We show that implicit filter-level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in the literature. The emergence and subsequent pruning of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity on par with or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-feature penalization which could also be employed as heuristics for filter pruning.

    Intrinsic Dynamic Shape Prior for Fast, Sequential and Dense Non-Rigid Structure from Motion with Detection of Temporally-Disjoint Rigidity

    While dense non-rigid structure from motion (NRSfM) has been extensively studied from the perspective of the reconstructability problem in recent years, almost no attempts have been undertaken to bring it into the practical realm. The reasons for the slow dissemination are the severe ill-posedness, high sensitivity to motion and deformation cues, and the difficulty of obtaining reliable point tracks in the vast majority of practical scenarios. To fill this gap, we propose a hybrid approach that extracts prior shape knowledge from an input sequence with NRSfM and uses it as a dynamic shape prior for sequential surface recovery in scenarios with recurrence. Our Dynamic Shape Prior Reconstruction (DSPR) method can be combined with existing dense NRSfM techniques, while its energy functional is optimised with stochastic gradient descent at real-time rates for new incoming point tracks. The proposed versatile framework with a new core NRSfM approach outperforms several other methods in the ability to handle inaccurate and noisy point tracks, provided we have access to a representative (in terms of the deformation variety) image sequence. Comprehensive experiments highlight convergence properties and the accuracy of DSPR under different disturbing effects. We also perform a joint study of tracking and reconstruction and show applications to shape compression and heart reconstruction under occlusions. We achieve state-of-the-art metrics (accuracy and compression ratios) in different scenarios.

    Live User-guided Intrinsic Video For Static Scenes

    We present a novel real-time approach for user-guided intrinsic decomposition of static scenes captured by an RGB-D sensor. In the first step, we acquire a three-dimensional representation of the scene using a dense volumetric reconstruction framework. The obtained reconstruction serves as a proxy to densely fuse reflectance estimates and to store user-provided constraints in three-dimensional space. User constraints, in the form of constant shading and reflectance strokes, can be placed directly on the real-world geometry using an intuitive touch-based interaction metaphor, or using interactive mouse strokes. Fusing the decomposition results and constraints in three-dimensional space allows for robust propagation of this information to novel views by re-projection. We leverage this information to improve on the decomposition quality of existing intrinsic video decomposition techniques by further constraining the ill-posed decomposition problem. In addition to improved decomposition quality, we show a variety of live augmented reality applications such as recoloring of objects, relighting of scenes and editing of material appearance.

    Tex2Shape: Detailed Full Human Body Geometry From a Single Image

    We present a simple yet effective method to infer detailed full human body shape from only a single photograph. Our model can infer full-body shape including face, hair, and clothing including wrinkles at interactive frame-rates. Results feature details even on parts that are occluded in the input image. Our main idea is to turn shape regression into an aligned image-to-image translation problem. The input to our method is a partial texture map of the visible region obtained from off-the-shelf methods. From a partial texture, we estimate detailed normal and vector displacement maps, which can be applied to a low-resolution smooth body model to add detail and clothing. Despite being trained purely with synthetic data, our model generalizes well to real-world photographs. Numerous results demonstrate the versatility and robustness of our method

    A Quantum Computational Approach to Correspondence Problems on Point Sets

    Modern adiabatic quantum computers (AQC) are already used to solve difficult combinatorial optimisation problems in various domains of science. Currently, only a few applications of AQC in computer vision have been demonstrated. We review modern AQC and derive the first algorithm for transformation estimation and point set alignment suitable for AQC. Our algorithm has a subquadratic computational complexity of state preparation. We perform a systematic experimental analysis of the proposed approach and show several examples of successful point set alignment by simulated sampling. With this paper, we hope to boost the research on AQC for computer vision

    In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations

    Convolutional Neural Network based approaches for monocular 3D human pose estimation usually require a large amount of training images with 3D pose annotations. While it is feasible to provide 2D joint annotations for large corpora of in-the-wild images with humans, providing accurate 3D annotations to such in-the-wild corpora is hardly feasible in practice. Most existing 3D labelled data sets are either synthetically created or feature in-studio images. 3D pose estimation algorithms trained on such data often have limited ability to generalize to real world scene diversity. We therefore propose a new deep learning based method for monocular 3D human pose estimation that shows high accuracy and generalizes better to in-the-wild scenes. It has a network architecture that comprises a new disentangled hidden space encoding of explicit 2D and 3D features, and uses supervision by a new learned projection model from predicted 3D pose. Our algorithm can be jointly trained on image data with 3D labels and image data with only 2D labels. It achieves state-of-the-art accuracy on challenging in-the-wild data

    Learning to Reconstruct People in Clothing from a Single RGB Camera

    We present a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, in less than 10 seconds with a reconstruction accuracy of 5mm. Our model learns to predict the parameters of a statistical body model and instance displacements that add clothing and hair to the shape. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose space, the network learns to encode the images of the person into pose-invariant latent codes, where the information is fused. Second, based on the observation that feed-forward predictions are fast but do not always align with the input images, we predict using both bottom-up and top-down streams (one per view), allowing information to flow in both directions. Learning relies only on synthetic 3D data. Once learned, the model can take a variable number of frames as input, and is able to reconstruct shapes even from a single image with an accuracy of 6mm. Results on 3 different datasets demonstrate the efficacy and accuracy of our approach.

    MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment

    We present a convex mixed-integer programming formulation for non-rigid shape matching. To this end, we propose a novel shape deformation model based on an efficient low-dimensional discrete model, so that finding a globally optimal solution is tractable in (most) practical cases. Our approach combines several favourable properties: it is independent of the initialisation, it is much more efficient to solve to global optimality compared to analogous quadratic assignment problem formulations, and it is highly flexible in terms of the variants of matching problems it can handle. Experimentally we demonstrate that our approach outperforms existing methods for sparse shape matching, that it can be used for initialising dense shape matching methods, and we showcase its flexibility on several examples

    Fast Simultaneous Gravitational Alignment of Multiple Point Sets

