5,399 research outputs found

    Semantic 3D Reconstruction with Finite Element Bases

    Full text link
    We propose a novel framework for the discretisation of multi-label problems on arbitrary, continuous domains. Our work bridges the gap between general FEM discretisations, and labeling problems that arise in a variety of computer vision tasks, including for instance those derived from the generalised Potts model. Starting from the popular formulation of labeling as a convex relaxation by functional lifting, we show that FEM discretisation is valid for the most general case, where the regulariser is anisotropic and non-metric. While our findings are generic and applicable to different vision problems, we demonstrate their practical implementation in the context of semantic 3D reconstruction, where such regularisers have proved particularly beneficial. The proposed FEM approach leads to a smaller memory footprint as well as faster computation, and it constitutes a very simple way to enable variable, adaptive resolution within the same model

    Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval

    Full text link
    In this paper, we propose a novel deep generative approach to cross-modal retrieval to learn hash functions in the absence of paired training samples through the cycle consistency loss. Our proposed approach employs adversarial training scheme to lean a couple of hash functions enabling translation between modalities while assuming the underlying semantic relationship. To induce the hash codes with semantics to the input-output pair, cycle consistency loss is further proposed upon the adversarial training to strengthen the correlations between inputs and corresponding outputs. Our approach is generative to learn hash functions such that the learned hash codes can maximally correlate each input-output correspondence, meanwhile can also regenerate the inputs so as to minimize the information loss. The learning to hash embedding is thus performed to jointly optimize the parameters of the hash functions across modalities as well as the associated generative models. Extensive experiments on a variety of large-scale cross-modal data sets demonstrate that our proposed method achieves better retrieval results than the state-of-the-arts.Comment: To appeared on IEEE Trans. Image Processing. arXiv admin note: text overlap with arXiv:1703.10593 by other author

    Joint Image Reconstruction and Segmentation Using the Potts Model

    Full text link
    We propose a new algorithmic approach to the non-smooth and non-convex Potts problem (also called piecewise-constant Mumford-Shah problem) for inverse imaging problems. We derive a suitable splitting into specific subproblems that can all be solved efficiently. Our method does not require a priori knowledge on the gray levels nor on the number of segments of the reconstruction. Further, it avoids anisotropic artifacts such as geometric staircasing. We demonstrate the suitability of our method for joint image reconstruction and segmentation. We focus on Radon data, where we in particular consider limited data situations. For instance, our method is able to recover all segments of the Shepp-Logan phantom from 77 angular views only. We illustrate the practical applicability on a real PET dataset. As further applications, we consider spherical Radon data as well as blurred data

    Adversarial Semantic Scene Completion from a Single Depth Image

    Full text link
    We propose a method to reconstruct, complete and semantically label a 3D scene from a single input depth image. We improve the accuracy of the regressed semantic 3D maps by a novel architecture based on adversarial learning. In particular, we suggest using multiple adversarial loss terms that not only enforce realistic outputs with respect to the ground truth, but also an effective embedding of the internal features. This is done by correlating the latent features of the encoder working on partial 2.5D data with the latent features extracted from a variational 3D auto-encoder trained to reconstruct the complete semantic scene. In addition, differently from other approaches that operate entirely through 3D convolutions, at test time we retain the original 2.5D structure of the input during downsampling to improve the effectiveness of the internal representation of our model. We test our approach on the main benchmark datasets for semantic scene completion to qualitatively and quantitatively assess the effectiveness of our proposal.Comment: 2018 International Conference on 3D Vision (3DV

    First order algorithms in variational image processing

    Get PDF
    Variational methods in imaging are nowadays developing towards a quite universal and flexible tool, allowing for highly successful approaches on tasks like denoising, deblurring, inpainting, segmentation, super-resolution, disparity, and optical flow estimation. The overall structure of such approaches is of the form D(Ku)+αR(u)minu{\cal D}(Ku) + \alpha {\cal R} (u) \rightarrow \min_u ; where the functional D{\cal D} is a data fidelity term also depending on some input data ff and measuring the deviation of KuKu from such and R{\cal R} is a regularization functional. Moreover KK is a (often linear) forward operator modeling the dependence of data on an underlying image, and α\alpha is a positive regularization parameter. While D{\cal D} is often smooth and (strictly) convex, the current practice almost exclusively uses nonsmooth regularization functionals. The majority of successful techniques is using nonsmooth and convex functionals like the total variation and generalizations thereof or 1\ell_1-norms of coefficients arising from scalar products with some frame system. The efficient solution of such variational problems in imaging demands for appropriate algorithms. Taking into account the specific structure as a sum of two very different terms to be minimized, splitting algorithms are a quite canonical choice. Consequently this field has revived the interest in techniques like operator splittings or augmented Lagrangians. Here we shall provide an overview of methods currently developed and recent results as well as some computational studies providing a comparison of different methods and also illustrating their success in applications.Comment: 60 pages, 33 figure
    corecore