5,399 research outputs found
Semantic 3D Reconstruction with Finite Element Bases
We propose a novel framework for the discretisation of multi-label problems
on arbitrary, continuous domains. Our work bridges the gap between general FEM
discretisations, and labeling problems that arise in a variety of computer
vision tasks, including for instance those derived from the generalised Potts
model. Starting from the popular formulation of labeling as a convex relaxation
by functional lifting, we show that FEM discretisation is valid for the most
general case, where the regulariser is anisotropic and non-metric. While our
findings are generic and applicable to different vision problems, we
demonstrate their practical implementation in the context of semantic 3D
reconstruction, where such regularisers have proved particularly beneficial.
The proposed FEM approach leads to a smaller memory footprint as well as faster
computation, and it constitutes a very simple way to enable variable, adaptive
resolution within the same model.
Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
In this paper, we propose a novel deep generative approach to cross-modal
retrieval to learn hash functions in the absence of paired training samples
through the cycle consistency loss. Our proposed approach employs an adversarial
training scheme to learn a couple of hash functions enabling translation between
modalities while assuming the underlying semantic relationship. To endow the
hash codes with the semantics of the input-output pair, a cycle consistency loss is
further imposed on top of the adversarial training to strengthen the correlations
between inputs and corresponding outputs. Our approach is generative: it learns
hash functions such that the learned hash codes maximally correlate each
input-output correspondence while also regenerating the inputs so as to
minimize the information loss. The learning-to-hash embedding is thus performed
to jointly optimize the parameters of the hash functions across modalities as
well as the associated generative models. Extensive experiments on a variety of
large-scale cross-modal data sets demonstrate that our proposed method achieves
better retrieval results than the state of the art.
Comment: To appear in IEEE Trans. Image Processing. arXiv admin note: text
overlap with arXiv:1703.10593 by other authors
Joint Image Reconstruction and Segmentation Using the Potts Model
We propose a new algorithmic approach to the non-smooth and non-convex Potts
problem (also called piecewise-constant Mumford-Shah problem) for inverse
imaging problems. We derive a suitable splitting into specific subproblems that
can all be solved efficiently. Our method does not require a priori knowledge
on the gray levels nor on the number of segments of the reconstruction.
Further, it avoids anisotropic artifacts such as geometric staircasing. We
demonstrate the suitability of our method for joint image reconstruction and
segmentation. We focus on Radon data, where we in particular consider limited
data situations. For instance, our method is able to recover all segments of
the Shepp-Logan phantom from a limited number of angular views only. We illustrate the
practical applicability on a real PET dataset. As further applications, we
consider spherical Radon data as well as blurred data.
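To make the Potts energy concrete, here is a minimal sketch for the simplest setting: a 1D signal observed directly, with no Radon or blur operator. It minimises gamma * (number of jumps) + sum of squared deviations exactly by dynamic programming. This is not the splitting method of the abstract; the function name `potts_1d` and the parameter `gamma` are illustrative assumptions.

```python
def potts_1d(f, gamma):
    """Exact minimiser of the 1D Potts energy
        gamma * (number of jumps of u) + sum_i (u_i - f_i)^2
    via dynamic programming over segment boundaries (O(n^2) time).
    Illustrative sketch only: direct data, no forward operator."""
    n = len(f)
    # Prefix sums of f and f^2 give the squared error of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(f):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def seg_err(l, r):
        # Squared deviation of f[l:r] from its own mean (half-open interval).
        total = s1[r] - s1[l]
        return (s2[r] - s2[l]) - total * total / (r - l)

    # best[r] = optimal energy for the prefix f[:r]; best[0] = -gamma so that
    # the first segment does not pay a jump penalty.
    best = [float("inf")] * (n + 1)
    best[0] = -gamma
    last = [0] * (n + 1)  # start index of the final segment of the optimum for f[:r]
    for r in range(1, n + 1):
        for l in range(r):
            cost = best[l] + gamma + seg_err(l, r)
            if cost < best[r]:
                best[r] = cost
                last[r] = l

    # Backtrack: fill each optimal segment with its mean value.
    u = [0.0] * n
    r = n
    while r > 0:
        l = last[r]
        mean = (s1[r] - s1[l]) / (r - l)
        for i in range(l, r):
            u[i] = mean
        r = l
    return u, best[n]


signal = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
u, energy = potts_1d(signal, gamma=1.0)
print(u, energy)
```

Neither the gray levels nor the number of segments is passed in; both follow from the trade-off set by `gamma`. A larger `gamma` yields fewer segments.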
Adversarial Semantic Scene Completion from a Single Depth Image
We propose a method to reconstruct, complete and semantically label a 3D
scene from a single input depth image. We improve the accuracy of the regressed
semantic 3D maps by a novel architecture based on adversarial learning. In
particular, we suggest using multiple adversarial loss terms that not only
enforce realistic outputs with respect to the ground truth, but also an
effective embedding of the internal features. This is done by correlating the
latent features of the encoder working on partial 2.5D data with the latent
features extracted from a variational 3D auto-encoder trained to reconstruct
the complete semantic scene. In addition, differently from other approaches
that operate entirely through 3D convolutions, at test time we retain the
original 2.5D structure of the input during downsampling to improve the
effectiveness of the internal representation of our model. We test our approach
on the main benchmark datasets for semantic scene completion to qualitatively
and quantitatively assess the effectiveness of our proposal.
Comment: 2018 International Conference on 3D Vision (3DV)
First order algorithms in variational image processing
Variational methods in imaging are nowadays developing towards a quite
universal and flexible tool, allowing for highly successful approaches on tasks
like denoising, deblurring, inpainting, segmentation, super-resolution,
disparity, and optical flow estimation. The overall structure of such
approaches is of the form $\mathcal{D}(Ku) + \alpha \mathcal{R}(u) \rightarrow \min_u$, where the functional $\mathcal{D}$ is a data fidelity term also
depending on some input data $f$ and measuring the deviation of $Ku$ from such
and $\mathcal{R}$ is a regularization functional. Moreover $K$ is a (often linear)
forward operator modeling the dependence of data on an underlying image, and
$\alpha$ is a positive regularization parameter. While $\mathcal{D}$ is often
smooth and (strictly) convex, the current practice almost exclusively uses
nonsmooth regularization functionals. The majority of successful techniques is
using nonsmooth and convex functionals like the total variation and
generalizations thereof or $\ell_1$-norms of coefficients arising from scalar
products with some frame system. The efficient solution of such variational
problems in imaging demands appropriate algorithms. Taking into account the
specific structure as a sum of two very different terms to be minimized,
splitting algorithms are a quite canonical choice. Consequently this field has
revived the interest in techniques like operator splittings or augmented
Lagrangians. Here we shall provide an overview of methods currently developed
and recent results as well as some computational studies providing a comparison
of different methods and also illustrating their success in applications.
Comment: 60 pages, 33 figures
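As a hedged illustration of the splitting idea, the sketch below applies forward-backward splitting (proximal gradient, also known as ISTA) to one assumed instance of a "smooth data term plus nonsmooth regulariser" problem: minimise 0.5 * ||K u - f||^2 + alpha * ||u||_1 over u. The symbols `K`, `f`, `alpha`, the step size `tau`, and all function names are assumptions made for this example and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal map of t * ||.||_1: shrink every entry towards zero by t."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def forward_backward(K, f, alpha, tau, iters=200):
    """Forward-backward splitting for 0.5*||K u - f||^2 + alpha*||u||_1.
    Each iteration takes a gradient step on the smooth data term (forward)
    and then a proximal step on the nonsmooth regulariser (backward).
    tau should not exceed 1 / ||K||^2 for convergence."""
    Kt = transpose(K)
    u = [0.0] * len(K[0])
    for _ in range(iters):
        residual = [a - b for a, b in zip(matvec(K, u), f)]
        grad = matvec(Kt, residual)  # gradient of the data term: K^T (K u - f)
        u = soft_threshold([ui - tau * g for ui, g in zip(u, grad)], tau * alpha)
    return u


K = [[2.0, 0.0], [0.0, 1.0]]  # toy diagonal forward operator, ||K||^2 = 4
f = [4.0, 1.0]
print(forward_backward(K, f, alpha=1.0, tau=0.25))
```

The two terms are handled by different operations, which is what makes splitting natural here: the smooth term only needs its gradient, while the nonsmooth term only needs its proximal map, which for the l1 norm is the closed-form soft threshold.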
Deep learning for cardiac image segmentation: A review
Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US) and major anatomical structures of interest (ventricles, atria and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories is included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations with current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.