Hierarchical Implicit Models and Likelihood-Free Variational Inference
Implicit probabilistic models are a flexible class of models defined by a
simulation process for data. They form the basis for theories which encompass
our understanding of the physical world. Despite this fundamental nature, the
use of implicit models remains limited due to challenges in specifying complex
latent structure in them, and in performing inferences in such models with
large data sets. In this paper, we first introduce hierarchical implicit models
(HIMs). HIMs combine the idea of implicit densities with hierarchical Bayesian
modeling, thereby defining models via simulators of data with rich hidden
structure. Next, we develop likelihood-free variational inference (LFVI), a
scalable variational inference algorithm for HIMs. Key to LFVI is specifying a
variational family that is also implicit. This matches the model's flexibility
and allows for accurate approximation of the posterior. We demonstrate diverse
applications: a large-scale physical simulator for predator-prey populations in
ecology; a Bayesian generative adversarial network for discrete data; and a
deep implicit model for text generation.
Comment: Appears in Neural Information Processing Systems, 2017
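The intractable piece in models like these is the density of simulator outputs. A common workaround, and the one at the heart of LFVI, is density ratio estimation: a classifier trained to separate samples from two implicit distributions recovers their log density ratio in its logit. Below is a minimal, self-contained sketch of that subroutine on toy 1-D Gaussians (where the true ratio is known for checking); all names are illustrative and this is not the authors' code.

    import torch
    import torch.nn as nn

    # Two toy distributions standing in for implicit densities we can only
    # sample from. Here the analytic log ratio is known, so we can check it.
    p = torch.distributions.Normal(0.0, 1.0)
    q = torch.distributions.Normal(1.0, 1.5)

    clf = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(clf.parameters(), lr=1e-2)
    bce = nn.BCEWithLogitsLoss()

    for step in range(3000):
        xp, xq = p.sample((256, 1)), q.sample((256, 1))
        logits = clf(torch.cat([xp, xq]))
        labels = torch.cat([torch.ones(256, 1), torch.zeros(256, 1)])
        loss = bce(logits, labels)                  # logistic regression: p vs q
        opt.zero_grad(); loss.backward(); opt.step()

    # At the optimum, clf(x) ~ log p(x) - log q(x). LFVI plugs such estimates
    # into the ELBO wherever an intractable implicit density would appear.
    x = torch.linspace(-2.0, 2.0, 5).unsqueeze(1)
    print(clf(x).squeeze())                           # learned log ratio
    print((p.log_prob(x) - q.log_prob(x)).squeeze())  # analytic log ratio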
Learning Generative Models with Sinkhorn Divergences
The ability to compare two degenerate probability distributions (i.e. two
probability distributions supported on two distinct low-dimensional manifolds
living in a much higher-dimensional space) is a crucial problem arising in the
estimation of generative models for high-dimensional observations such as those
arising in computer vision or natural language. It is known that optimal
transport (OT) metrics can remedy this problem, since they were
specifically designed as an alternative to information divergences to handle
such problematic scenarios. Unfortunately, training generative machines using
OT raises formidable computational and statistical challenges, because of (i)
the computational burden of evaluating OT losses, (ii) the instability and lack
of smoothness of these losses, and (iii) the difficulty of robustly estimating
these losses and their gradients in high dimension. This paper presents the first
tractable computational method to train large scale generative models using an
optimal transport loss, and tackles these three issues by relying on two key
ideas: (a) entropic smoothing, which turns the original OT loss into one that
can be computed using Sinkhorn fixed point iterations; (b) algorithmic
(automatic) differentiation of these iterations. These two approximations
result in a robust and differentiable approximation of the OT loss with
streamlined GPU execution. Entropic smoothing generates a family of losses
interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD),
making it possible to find a sweet spot that leverages both the geometry of OT
and the favorable high-dimensional sample complexity of MMD, which comes with
unbiased gradient estimates. The resulting computational architecture nicely
complements standard deep generative network models with a stack of extra
layers implementing the loss function.
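As a rough illustration of ideas (a) and (b), the sketch below computes an entropy-regularized OT loss with log-domain Sinkhorn iterations in PyTorch and backpropagates through them. The epsilon, iteration count, squared-Euclidean cost, and uniform sample weights are illustrative choices, and the debiasing terms of the full Sinkhorn divergence are omitted for brevity.

    import math
    import torch

    def sinkhorn_ot_loss(x, y, eps=0.1, n_iter=100):
        # Entropy-regularized OT between empirical measures on samples x, y.
        C = torch.cdist(x, y) ** 2                  # squared-Euclidean cost
        n, m = C.shape
        log_a = torch.full((n,), -math.log(n))      # uniform sample weights
        log_b = torch.full((m,), -math.log(m))
        f, g = torch.zeros(n), torch.zeros(m)
        for _ in range(n_iter):                     # log-domain Sinkhorn updates
            f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_b[None, :], dim=1)
            g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_a[:, None], dim=0)
        # Transport plan implied by the dual potentials; <P, C> is the OT cost.
        P = torch.exp((f[:, None] + g[None, :] - C) / eps
                      + log_a[:, None] + log_b[None, :])
        return (P * C).sum()

    # Autodiff flows through the iterations, so the loss can train a generator.
    x = torch.randn(64, 2, requires_grad=True)      # stand-in for generator output
    y = torch.randn(64, 2) + 3.0                    # stand-in for data samples
    sinkhorn_ot_loss(x, y).backward()               # gradient w.r.t. x is defined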
SurfNet: Generating 3D shape surfaces using deep residual networks
3D shape models are naturally parameterized using vertices and faces, i.e.,
composed of polygons forming a surface. However, current 3D learning paradigms
for predictive and generative tasks using convolutional neural networks focus
on a voxelized representation of the object. Lifting convolution operators from
the traditional 2D to 3D results in high computational overhead with little
additional benefit as most of the geometry information is contained on the
surface boundary. Here we study the problem of directly generating the 3D shape
surface of rigid and non-rigid shapes using deep convolutional neural networks.
We develop a procedure to create consistent "geometry images" representing the
shape surface of a category of 3D objects. We then use this consistent
representation for category-specific shape surface generation from a parametric
representation or an image by developing novel extensions of deep residual
networks for the task of geometry image generation. Our experiments indicate
that our network learns a meaningful representation of shape surfaces allowing
it to interpolate between shape orientations and poses, invent new shape
surfaces and reconstruct 3D shape surfaces from previously unseen images.
Comment: CVPR 2017 paper
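The key trick is that a geometry image stores the (x, y, z) coordinates of surface points in the three channels of a regular 2-D grid, so standard 2-D residual convolutions apply directly. The sketch below shows a generic residual block acting on such a tensor; it is a stand-in for, not a reproduction of, the paper's architecture.

    import torch
    import torch.nn as nn

    class ResBlock(nn.Module):
        # Generic residual unit: two 3x3 convolutions plus an identity skip.
        def __init__(self, ch=64):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
                nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        def forward(self, x):
            return torch.relu(x + self.body(x))

    # A batch of geometry images: 64x64 grids whose 3 channels hold the x/y/z
    # coordinates of points sampled on the shape surface.
    geom_img = torch.randn(8, 3, 64, 64)
    net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), ResBlock(), ResBlock(),
                        nn.Conv2d(64, 3, 3, padding=1))   # emits a geometry image
    print(net(geom_img).shape)                            # torch.Size([8, 3, 64, 64])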
Learning Disentangled Representations with Latent Variation Predictability
Latent traversal is a popular approach to visualizing disentangled latent
representations. When a single unit of the latent representation is varied, a
well-disentangled model is expected to change a single factor of variation of
the data while keeping the others fixed. However, this impressive
experimental observation is rarely explicitly encoded in the objective function
of learning disentangled representations. This paper defines the variation
predictability of latent disentangled representations. Given image pairs
generated from latent codes that vary in a single dimension, the varied
dimension should be predictable from the image pairs if the representation is
well disentangled. Within an adversarial generation process, we encourage
variation predictability by maximizing the mutual information between latent
variations and corresponding image pairs. We further develop an evaluation
metric that does not rely on the ground-truth generative factors to measure the
disentanglement of latent representations. The proposed variation
predictability is a general constraint that is applicable to the VAE and GAN
frameworks for boosting disentanglement of latent representations. Experiments
show that the proposed variation predictability correlates well with existing
ground-truth-required metrics and the proposed algorithm is effective for
disentanglement learning.
Comment: 14 pages, ECCV 2020
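A minimal sketch of the variation-predictability idea as described: generate an image pair from latent codes that differ in exactly one dimension, and train a predictor to recover which dimension changed; minimizing the cross-entropy jointly with the generator maximizes a lower bound on the mutual information. The toy generator and predictor below are illustrative stand-ins, not the paper's networks.

    import torch
    import torch.nn as nn

    d = 10                                           # latent dimensionality
    G = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, 784))      # toy generator
    P = nn.Sequential(nn.Linear(2 * 784, 128), nn.ReLU(), nn.Linear(128, d))  # dim predictor
    opt = torch.optim.Adam(list(G.parameters()) + list(P.parameters()), lr=1e-3)
    ce = nn.CrossEntropyLoss()

    for step in range(100):
        z1 = torch.randn(32, d)
        k = torch.randint(0, d, (32,))               # which latent unit to perturb
        z2 = z1.clone()
        z2[torch.arange(32), k] = torch.randn(32)    # vary exactly one dimension
        pair = torch.cat([G(z1), G(z2)], dim=1)      # the generated image pair
        loss = ce(P(pair), k)                        # low loss <=> predictable variation
        opt.zero_grad(); loss.backward(); opt.step()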
Data-Optimized Coronal Field Model: I. Proof of Concept
Deriving the strength and direction of the three-dimensional (3D) magnetic
field in the solar atmosphere is fundamental for understanding its dynamics.
Volume information on the magnetic field mostly relies on coupling 3D
reconstruction methods with photospheric and/or chromospheric surface vector
magnetic fields. Infrared coronal polarimetry could provide additional
information to better constrain magnetic field reconstructions. However,
combining such data with reconstruction methods is challenging, e.g., because
of the optical thinness of the solar corona and the scarcity and limitations of
stereoscopic polarimetry. To address these issues, we introduce the
Data-Optimized Coronal Field Model (DOCFM) framework, a model-data fitting
approach that combines a parametrized 3D generative model, e.g., a magnetic
field extrapolation or a magnetohydrodynamic model, with forward modeling of
coronal data. We test it with a parametrized flux rope insertion method and
infrared coronal polarimetry where synthetic observations are created from a
known "ground truth" physical state. We show that this framework allows us to
accurately retrieve the ground truth 3D magnetic field of a set of force-free
field solutions from the flux rope insertion method. In observational studies,
the DOCFM will provide a means to force the solutions derived with different
reconstruction methods to satisfy additional, common, coronal constraints. The
DOCFM framework therefore opens new perspectives for the exploitation of
coronal polarimetry in magnetic field reconstructions and for developing new
techniques to more reliably infer the 3D magnetic fields that trigger solar
flares and coronal mass ejections.
Comment: 14 pages, 6 figures; Accepted for publication in ApJ
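In outline, DOCFM is a model-data fitting loop: adjust the parameters of a 3D generative model until forward-modeled synthetic observables match the data. The sketch below shows that loop on a deliberately toy one-dimensional problem; the "model" and "forward operator" are placeholders, not the flux rope insertion method or polarimetric forward modeling used in the paper.

    import numpy as np
    from scipy.optimize import minimize

    def generative_model(params, grid):
        # Toy parametrized "field": amplitude and width of a smooth profile.
        amp, width = params
        return amp * np.exp(-(grid / width) ** 2)

    def forward_model(field):
        # Toy integral operator standing in for optically thin emission
        # accumulated along the line of sight.
        return np.cumsum(field)

    grid = np.linspace(-1.0, 1.0, 200)
    truth = forward_model(generative_model([2.0, 0.3], grid))   # known ground truth
    obs = truth + 0.01 * np.random.default_rng(0).normal(size=truth.size)

    def misfit(params):
        synthetic = forward_model(generative_model(params, grid))
        return np.sum((synthetic - obs) ** 2)       # data-model misfit to minimize

    fit = minimize(misfit, x0=[1.0, 0.5], method="Nelder-Mead")
    print(fit.x)                                    # recovers roughly (2.0, 0.3)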