End-to-end Sinkhorn Autoencoder with Noise Generator
In this work, we propose a novel end-to-end Sinkhorn autoencoder with a noise
generator for efficient data collection simulation. Simulating processes that
aim at collecting experimental data is crucial for multiple real-life
applications, including nuclear medicine, astronomy and high energy physics.
Contemporary methods, such as Monte Carlo algorithms, provide high-fidelity
results at the price of high computational cost. Multiple attempts have been
made to reduce this burden, e.g. using generative approaches based on
Generative Adversarial Networks or Variational Autoencoders. Although such
methods are much faster, they are often unstable in training and do not allow
sampling from the entire data distribution. To address these shortcomings, we
introduce a novel method, dubbed end-to-end Sinkhorn Autoencoder, that
leverages the Sinkhorn algorithm to explicitly align the distributions of
encoded real data examples and generated noise. More precisely, we extend the
autoencoder architecture by adding a deterministic neural network trained to
map noise from a known distribution onto the autoencoder latent space
representing the data distribution. We optimise the entire model jointly. Our
method outperforms competing approaches on a challenging dataset of simulation
data from the Zero Degree Calorimeters of the ALICE experiment at the LHC, as
well as on standard benchmarks such as MNIST and CelebA.
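The alignment idea above can be sketched with toy linear stand-ins. This is a minimal numpy illustration, not the authors' implementation: the matrices `W_enc`, `W_dec`, `W_gen` and the 4-d data / 2-d latent sizes are hypothetical, and deep networks are replaced by linear maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the three jointly trained networks.
W_enc = rng.normal(size=(4, 2)) * 0.1   # encoder: data (4-d) -> latent (2-d)
W_dec = rng.normal(size=(2, 4)) * 0.1   # decoder: latent -> data
W_gen = rng.normal(size=(2, 2)) * 0.1   # noise generator: noise -> latent

def sinkhorn_cost(X, Y, eps=0.5, iters=100):
    """Entropic OT cost between two equally weighted point clouds."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                 # column scaling
        u = a / (K @ v)                   # row scaling
    P = u[:, None] * K * v[None, :]       # transport plan
    return float((P * C).sum())

x = rng.normal(size=(16, 4))              # batch of "real" data
z = x @ W_enc                             # encoded latents
noise = rng.normal(size=(16, 2))          # samples from a known prior
z_fake = noise @ W_gen                    # noise mapped into latent space

recon = float(((x - z @ W_dec) ** 2).mean())  # reconstruction loss
align = sinkhorn_cost(z, z_fake)              # Sinkhorn alignment of latent clouds
loss = recon + align                          # single objective, optimised jointly
```

In the paper the whole objective is minimised end-to-end; here the point is only how the reconstruction term and the Sinkhorn alignment term combine into one loss.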
Wasserstein Variational Inference
This paper introduces Wasserstein variational inference, a new form of
approximate Bayesian inference based on optimal transport theory. Wasserstein
variational inference uses a new family of divergences that includes both
f-divergences and the Wasserstein distance as special cases. The gradients of
the Wasserstein variational loss are obtained by backpropagating through the
Sinkhorn iterations. This technique results in a very stable likelihood-free
training method that can be used with implicit distributions and probabilistic
programs. Using the Wasserstein variational inference framework, we introduce
several new forms of autoencoders and test their robustness and performance
against existing variational autoencoding techniques.
Comment: 8 pages, 1 figure
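As a rough illustration of why backpropagating through Sinkhorn iterations is possible, note that the loop is an ordinary finite computation: an autodiff framework can unroll and differentiate it. The numpy sketch below (an assumed toy setup, not the paper's code) probes the same smoothness with a finite difference in one sample coordinate.

```python
import numpy as np

def sinkhorn_loss(X, Y, eps=0.5, iters=150):
    """Entropic OT loss between two equally weighted sample clouds."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return float((P * C).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))           # "variational" samples
Y = rng.normal(size=(8, 2)) + 1.0     # "data" samples

# Finite-difference probe of the gradient w.r.t. one sample coordinate;
# an autodiff framework would obtain this by unrolling the loop above.
h = 1e-5
Xp = X.copy()
Xp[0, 0] += h
grad_fd = (sinkhorn_loss(Xp, Y) - sinkhorn_loss(X, Y)) / h
```

The loss varies smoothly with the sample positions, which is what makes the likelihood-free training described above stable.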
Learning Generative Models with Sinkhorn Divergences
The ability to compare two degenerate probability distributions (i.e. two
probability distributions supported on two distinct low-dimensional manifolds
living in a much higher-dimensional space) is a crucial problem arising in the
estimation of generative models for high-dimensional observations such as those
arising in computer vision or natural language. It is known that optimal
transport metrics can represent a cure for this problem, since they were
specifically designed as an alternative to information divergences to handle
such problematic scenarios. Unfortunately, training generative machines using
OT raises formidable computational and statistical challenges, because of (i)
the computational burden of evaluating OT losses, (ii) the instability and lack
of smoothness of these losses, (iii) the difficulty of robustly estimating
these losses and their gradients in high dimension. This paper presents the first
tractable computational method to train large scale generative models using an
optimal transport loss, and tackles these three issues by relying on two key
ideas: (a) entropic smoothing, which turns the original OT loss into one that
can be computed using Sinkhorn fixed point iterations; (b) algorithmic
(automatic) differentiation of these iterations. These two approximations
result in a robust and differentiable approximation of the OT loss with
streamlined GPU execution. Entropic smoothing generates a family of losses
interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus
allowing one to find a sweet spot that leverages the geometry of OT and the
favorable high-dimensional sample complexity of MMD, which comes with unbiased
gradient estimates. The resulting computational architecture nicely complements
standard deep network generative models with a stack of extra layers
implementing the loss function.
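The entropic-smoothing recipe described above can be sketched as plain Sinkhorn fixed-point iterations. This is a numpy sketch under uniform-weight assumptions, not the paper's GPU implementation; the point-cloud sizes are made up.

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.5, n_iters=200):
    """Sinkhorn fixed-point iterations for the entropy-regularised OT problem.

    a, b : marginal histograms (non-negative, summing to 1)
    C    : cost matrix of shape (len(a), len(b))
    eps  : entropic regularisation strength
    Each iteration is a simple matrix-vector scaling, so the whole loop can
    be unrolled and differentiated automatically, as the paper proposes.
    """
    K = np.exp(-C / eps)                  # Gibbs kernel from entropic smoothing
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                 # match the column marginal b
        u = a / (K @ v)                   # match the row marginal a
    P = u[:, None] * K * v[None, :]       # approximate transport plan
    return P, float((P * C).sum())        # plan and regularised OT loss

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Y = rng.normal(size=(7, 3)) + 1.0
C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
a = np.full(5, 1 / 5)
b = np.full(7, 1 / 7)
P, loss = sinkhorn_plan(a, b, C)
```

Small `eps` pushes the loss toward the Wasserstein end of the interpolation; large `eps` pushes it toward the MMD end.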
Investigating and Improving Latent Density Segmentation Models for Aleatoric Uncertainty Quantification in Medical Imaging
Data uncertainties, such as sensor noise or occlusions, can introduce
irreducible ambiguities in images, which result in varying, yet plausible,
semantic hypotheses. In Machine Learning, this ambiguity is commonly referred
to as aleatoric uncertainty. Latent density models can be utilized to address
this problem in image segmentation. The most popular approach is the
Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize
the conditional data log-likelihood Evidence Lower Bound. In this work, we
demonstrate that the PU-Net latent space is severely inhomogeneous. As a
result, the effectiveness of gradient descent is inhibited and the model
becomes extremely sensitive to the localization of the latent space samples,
resulting in defective predictions. To address this, we present the Sinkhorn
PU-Net (SPU-Net), which uses the Sinkhorn Divergence to promote homogeneity
across all latent dimensions, effectively improving gradient-descent updates
and model robustness. Our results on public datasets spanning various clinical
segmentation problems show that the SPU-Net achieves up to 11% performance
gains over preceding latent variable models for probabilistic segmentation on
the Hungarian-Matched metric. The results indicate that by encouraging a
homogeneous latent space, one can significantly improve latent density modeling
for medical image segmentation.
Comment: 12 pages incl. references, 11 figures
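A debiased Sinkhorn divergence of the kind such a latent regulariser could build on can be sketched as follows. This is a hypothetical numpy illustration; the latent shapes and anisotropic scales are invented for the example and are not taken from the paper.

```python
import numpy as np

def ot_eps(X, Y, eps=0.5, iters=200):
    """Entropic OT cost between two uniform point clouds."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return float((P * C).sum())

def sinkhorn_divergence(X, Y, eps=0.5):
    """Debiased Sinkhorn divergence: zero when the two clouds coincide."""
    return ot_eps(X, Y, eps) - 0.5 * ot_eps(X, X, eps) - 0.5 * ot_eps(Y, Y, eps)

rng = np.random.default_rng(2)
# Inhomogeneous latent cloud: two dimensions are squashed relative to the rest.
latents = rng.normal(size=(32, 4)) * np.array([1.0, 0.1, 1.0, 0.1])
prior = rng.normal(size=(32, 4))           # isotropic target distribution
penalty = sinkhorn_divergence(latents, prior)
```

Minimising such a penalty pulls the squashed latent dimensions toward the isotropic target, which is the homogeneity effect the abstract describes.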
Learning Combinatorial Embedding Networks for Deep Graph Matching
Graph matching refers to finding node correspondence between graphs, such
that the affinity between corresponding nodes and edges is maximized. Beyond
its NP-complete nature, another important challenge is effectively modeling
the node-wise and structure-wise affinity across graphs, and the resulting
objective, so as to guide the matching procedure toward the true matching in
the presence of noise. To this end, this paper devises an end-to-end
differentiable deep network pipeline to learn the affinity for graph matching.
It involves a supervised permutation loss on node correspondence to capture the
combinatorial nature of graph matching. Meanwhile, deep graph embedding models
are adopted to parameterize both intra-graph and cross-graph affinity
functions, instead of traditional shallow and simple parametric forms, e.g. a
Gaussian kernel. The embedding can also effectively capture the
higher-order structure beyond second-order edges. The permutation loss model is
agnostic to the number of nodes, and the embedding model is shared among nodes
such that the network allows for varying numbers of nodes in graphs for
training and inference. Moreover, our network is class-agnostic with some
generalization capability across different categories. All these features are
desirable for real-world applications. Experiments show its superiority over
state-of-the-art graph matching learning methods.
Comment: ICCV2019 oral. Code available at
https://github.com/Thinklab-SJTU/PCA-G
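In learned graph-matching pipelines of this kind, a soft node assignment is commonly obtained by Sinkhorn-normalising an affinity score matrix into a doubly stochastic matrix and supervising it with a permutation (cross-entropy) loss. The sketch below uses hypothetical sizes and is not the released code.

```python
import numpy as np

def sinkhorn_normalize(scores, n_iters=100):
    """Alternate row/column normalisation of a positive score matrix,
    converging to a doubly stochastic (soft assignment) matrix."""
    S = np.exp(scores)                     # exponentiate to ensure positivity
    for _ in range(n_iters):
        S = S / S.sum(axis=1, keepdims=True)
        S = S / S.sum(axis=0, keepdims=True)
    return S

def permutation_loss(S, P_gt):
    """Cross-entropy between the soft assignment and a ground-truth
    permutation matrix; agnostic to the number of nodes."""
    return float(-(P_gt * np.log(S + 1e-12)).sum() / len(S))

rng = np.random.default_rng(3)
scores = rng.normal(size=(4, 4))           # hypothetical node-affinity scores
S = sinkhorn_normalize(scores)
P_gt = np.eye(4)                           # hypothetical ground-truth matching
loss = permutation_loss(S, P_gt)
```

Both steps are differentiable, so the loss can train the affinity scores end-to-end, matching the differentiable pipeline the abstract describes.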