Perturbation theory approach to study the latent space degeneracy of Variational Autoencoders
The use of Variational Autoencoders in different Machine Learning tasks has
increased drastically in recent years. They have been developed as denoising,
clustering and generative tools, highlighting a large potential in a wide range
of fields. Their embeddings are able to extract relevant information from
high-dimensional inputs, but the converged models can differ significantly
and lead to degeneracy in the latent space. We leverage the relation between
theoretical physics and machine learning to explain this behaviour, and
introduce a new approach to correct for this degeneracy using perturbation
theory. Re-formulating the embedding as a multi-dimensional generative
distribution allows mapping to a new set of functions and their corresponding
energy spectrum. We optimise for a perturbed Hamiltonian, with an additional
energy potential that is related to the unobserved topology of the data. Our
results show the potential of a new theoretical approach that can be used to
interpret the latent space and generative nature of unsupervised learning,
while the energy landscapes defined by the perturbations can be further used
for modelling and dynamical purposes.
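
For reference, the perturbed-Hamiltonian setup the abstract alludes to is the
standard one from quantum-mechanical perturbation theory; the notation below is
ours, not taken from the paper. A minimal sketch, where H_0 is the unperturbed
Hamiltonian with known eigenstates \psi_n^{(0)} and energies E_n^{(0)}, V is
the additional energy potential, and \lambda is a small coupling:

    H = H_0 + \lambda V, \qquad
    E_n \approx E_n^{(0)} + \lambda \langle \psi_n^{(0)} | V | \psi_n^{(0)} \rangle + O(\lambda^2).

In the authors' framing, V would be the term encoding the unobserved topology
of the data, with the corrected energy spectrum obtained perturbatively from
the unperturbed solution.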
Max-Affine Spline Insights into Deep Generative Networks
We connect a large class of Deep Generative Networks (DGNs) with spline
operators in order to derive their properties, limitations, and new
opportunities. By characterizing the latent space partition, dimension and
angularity of the generated manifold, we relate the manifold dimension and
approximation error to the sample size. The manifold-per-region affine subspace
defines a local coordinate basis; we provide necessary and sufficient
conditions relating those basis vectors with disentanglement. We also derive
the output probability density mapped onto the generated manifold in terms of
the latent space density, which enables the computation of key statistics such
as its Shannon entropy. This finding also enables the computation of the DGN
likelihood, which provides a new mechanism for model comparison as well as a
quality measure for (generated) samples under the learned distribution. We
demonstrate how low-entropy and/or multimodal distributions are not naturally
modeled by DGNs and are a cause of training instabilities.
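
For context, the mapped density the abstract refers to follows, for an
injective generator g, from the change-of-variables formula; this is a general
sketch in our own notation, not necessarily the authors' exact statement:

    p_x(g(z)) = p_z(z) \, \det\!\left( J_g(z)^{\top} J_g(z) \right)^{-1/2},

where J_g is the Jacobian of g. For a spline operator, g is affine on each
region \omega of the latent space partition, so the volume factor reduces to a
per-region constant \det(A_\omega^{\top} A_\omega)^{-1/2}, from which
per-region likelihoods and the Shannon entropy can be assembled.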