Robust and efficient inference and learning algorithms for generative models
Generative modelling is a popular paradigm in machine learning owing to its natural
ability to describe uncertainty in data and models, and for its applications, including data
compression (Ho et al., 2020), missing data imputation (Valera et al., 2018), synthetic
data generation (Lin et al., 2020), representation learning (Kingma and Welling, 2014),
robust classification (Li et al., 2019b), and more. For generative models, the task of
finding the distribution of unobserved variables conditioned on observed ones is referred
to as inference. Finding the optimal model that makes the model distribution close to the
data distribution according to some discrepancy measure is called learning. In practice,
existing learning and inference methods can fall short on robustness and efficiency. A
method that is more robust to its hyper-parameters or different types of data can be
more easily adapted to various real-world applications. A method's efficiency with
respect to the size and dimensionality of the data determines the scale at which the method
can be applied. This thesis presents four pieces of my original work that improve these
properties in generative models.
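To make these two tasks concrete, here is a minimal sketch in a toy linear-Gaussian model (purely illustrative; not one of the models studied in the thesis): inference computes the posterior over the latent variable in closed form by conjugacy, while learning fits the observation-noise variance to data via maximum likelihood on the marginal.

```python
import numpy as np

# Toy linear-Gaussian generative model (illustrative only):
#   z ~ N(0, 1),   x | z ~ N(z, sigma2)

def posterior(x, sigma2):
    """Inference: exact posterior p(z | x) = N(mean, var) by Gaussian conjugacy."""
    var = sigma2 / (1.0 + sigma2)   # 1 / (1/prior_var + 1/sigma2)
    mean = x / (1.0 + sigma2)
    return mean, var

def learn_sigma2(xs):
    """Learning: maximum-likelihood sigma2 from the marginal x ~ N(0, 1 + sigma2)."""
    return max(float(np.mean(np.asarray(xs) ** 2)) - 1.0, 0.0)

# Simulate data from the model and recover the noise variance.
rng = np.random.default_rng(0)
z = rng.normal(size=20_000)
x = z + rng.normal(size=20_000) * np.sqrt(0.5)   # true sigma2 = 0.5
sigma2_hat = learn_sigma2(x)
```

In richer models neither task has a closed form, which is exactly where the approximate inference and learning algorithms discussed below come in.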
First, I introduce two novel Bayesian inference algorithms. One is called coupled
multinomial Hamiltonian Monte Carlo (Xu et al., 2021a); it builds on Heng and Jacob
(2019), a recent method in unbiased Markov chain Monte Carlo (MCMC) (Jacob
et al., 2019b) that has been found to be sensitive to hyper-parameters and less efficient
than standard, biased MCMC. These issues are resolved by establishing couplings
for the widely used multinomial Hamiltonian Monte Carlo, leading to a statistically
more efficient and robust method. The other method is called roulette-based variational
expectation (RAVE; Xu et al., 2019), which applies amortised inference to the family of
Bayesian non-parametric models, in which the number of parameters is allowed
to grow unbounded as the data gets more complex. Unlike previous sampling-based
methods that are slow or variational inference methods that rely on truncation, RAVE
combines the advantages of both to achieve flexible inference that is also computationally
efficient. Second, I introduce two novel learning methods. One is called generative
ratio-matching (Srivastava et al., 2019), a learning algorithm that makes deep
generative models based on kernel methods applicable to high-dimensional data. The
key innovation of this method is learning a projection of the data to a lower-dimensional
space in which the density ratio is preserved, so that learning can be done in the
lower-dimensional space where kernel methods are effective. The other method is called
Bayesian symbolic physics, which combines Bayesian inference and symbolic regression
in the context of naïve physics—the study of how humans understand and learn physics.
Unlike classic generative models for which the structure of the generative process is
predefined, or deep generative models in which the process is represented by data-hungry
neural networks, Bayesian-symbolic generative processes are defined by functions over
a hypothesis space specified by a context-free grammar. This formulation allows these
models to incorporate domain knowledge in learning, which greatly improves
sample efficiency. For all four pieces of work, I provide theoretical analyses and/or
empirical results to validate that the algorithmic advances lead to improvements in
robustness and efficiency for generative models.
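As a minimal illustration of the coupling idea behind unbiased MCMC (a toy random-walk Metropolis-Hastings sketch, not the coupled multinomial HMC method itself), two chains are advanced with maximally coupled proposals and a shared acceptance variable until they meet exactly, and the lag-1, k = 0 estimator of Jacob et al. (2019b) is averaged over independent replicates:

```python
import numpy as np

def mh_step(x, logpi, step, rng):
    """One random-walk Metropolis-Hastings step."""
    xp = rng.normal(x, step)
    return xp if np.log(rng.uniform()) < logpi(xp) - logpi(x) else x

def max_coupling(mu1, mu2, sigma, rng):
    """Maximal coupling of N(mu1, sigma^2) and N(mu2, sigma^2): the pair
    is equal with the highest probability any coupling allows."""
    logq = lambda z, mu: -0.5 * ((z - mu) / sigma) ** 2
    x = rng.normal(mu1, sigma)
    if np.log(rng.uniform()) + logq(x, mu1) <= logq(x, mu2):
        return x, x
    while True:  # rejection-sample from the residual of the second marginal
        y = rng.normal(mu2, sigma)
        if np.log(rng.uniform()) + logq(y, mu2) > logq(y, mu1):
            return x, y

def coupled_mh_step(x, y, logpi, step, rng):
    """Advance two MH chains with maximally coupled proposals and a shared
    acceptance uniform; once the chains meet, they stay together."""
    xp, yp = max_coupling(x, y, step, rng)
    logu = np.log(rng.uniform())
    return (xp if logu < logpi(xp) - logpi(x) else x,
            yp if logu < logpi(yp) - logpi(y) else y)

def unbiased_estimate(h, logpi, step, rng, max_iter=10_000):
    """Unbiased estimate of E_pi[h] from one pair of lag-1 coupled chains:
    H = h(X_0) + sum_{t=1}^{tau-1} (h(X_t) - h(Y_{t-1}))."""
    x, y = rng.normal(), rng.normal()   # X_0, Y_0
    est = h(x)
    x = mh_step(x, logpi, step, rng)    # X_1
    t = 1
    while x != y and t < max_iter:      # until the meeting time tau
        est += h(x) - h(y)              # bias-correction terms
        x, y = coupled_mh_step(x, y, logpi, step, rng)
        t += 1
    return est

# Averaging independent replicates recovers E_pi[h] without asymptotic bias.
rng = np.random.default_rng(1)
logpi = lambda z: -0.5 * z * z          # standard normal target
ests = [unbiased_estimate(lambda z: z, logpi, 2.0, rng) for _ in range(4000)]
```

Each replicate is cheap and independent, so the estimator parallelises trivially; the thesis contribution replaces this toy kernel with a coupling of multinomial HMC for better statistical efficiency and robustness.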
Lastly, I summarise my contributions to free and open-source software on generative
modelling. This includes a set of Julia packages that I contributed and that are currently
used by the Turing probabilistic programming language (Ge et al., 2018). These packages,
which are highly reusable components for building probabilistic programming
languages, together form a probabilistic programming ecosystem in Julia. An important
package primarily developed by me is AdvancedHMC.jl (Xu et al.,
2020), which provides robust and efficient implementations of HMC methods and has
been adopted as the backend of Turing. Importantly, the design of this package offers
an intuitive abstraction for constructing HMC samplers in a way that mirrors their
mathematical definition. The promise of these open-source packages is to make generative
modelling techniques more accessible to domain experts from various backgrounds and
to make relevant research more reproducible, to help advance the field.
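To illustrate the kind of compositional construction alluded to above (a generic Python sketch of HMC, not the actual AdvancedHMC.jl API, which is written in Julia), a sampler can be assembled from two pieces that mirror the mathematics: a leapfrog integrator for the Hamiltonian dynamics, and a Metropolis correction on the change in the Hamiltonian:

```python
import numpy as np

def leapfrog(q, p, grad_logp, eps, n_steps):
    """Leapfrog integrator for Hamiltonian dynamics (identity mass matrix)."""
    p = p + 0.5 * eps * grad_logp(q)        # initial half-step for momentum
    for _ in range(n_steps - 1):
        q = q + eps * p                     # full position step
        p = p + eps * grad_logp(q)          # full momentum step
    q = q + eps * p
    p = p + 0.5 * eps * grad_logp(q)        # final half-step for momentum
    return q, p

def hmc_step(q, logp, grad_logp, eps, n_steps, rng):
    """One static-trajectory HMC transition: sample a momentum, integrate,
    then Metropolis-accept based on the change in the Hamiltonian."""
    p0 = rng.normal(size=np.shape(q))
    q1, p1 = leapfrog(q, p0, grad_logp, eps, n_steps)
    h0 = -logp(q) + 0.5 * np.dot(p0, p0)    # Hamiltonian before
    h1 = -logp(q1) + 0.5 * np.dot(p1, p1)   # Hamiltonian after
    return q1 if np.log(rng.uniform()) < h0 - h1 else q

# Sample from a standard normal target as a sanity check.
rng = np.random.default_rng(0)
logp = lambda q: -0.5 * np.dot(q, q)
grad = lambda q: -q
q = np.zeros(1)
samples = []
for _ in range(5000):
    q = hmc_step(q, logp, grad, eps=0.2, n_steps=10, rng=rng)
    samples.append(q[0])
```

Keeping the integrator, the kinetic energy, and the trajectory/acceptance rule as separable components is what makes it easy to swap in, e.g., a different metric or a multinomial trajectory scheme without rewriting the sampler.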