Scalable Population Synthesis with Deep Generative Modeling
Population synthesis is concerned with the generation of synthetic yet
realistic representations of populations. It is a fundamental problem in the
modeling of transport where the synthetic populations of micro-agents represent
a key input to most agent-based models. In this paper, a new methodological
framework for how to 'grow' pools of micro-agents is presented. The model
framework adopts a deep generative modeling approach from machine learning
based on a Variational Autoencoder (VAE). Compared with previous population
synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs
sampling, and traditional generative models such as Bayesian Networks or Hidden
Markov Models, the proposed method can fit the full joint distribution
in high dimensions. The proposed methodology is compared with a conventional
Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary.
It is shown that, while these two methods outperform the VAE in the
low-dimensional case, they both suffer from scalability issues when the number
of modeled attributes increases. It is also shown that the Gibbs sampler
essentially replicates the agents from the original sample when the required
conditional distributions are estimated as frequency tables. In contrast, the
VAE addresses the problem of sampling zeros by generating agents that
differ from those in the original data yet have similar
statistical properties. The presented approach can support agent-based modeling
at all levels by enabling richer synthetic populations with smaller zones and
more detailed individual characteristics.

Comment: 27 pages, 15 figures, 4 tables
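To make the modeling idea concrete, here is a minimal sketch in PyTorch (not the authors' code) of a VAE over categorical agent attributes: each agent is a concatenation of one-hot attribute vectors, the training loss combines per-attribute cross-entropy with the standard Gaussian KL term, and synthetic agents are drawn by sampling latent codes from the prior and decoding them. The attribute names and sizes below are hypothetical placeholders, not the Danish trip-diary schema.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical schema: three categorical attributes with 5, 8, and 3 levels
# (e.g. age band, occupation, household cars); the paper's attribute set
# is much richer.
ATTR_SIZES = [5, 8, 3]
INPUT_DIM = sum(ATTR_SIZES)   # length of the concatenated one-hot vector
LATENT_DIM = 4

class CategoricalVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(INPUT_DIM, 64), nn.ReLU())
        self.mu = nn.Linear(64, LATENT_DIM)
        self.logvar = nn.Linear(64, LATENT_DIM)
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, INPUT_DIM))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def elbo_loss(logits, x, mu, logvar):
    # Reconstruction: one cross-entropy term per categorical attribute.
    recon, start = 0.0, 0
    for size in ATTR_SIZES:
        target = x[:, start:start + size].argmax(dim=1)
        recon = recon + F.cross_entropy(logits[:, start:start + size],
                                        target, reduction="sum")
        start += size
    # KL divergence between q(z|x) and the standard normal prior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

@torch.no_grad()
def synthesize(model, n):
    # Sample latent codes from the prior, then draw each attribute from
    # the decoder's categorical distribution over its levels.
    z = torch.randn(n, LATENT_DIM)
    logits, start, cols = model.dec(z), 0, []
    for size in ATTR_SIZES:
        probs = F.softmax(logits[:, start:start + size], dim=1)
        cols.append(torch.multinomial(probs, 1))
        start += size
    return torch.cat(cols, dim=1)   # (n, num_attributes) integer-coded agents

Because generation only touches the decoder, producing attribute combinations that never appear in the training data (the sampling zeros the abstract mentions) is exactly as cheap as producing any other latent code.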
Grammar Variational Autoencoder
Deep generative models have been wildly successful at learning coherent
latent representations for continuous data such as video and audio. However,
generative modeling of discrete data such as arithmetic expressions and
molecular structures still poses significant challenges. Crucially,
state-of-the-art methods often produce outputs that are not valid. We make the
key observation that frequently, discrete data can be represented as a parse
tree from a context-free grammar. We propose a variational autoencoder which
encodes and decodes directly to and from these parse trees, ensuring the
generated outputs are always valid. Surprisingly, we show that not only does
our model more often generate valid outputs, it also learns a more coherent
latent space in which nearby points decode to similar discrete outputs. We
demonstrate the effectiveness of our learned models by showing their improved
performance in Bayesian optimization for symbolic regression and molecular
synthesis.
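As a minimal, self-contained illustration of the parse-tree idea (a sketch in plain Python, not the authors' implementation), the snippet below fixes a toy context-free grammar for arithmetic expressions and decodes a sequence of production-rule indices by always expanding the leftmost nonterminal. Restricting each choice to rules whose left-hand side matches that nonterminal, as valid_rules does, is the masking that makes every decoded string syntactically valid; the grammar and rule ordering are illustrative assumptions.

# Toy context-free grammar over a single variable x. Each production is
# (left-hand side, right-hand side); the list index plays the role of the
# one-hot dimension a grammar VAE's decoder would predict.
RULES = [
    ("S", ["S", "+", "T"]),   # rule 0
    ("S", ["T"]),             # rule 1
    ("T", ["(", "S", ")"]),   # rule 2
    ("T", ["x"]),             # rule 3
]
NONTERMINALS = {"S", "T"}

def valid_rules(symbols):
    # Indices of productions applicable to the leftmost nonterminal.
    # This is the mask a grammar VAE applies to the decoder's logits so
    # that only grammatically valid expansions can be sampled.
    for s in symbols:
        if s in NONTERMINALS:
            return [i for i, (lhs, _) in enumerate(RULES) if lhs == s]
    return []  # no nonterminals left: the string is complete

def decode(rule_ids, start="S"):
    # Expand the leftmost nonterminal with each production in turn.
    symbols = [start]
    for rid in rule_ids:
        assert rid in valid_rules(symbols), "rule masked out for this state"
        i = next(k for k, s in enumerate(symbols) if s in NONTERMINALS)
        lhs, rhs = RULES[rid]
        symbols[i:i + 1] = list(rhs)
    return "".join(symbols)

# "x+(x)" as a rule sequence: S->S+T, S->T, T->x, T->(S), S->T, T->x
print(decode([0, 1, 3, 2, 1, 3]))   # prints x+(x)

Encoding runs the same machinery in reverse: parse an expression into its tree, read off the production sequence in depth-first order, and one-hot encode each rule index as the VAE's input.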