Bayesian analysis of multivariate stable distributions using one-dimensional projections
In this paper we take up Bayesian inference in general multivariate stable distributions. We exploit the representation of Matsui and Takemura (2009) for univariate projections, and the representation of the distributions in terms of their spectral measure. We present efficient MCMC schemes to perform the computations when the spectral measure is approximated discretely or, as we propose, by a normal distribution. Appropriate latent variables are introduced to implement MCMC. In relation to the discrete approximation, we propose efficient computational schemes based on the characteristic function.
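As a rough illustration of why one-dimensional projections are convenient here: when the spectral measure is discrete (point masses on the unit sphere), every projection of a multivariate stable vector is univariate stable, with scale and skewness given by weighted sums over the point masses. The sketch below uses this standard result for alpha != 1. The function name `projection_parameters` and its arguments are hypothetical, and this is not the Matsui and Takemura representation or the MCMC scheme of the abstract.

```python
import numpy as np

def projection_parameters(u, alpha, points, weights, shift):
    """Scale, skewness and location of <u, X> for a multivariate stable X
    whose spectral measure puts mass weights[j] on the unit vector points[j].
    Illustrative sketch for alpha != 1 only."""
    if np.isclose(alpha, 1.0):
        raise ValueError("this sketch covers alpha != 1 only")
    u = np.asarray(u, dtype=float)
    dots = np.asarray(points, dtype=float) @ u            # <u, s_j>
    mass = np.asarray(weights, dtype=float) * np.abs(dots) ** alpha
    scale_alpha = mass.sum()                              # sigma(u)^alpha
    if scale_alpha == 0.0:
        raise ValueError("u is orthogonal to every support point")
    skew = float((mass * np.sign(dots)).sum() / scale_alpha)
    scale = float(scale_alpha ** (1.0 / alpha))
    location = float(np.asarray(shift, dtype=float) @ u)
    return scale, skew, location

# Symmetric spectral measure on the four axis directions in 2D.
points = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
weights = np.full(4, 0.25)
scale, skew, location = projection_parameters(
    [1.0, 0.0], 1.5, points, weights, [0.0, 0.0]
)
```

With a symmetric spectral measure the projected skewness is zero, as expected; a measure concentrated on a single direction would give a fully skewed projection along that direction.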
Uncertainty-Aware Principal Component Analysis
We present a technique to perform dimensionality reduction on data that is
subject to uncertainty. Our method is a generalization of traditional principal
component analysis (PCA) to multivariate probability distributions. In
comparison to non-linear methods, linear dimensionality reduction techniques
have the advantage that the characteristics of such probability distributions
remain intact after projection. We derive a representation of the PCA sample
covariance matrix that respects potential uncertainty in each of the inputs,
building the mathematical foundation of our new method: uncertainty-aware PCA.
In addition to the accuracy and performance gained by our approach over
sampling-based strategies, our formulation allows us to perform sensitivity
analysis with regard to the uncertainty in the data. For this, we propose
factor traces as a novel visualization that enables a better understanding of the
influence of uncertainty on the chosen principal components. We provide
multiple examples of our technique using real-world datasets. As a special
case, we show how to propagate multivariate normal distributions through PCA in
closed form. Furthermore, we discuss extensions and limitations of our
approach.
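A minimal sketch of the idea, assuming each input is summarized by a mean vector and a covariance matrix: by the law of total covariance, the covariance of an equal-weight mixture of the inputs is the covariance of the means plus the average input covariance, and a normal distribution pushed through a linear projection stays normal. The function names and the 1/n normalization are assumptions made for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def uncertainty_aware_covariance(means, covs):
    """Center and covariance of an equal-weight mixture of distributions
    with the given means (n, d) and covariances (n, d, d)."""
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    center = means.mean(axis=0)
    centered = means - center
    between = centered.T @ centered / len(means)  # spread of the means
    within = covs.mean(axis=0)                    # average input uncertainty
    return center, between + within

def project_gaussian(mean, cov, center, components):
    """Push N(mean, cov) through x -> components.T @ (x - center)."""
    return components.T @ (mean - center), components.T @ cov @ components

# Three uncertain points whose means vary along x, but whose
# uncertainty is much larger along y.
means = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
covs = np.array([np.diag([0.1, 3.0])] * 3)

center, total_cov = uncertainty_aware_covariance(means, covs)
eigvals, eigvecs = np.linalg.eigh(total_cov)
order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
components = eigvecs[:, order[:1]]       # first principal direction
proj_mean, proj_cov = project_gaussian(means[0], covs[0], center, components)
```

In this toy example, PCA on the means alone would pick the x axis, whereas accounting for the input covariances makes the y axis the first principal direction.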
Scalable Population Synthesis with Deep Generative Modeling
Population synthesis is concerned with the generation of synthetic yet
realistic representations of populations. It is a fundamental problem in the
modeling of transport where the synthetic populations of micro-agents represent
a key input to most agent-based models. In this paper, a new methodological
framework for how to 'grow' pools of micro-agents is presented. The model
framework adopts a deep generative modeling approach from machine learning
based on a Variational Autoencoder (VAE). Compared to the previous population
synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs
sampling and traditional generative models such as Bayesian Networks or Hidden
Markov Models, the proposed method allows fitting the full joint distribution
for high dimensions. The proposed methodology is compared with a conventional
Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary.
It is shown that, while these two methods outperform the VAE in the
low-dimensional case, they both suffer from scalability issues when the number
of modeled attributes increases. It is also shown that the Gibbs sampler
essentially replicates the agents from the original sample when the required
conditional distributions are estimated as frequency tables. In contrast, the
VAE allows addressing the problem of sampling zeros by generating agents that
are virtually different from those in the original data but have similar
statistical properties. The presented approach can support agent-based modeling
at all levels by enabling richer synthetic populations with smaller zones and
more detailed individual characteristics. Comment: 27 pages, 15 figures, 4 tables
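For context on the baselines named above, the following is a minimal sketch of Iterative Proportional Fitting on a two-way table, written only to illustrate the sampling-zeros problem; it is not the paper's implementation, and the function `ipf` and the toy numbers are hypothetical. IPF rescales rows and columns of a seed table until both sets of marginals match their targets, so a cell that is zero in the seed sample can never become positive.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale a 2D seed table until its marginals match the targets.
    Assumes no row or column of the seed sums to zero."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        table *= (row_targets / table.sum(axis=1))[:, None]  # fit rows
        table *= (col_targets / table.sum(axis=0))[None, :]  # fit columns
        if np.allclose(table.sum(axis=1), row_targets, rtol=0.0, atol=tol):
            break
    return table

# Seed sample in which the combination (row 0, column 1) was never observed.
seed = np.array([[10.0, 0.0],
                 [5.0, 5.0]])
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[70.0, 30.0])
```

The unobserved cell stays at zero in `fitted` however the marginals are chosen, which is the limitation that a generative model able to produce unseen attribute combinations is meant to address.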
Bayesian model averaging over tree-based dependence structures for multivariate extremes
Describing the complex dependence structure of extreme phenomena is
particularly challenging. To tackle this issue we develop a novel statistical
algorithm that describes extremal dependence taking advantage of the inherent
hierarchical dependence structure of the max-stable nested logistic
distribution and that identifies possible clusters of extreme variables using
reversible jump Markov chain Monte Carlo techniques. Parsimonious
representations are achieved when clusters of extreme variables are found to be
completely independent. Moreover, we significantly decrease the computational
complexity of full likelihood inference by deriving a recursive formula for the
nested logistic model likelihood. The algorithm performance is verified through
extensive simulation experiments which also compare different likelihood
procedures. The new methodology is used to investigate the dependence
relationships between extreme concentration of multiple pollutants in
California and how these pollutants are related to extreme weather conditions.
Overall, we show that our approach allows for the representation of complex
extremal dependence structures and has valid applications in multivariate data
analysis, such as air pollution monitoring, where it can guide policymaking.