Conditional Density Estimation with Dimensionality Reduction via Squared-Loss Conditional Entropy Minimization
Regression aims at estimating the conditional mean of output given input.
However, regression is not informative enough if the conditional density is
multimodal, heteroscedastic, and asymmetric. In such a case, estimating the
conditional density itself is preferable, but conditional density estimation
(CDE) is challenging in high-dimensional space. A naive approach to coping with
high-dimensionality is to first perform dimensionality reduction (DR) and then
execute CDE. However, such a two-step process does not perform well in practice
because the error incurred in the first DR step can be magnified in the second
CDE step. In this paper, we propose a novel single-shot procedure that performs
CDE and DR simultaneously in an integrated way. Our key idea is to formulate DR
as the problem of minimizing a squared-loss variant of conditional entropy, and
this is solved via CDE. Thus, an additional CDE step is not needed after DR. We
demonstrate the usefulness of the proposed method through extensive experiments
on various datasets including humanoid robot transition and computer art.
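To make the motivation above concrete, the following sketch contrasts a conditional mean estimate with a conditional density estimate on bimodal toy data. It is not the method proposed in the abstract: it uses plain Gaussian kernel smoothing (Nadaraya-Watson regression and a kernel conditional density estimate), and the data-generating process, bandwidths, and function names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption for illustration): y lies on one of two branches, +x or -x.
n = 2000
x = rng.uniform(1.0, 2.0, size=n)
sign = rng.choice([-1.0, 1.0], size=n)
y = sign * x + 0.1 * rng.standard_normal(n)


def gaussian_kernel(u, h):
    """Gaussian kernel with bandwidth h, normalised to integrate to one."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))


def conditional_mean(x0, hx=0.1):
    """Nadaraya-Watson estimate of E[y | x = x0]."""
    w = gaussian_kernel(x - x0, hx)
    return np.sum(w * y) / np.sum(w)


def conditional_density(y_grid, x0, hx=0.1, hy=0.1):
    """Kernel estimate of p(y | x = x0), evaluated at every point of y_grid."""
    w = gaussian_kernel(x - x0, hx)                       # weights in x
    k = gaussian_kernel(y_grid[:, None] - y[None, :], hy)  # kernels in y
    return k @ w / np.sum(w)


x0 = 1.5
y_grid = np.linspace(-3.0, 3.0, 121)
mean_at_x0 = conditional_mean(x0)
density_at_x0 = conditional_density(y_grid, x0)
mode_at_x0 = y_grid[np.argmax(density_at_x0)]

print("conditional mean:", mean_at_x0)
print("highest-density y:", mode_at_x0)
```

On data like this, the conditional mean falls between the two branches, where almost no observations lie, while the density estimate places its mass near the branches themselves. In high dimensions this kind of kernel estimate degrades quickly, which is the difficulty the abstract addresses.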
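Single-Step Dimensionality Reduction for High-Dimensional Data Analysis and Its Application to Reinforcement Learning
Degree type: Doctorate by coursework. Examination committee members: (Chair) Professor Takeo Igarashi (五十嵐 健夫), University of Tokyo; Professor Seiya Imoto (井元 清哉), University of Tokyo; Professor Hiroshi Nakagawa (中川 裕志), University of Tokyo; Lecturer Hideki Nakayama (中山 英樹), University of Tokyo; Professor Kenji Doya (銅谷 賢治), Okinawa Institute of Science and Technology Graduate University. University of Tokyo (東京大学)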
A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation
Approximate Bayesian computation (ABC) methods make use of comparisons
between simulated and observed summary statistics to overcome the problem of
computationally intractable likelihood functions. As the practical
implementation of ABC requires computations based on vectors of summary
statistics, rather than full data sets, a central question is how to derive
low-dimensional summary statistics from the observed data with minimal loss of
information. In this article we provide a comprehensive review and comparison
of the performance of the principal methods of dimension reduction proposed in
the ABC literature. The methods are split into three nonmutually exclusive
classes consisting of best subset selection methods, projection techniques and
regularization. In addition, we introduce two new methods of dimension
reduction. The first is a best subset selection method based on Akaike and
Bayesian information criteria, and the second uses ridge regression as a
regularization procedure. We illustrate the performance of these dimension
reduction techniques through the analysis of three challenging models and data
sets. Comment: Published at http://dx.doi.org/10.1214/12-STS406 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
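The role of summary statistics described above can be sketched with a basic ABC rejection sampler. This is a generic illustration, not one of the dimension reduction methods compared in the review: the Gaussian model, the uniform prior, the tolerance, and the names `summary` and `abc_rejection` are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: data are Normal(theta, 1) and the prior on theta is Uniform(-5, 5).
theta_true = 2.0
observed = rng.normal(theta_true, 1.0, size=100)


def summary(data):
    """Reduce each data set (last axis) to a one-dimensional summary: the sample mean."""
    return np.mean(data, axis=-1)


def abc_rejection(observed, n_draws=20000, tolerance=0.1):
    """Keep prior draws whose simulated summary is within `tolerance` of the observed one."""
    s_obs = summary(observed)
    thetas = rng.uniform(-5.0, 5.0, size=n_draws)
    # One simulated data set per prior draw; no likelihood evaluation is needed.
    simulated = rng.normal(thetas[:, None], 1.0, size=(n_draws, observed.size))
    keep = np.abs(summary(simulated) - s_obs) <= tolerance
    return thetas[keep]


posterior_sample = abc_rejection(observed)
posterior_mean = posterior_sample.mean()

print("accepted draws:", posterior_sample.size)
print("approximate posterior mean:", posterior_mean)
```

Here the sample mean is a sufficient statistic, so little information is lost by comparing summaries instead of full data sets. In realistic models no such statistic is known, and the choice of low-dimensional summaries becomes the central question the abstract raises.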
Scalable Population Synthesis with Deep Generative Modeling
Population synthesis is concerned with the generation of synthetic yet
realistic representations of populations. It is a fundamental problem in the
modeling of transport where the synthetic populations of micro-agents represent
a key input to most agent-based models. In this paper, a new methodological
framework for how to 'grow' pools of micro-agents is presented. The model
framework adopts a deep generative modeling approach from machine learning
based on a Variational Autoencoder (VAE). Compared to the previous population
synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs
sampling and traditional generative models such as Bayesian Networks or Hidden
Markov Models, the proposed method allows fitting the full joint distribution
for high dimensions. The proposed methodology is compared with a conventional
Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary.
It is shown that, while these two methods outperform the VAE in the
low-dimensional case, they both suffer from scalability issues when the number
of modeled attributes increases. It is also shown that the Gibbs sampler
essentially replicates the agents from the original sample when the required
conditional distributions are estimated as frequency tables. In contrast, the
VAE allows addressing the problem of sampling zeros by generating agents that
are virtually different from those in the original data but have similar
statistical properties. The presented approach can support agent-based modeling
at all levels by enabling richer synthetic populations with smaller zones and
more detailed individual characteristics. Comment: 27 pages, 15 figures, 4 tables
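The sampling-zeros problem mentioned above can be illustrated with Iterative Proportional Fitting, one of the baseline approaches the abstract names. The sketch below is a textbook two-dimensional IPF, not code from the paper and not the VAE it proposes; the seed table, the target margins, and the function name `ipf` are made-up values for illustration.

```python
import numpy as np


def ipf(seed, row_targets, col_targets, n_iter=100, tol=1e-10):
    """Rescale a 2-D seed table until its row and column sums match the targets."""
    table = seed.astype(float).copy()
    for _ in range(n_iter):
        # Match the row margins, then the column margins.
        table *= (row_targets / table.sum(axis=1))[:, None]
        table *= (col_targets / table.sum(axis=0))[None, :]
        # Column sums are exact after the last step, so only the rows need checking.
        if np.allclose(table.sum(axis=1), row_targets, atol=tol):
            break
    return table


# Hypothetical sample counts for two attributes; one combination was never observed.
seed = np.array([[10, 0, 5],
                 [4, 6, 2]])
row_targets = np.array([60.0, 40.0])        # assumed known population margins
col_targets = np.array([50.0, 20.0, 30.0])

fitted = ipf(seed, row_targets, col_targets)
print(fitted)
```

Because IPF only multiplies cells by scaling factors, a combination that is absent from the sample stays at zero in the fitted table, however plausible it may be in the real population. This is the limitation that motivates generating agents from a fitted joint distribution instead.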