Sampling of pairs in pairwise likelihood estimation for latent variable models with categorical observed variables
Pairwise likelihood is a limited information estimation method that has also been used for estimating the parameters of latent variable and structural equation models. Pairwise likelihood is a special case of composite likelihood methods that uses lower order conditional or marginal log-likelihoods instead of the full log-likelihood. The composite likelihood to be maximized is a weighted sum of marginal or conditional log-likelihoods. Weighting has been proposed for increasing efficiency, but the choice of weights is not straightforward in most applications. Furthermore, the importance of leaving out higher order scores to avoid duplicating lower order marginal information has been pointed out. In this paper, we approach the problem of weighting from a sampling perspective. More specifically, we propose a sampling method for selecting pairs based on their contribution to the total variance from all pairs. The sampling approach does not aim to increase efficiency but to decrease the estimation time, especially in models with a large number of observed categorical variables. We demonstrate the performance of the proposed methodology using simulated examples and a real application.
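For orientation, the weighted sum of pairwise log-likelihoods mentioned above is commonly written as follows for a single observation. The notation is an assumption made here for illustration and is not taken from the paper: p observed variables y_1, ..., y_p, a parameter vector \theta, and nonnegative weights w_{ij}.

```latex
\[
  p\ell(\theta; y) \;=\; \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} w_{ij}\, \log L(y_i, y_j; \theta)
\]
```

Under this notation, one way to read a pair-sampling scheme is that pairs left out of the sample receive w_{ij} = 0, so the number of bivariate terms to evaluate drops below p(p-1)/2.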
Scalable Population Synthesis with Deep Generative Modeling
Population synthesis is concerned with the generation of synthetic yet
realistic representations of populations. It is a fundamental problem in the
modeling of transport where the synthetic populations of micro-agents represent
a key input to most agent-based models. In this paper, a new methodological
framework for how to 'grow' pools of micro-agents is presented. The model
framework adopts a deep generative modeling approach from machine learning
based on a Variational Autoencoder (VAE). Compared to the previous population
synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs
sampling and traditional generative models such as Bayesian Networks or Hidden
Markov Models, the proposed method allows fitting the full joint distribution
for high dimensions. The proposed methodology is compared with a conventional
Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary.
It is shown that, while these two methods outperform the VAE in the
low-dimensional case, they both suffer from scalability issues when the number
of modeled attributes increases. It is also shown that the Gibbs sampler
essentially replicates the agents from the original sample when the required
conditional distributions are estimated as frequency tables. In contrast, the
VAE allows addressing the problem of sampling zeros by generating agents that
are virtually different from those in the original data but have similar
statistical properties. The presented approach can support agent-based modeling
at all levels by enabling richer synthetic populations with smaller zones and
more detailed individual characteristics.
Comment: 27 pages, 15 figures, 4 table
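To make the comparison with earlier approaches concrete, here is a minimal sketch of Iterative Proportional Fitting (IPF) on a two-way table. It is not the paper's method or code; the function name `ipf`, the seed counts and the target margins are all hypothetical. The seed table deliberately contains a zero cell to show that IPF keeps it at zero, which is one form of the sampling-zeros problem the abstract refers to.

```python
def ipf(seed, row_targets, col_targets, max_iterations=1000, tol=1e-10):
    """Iterative Proportional Fitting for a two-way table (illustrative sketch)."""
    table = [[float(v) for v in row] for row in seed]
    for _ in range(max_iterations):
        # Scale each row so that its sum matches the row target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Scale each column so that its sum matches the column target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns were just matched; stop once the rows are close enough too.
        max_err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if max_err < tol:
            break
    return table


# Hypothetical sample counts with one empty cell, and hypothetical margins.
seed = [[10, 20], [30, 0]]
fitted = ipf(seed, row_targets=[50, 50], col_targets=[60, 40])
```

Because every update is a multiplication, a combination of attributes that never appears in the seed sample can never appear in the fitted table, whatever the target margins are.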
Models for Paired Comparison Data: A Review with Emphasis on Dependent Data
Thurstonian and Bradley-Terry models are the most commonly applied models in
the analysis of paired comparison data. Since their introduction, numerous
developments have been proposed in different areas. This paper provides an
updated overview of these extensions, including how to account for object- and
subject-specific covariates and how to deal with ordinal paired comparison
data. Special emphasis is given to models for dependent comparisons. Although
these models are more realistic, their use is complicated by numerical
difficulties. We therefore concentrate on implementation issues. In particular,
a pairwise likelihood approach is explored for models for dependent paired
comparison data, and a simulation study is carried out to compare the
performance of maximum pairwise likelihood with other limited information
estimation methods. The methodology is illustrated throughout using a real data
set about university paired comparisons performed by students.
Comment: Published at http://dx.doi.org/10.1214/12-STS396 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
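As a point of reference, the basic Bradley-Terry model assigns each object a positive worth and sets the probability that object i is preferred to object j to worth_i / (worth_i + worth_j). The sketch below is illustrative only: the function names and the toy data are hypothetical, and the log-likelihood treats all comparisons as independent, which is exactly the assumption that the models for dependent comparisons relax.

```python
import math


def bradley_terry_prob(worth_i, worth_j):
    """Probability that object i is preferred to object j."""
    return worth_i / (worth_i + worth_j)


def independent_log_likelihood(worths, comparisons):
    """Log-likelihood of (winner, loser) index pairs, assuming independence."""
    return sum(
        math.log(bradley_terry_prob(worths[winner], worths[loser]))
        for winner, loser in comparisons
    )


# Hypothetical worths for three objects and a few observed preferences.
worths = [2.0, 1.0, 1.0]
comparisons = [(0, 1), (0, 2), (1, 2)]
loglik = independent_log_likelihood(worths, comparisons)
```

When comparisons made by the same subject are correlated, the full likelihood no longer factorizes in this way, which is where limited information methods such as pairwise likelihood become attractive.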
Nested Partially-Latent Class Models for Dependent Binary Data; Estimating Disease Etiology
The Pneumonia Etiology Research for Child Health (PERCH) study seeks to use
modern measurement technology to infer the causes of pneumonia for which
gold-standard evidence is unavailable. The paper describes a latent variable
model designed to infer from case-control data the etiology distribution for
the population of cases, and for an individual case given his or her
measurements. We assume each observation is drawn from a mixture model for
which each component represents one cause or disease class. The model addresses
a major limitation of the traditional latent class approach by taking account
of residual dependence among multivariate binary outcomes given disease class,
and hence reduces estimation bias, retains efficiency and offers more valid
inference. Such "local dependence" on a single subject is induced in the model
by nesting latent subclasses within each disease class. Measurement precision
and covariation can be estimated using the control sample for whom the class is
known. In a Bayesian framework, we use stick-breaking priors on the subclass
indicators for model-averaged inference across different numbers of subclasses.
Assessment of model fit and individual diagnosis are done using posterior
samples drawn by Gibbs sampling. We demonstrate the utility of the method on
simulated and on the motivating PERCH data.
Comment: 30 pages with 5 figures and 1 table; 1 appendix with 4 figures and 1
tabl
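The stick-breaking construction mentioned in the abstract can be sketched as follows. This is a generic truncated version with Beta(1, alpha) break fractions, which is an assumption made for illustration; the abstract does not give the paper's exact prior specification. The function name `stick_breaking_weights` and its parameters are hypothetical.

```python
import random


def stick_breaking_weights(alpha, num_components, rng=None):
    """Draw mixture weights by truncated stick-breaking (illustrative sketch)."""
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(num_components):
        if k == num_components - 1:
            # Truncation: the last component takes whatever is left.
            fraction = 1.0
        else:
            fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


# Hypothetical settings: concentration 1.0 and at most 5 subclasses.
weights = stick_breaking_weights(alpha=1.0, num_components=5, rng=random.Random(0))
```

Smaller values of alpha tend to put most of the mass on the first few subclasses, so the effective number of subclasses is learned from the data rather than fixed in advance.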