Are you going to the party: depends, who else is coming? [Learning hidden group dynamics via conditional latent tree models]
Scalable probabilistic modeling and prediction in high dimensional
multivariate time-series is a challenging problem, particularly for systems
with hidden sources of dependence and/or homogeneity. Examples of such problems
include dynamic social networks with co-evolving nodes and edges and dynamic
student learning in online courses. Here, we address these problems through the
discovery of hierarchical latent groups. We introduce a family of Conditional
Latent Tree Models (CLTM), in which tree-structured latent variables
incorporate the unknown groups. The latent tree itself is conditioned on
observed covariates such as seasonality, historical activity, and node
attributes. We propose a statistically efficient framework for learning both
the hierarchical tree structure and the parameters of the CLTM. We demonstrate
competitive performance on multiple real-world datasets from different domains.
These include a dataset on students' attempts at answering questions in a
psychology MOOC, Twitter users participating in an emergency management
discussion and interacting with one another, and windsurfers interacting on a
beach in Southern California. In addition, our modeling framework provides
valuable and interpretable information about the hidden group structures and
their effect on the evolution of the time series.
Random Forest variable importance with missing data
Random Forests are commonly applied for data prediction and interpretation. The latter purpose is supported by variable importance measures that rate the relevance of predictors. Yet existing measures cannot be computed when the data contain missing values. Possible solutions are imputation methods, complete case analysis, and a newly suggested importance measure. However, it is unknown to what extent these approaches provide a reliable estimate of a variable's relevance. An extensive simulation study was performed to investigate this property for a variety of missing-data generating processes. Findings and recommendations: Complete case analysis should not be applied, as it inappropriately penalizes variables that were completely observed. The new importance measure is much better able to reflect decreased information exclusively for variables with missing values and should therefore be used to evaluate actual data situations. By contrast, multiple imputation allows for an estimation of the importances one would potentially observe in complete data situations.
A Hierarchical Spatio-Temporal Statistical Model Motivated by Glaciology
In this paper, we extend and analyze a Bayesian hierarchical spatio-temporal
model for physical systems. A novelty is to model the discrepancy between the
output of a computer simulator for a physical process and the actual process
values with a multivariate random walk. For computational efficiency, linear
algebra for bandwidth limited matrices is utilized, and first-order emulator
inference allows for the fast emulation of a numerical partial differential
equation (PDE) solver. A test scenario from a physical system motivated by
glaciology is used to examine the speed and accuracy of the computational
methods used, in addition to the viability of modeling assumptions. We conclude
by discussing how the model and associated methodology can be applied in other
physical contexts besides glaciology.
Comment: Revision accepted for publication by the Journal of Agricultural, Biological, and Environmental Statistics
Invariances of random fields paths, with applications in Gaussian Process Regression
We study pathwise invariances of centred random fields that can be controlled
through the covariance. A result involving composition operators is obtained in
second-order settings, and we show that various path properties including
additivity boil down to invariances of the covariance kernel. These results are
extended to a broader class of operators in the Gaussian case, via the Loève
isometry. Several covariance-driven pathwise invariances are illustrated,
including fields with symmetric paths, centred paths, harmonic paths, or sparse
paths. The proposed approach delivers a number of promising results and
perspectives in Gaussian process regression.