Locally adaptive factor processes for multivariate time series
In modeling multivariate time series, it is important to allow time-varying
smoothness in the mean and covariance process. In particular, there may be
certain time intervals exhibiting rapid changes and others in which changes are
slow. If such time-varying smoothness is not accounted for, one can obtain
misleading inferences and predictions, with over-smoothing across erratic time
intervals and under-smoothing across times exhibiting slow variation. This can
lead to mis-calibration of predictive intervals, which can be substantially too
narrow or wide depending on the time. We propose a locally adaptive factor
process for characterizing multivariate mean-covariance changes in continuous
time, allowing locally varying smoothness in both the mean and covariance
matrix. This process is constructed utilizing latent dictionary functions
evolving in time through nested Gaussian processes and linearly related to the
observed data with a sparse mapping. Using a differential equation
representation, we bypass the usual computational bottlenecks in obtaining MCMC
and online algorithms for approximate Bayesian inference. Performance is
assessed in simulations and illustrated in a financial application.
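The core idea of locally varying smoothness can be illustrated with a nonstationary Gaussian process. The sketch below uses a Gibbs covariance kernel with a time-varying lengthscale `l(t)`, so sampled paths are smooth where `l(t)` is large and erratic where it is small; this is a generic illustration of the concept, not the paper's nested-GP dictionary construction, and the lengthscale profile is an assumption chosen for demonstration.

```python
import numpy as np

def gibbs_kernel(t, lengthscale):
    """Nonstationary Gibbs covariance on a 1-D grid t.

    k(s, t) = sqrt(2 l(s) l(t) / (l(s)^2 + l(t)^2))
              * exp(-(s - t)^2 / (l(s)^2 + l(t)^2))
    """
    l = lengthscale(t)                       # per-point lengthscales l(t)
    L2 = l[:, None] ** 2 + l[None, :] ** 2   # l(s)^2 + l(t)^2
    pre = np.sqrt(2.0 * np.outer(l, l) / L2)
    d2 = (t[:, None] - t[None, :]) ** 2
    return pre * np.exp(-d2 / L2)

t = np.linspace(0.0, 1.0, 200)
# Assumed profile: slow variation early, increasingly rapid changes late.
ls = lambda t: 0.25 - 0.2 * t
K = gibbs_kernel(t, ls)

rng = np.random.default_rng(0)
# Sample one path via a jittered Cholesky factor of the covariance.
path = np.linalg.cholesky(K + 1e-6 * np.eye(len(t))) @ rng.standard_normal(len(t))
```

A stationary kernel would have to pick a single lengthscale, over-smoothing the late, erratic stretch or under-smoothing the early, slow one; making `l(t)` a function of time is the simplest version of the local adaptivity the abstract describes.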
Local-Aggregate Modeling for Big-Data via Distributed Optimization: Applications to Neuroimaging
Technological advances have led to a proliferation of structured big data
that have matrix-valued covariates. We are specifically motivated to build
predictive models for multi-subject neuroimaging data based on each subject's
brain imaging scans. This is an ultra-high-dimensional problem that consists of
a matrix of covariates (brain locations by time points) for each subject; few
methods currently exist to fit supervised models directly to this tensor data.
We propose a novel modeling and algorithmic strategy to apply generalized
linear models (GLMs) to this massive tensor data in which one set of variables
is associated with locations. Our method begins by fitting GLMs to each
location separately, and then builds an ensemble by blending information across
locations through regularization with what we term an aggregating penalty. Our
so-called Local-Aggregate Model can be fit in a completely distributed manner
over the locations using an Alternating Direction Method of Multipliers (ADMM)
strategy, and thus greatly reduces the computational burden. Furthermore, we
propose to select the appropriate model through a novel sequence of faster
algorithmic solutions that is similar to regularization paths. We will
demonstrate both the computational and predictive modeling advantages of our
methods via simulations and an EEG classification problem.

Comment: 41 pages, 5 figures, and 3 tables
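The "fit locally, blend through a penalty" strategy can be sketched with standard consensus ADMM: each location solves its own regression in parallel, and a consensus variable aggregates the local fits. This is the textbook consensus ADMM of Boyd et al. with plain least squares, used here only to show the distributable structure; the paper's actual aggregating penalty and GLM updates differ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, L = 60, 5, 8                       # samples, features, locations
b_true = rng.standard_normal(p)
X = [rng.standard_normal((n, p)) for _ in range(L)]
y = [Xl @ b_true + 0.1 * rng.standard_normal(n) for Xl in X]

rho = 1.0
b = [np.zeros(p) for _ in range(L)]      # local coefficients, one per location
u = [np.zeros(p) for _ in range(L)]      # scaled dual variables
z = np.zeros(p)                          # consensus (aggregate) coefficients

for _ in range(200):
    # b-update: each location solves a ridge-like problem independently;
    # this step is what makes the method fully distributable.
    b = [np.linalg.solve(Xl.T @ Xl + rho * np.eye(p),
                         Xl.T @ yl + rho * (z - ul))
         for Xl, yl, ul in zip(X, y, u)]
    # z-update: aggregate information across all locations.
    z = np.mean([bl + ul for bl, ul in zip(b, u)], axis=0)
    # dual update: accumulate each location's disagreement with the consensus.
    u = [ul + bl - z for ul, bl in zip(u, b)]
```

Only the z-update requires communication (an average over locations), so the per-location b-updates can run on separate machines, which is the computational point the abstract makes.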
Measurement error caused by spatial misalignment in environmental epidemiology
Copyright © 2009 Gryparis et al. Published by Oxford University Press.

In many environmental epidemiology studies, the locations and/or times of exposure measurements and health assessments do not match. In such settings, health effects analyses often use the predictions from an exposure model as a covariate in a regression model. Such exposure predictions contain some measurement error as the predicted values do not equal the true exposures. We provide a framework for spatial measurement error modeling, showing that smoothing induces a Berkson-type measurement error with nondiagonal error structure. From this viewpoint, we review the existing approaches to estimation in a linear regression health model, including direct use of the spatial predictions and exposure simulation, and explore some modified approaches, including Bayesian models and out-of-sample regression calibration, motivated by measurement error principles. We then extend this work to the generalized linear model framework for health outcomes. Based on analytical considerations and simulation results, we compare the performance of all these approaches under several spatial models for exposure. Our comparisons underscore several important points. First, exposure simulation can perform very poorly under certain realistic scenarios. Second, the relative performance of the different methods depends on the nature of the underlying exposure surface. Third, traditional measurement error concepts can help to explain the relative practical performance of the different methods. We apply the methods to data on the association between levels of particulate matter and birth weight in the greater Boston area.

This research was supported by NIEHS grants ES012044 (AG, BAC), ES009825 (JS, BAC), ES007142 (CJP), and ES000002 (CJP), and EPA grant R-832416 (JS, BAC).
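The classical-versus-Berkson distinction at the heart of this abstract can be shown in a few lines. The simulation below, a generic textbook illustration rather than the paper's spatial models, fits a linear health model y = βx + ε when the exposure covariate carries each type of error: classical error (noise added to the true exposure) attenuates the slope, while Berkson error (true exposure scattering around the predicted value) leaves it unbiased.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 20000, 1.0

def slope(w, y):
    # OLS slope of y on w (both are mean-zero by construction, no intercept).
    return np.dot(w, y) / np.dot(w, w)

# Classical error: the covariate w adds noise to the true exposure x, so the
# fitted slope is attenuated by var(x) / (var(x) + var(u)) = 1 / (1 + 1) = 0.5.
x = rng.standard_normal(n)
y = beta * x + 0.5 * rng.standard_normal(n)
w_classical = x + rng.standard_normal(n)
b_classical = slope(w_classical, y)       # attenuated toward zero

# Berkson error: the true exposure scatters around the prediction w
# (x = w + u, with u independent of w), leaving the OLS slope unbiased,
# though the estimate is less efficient.
w_berkson = np.sqrt(0.5) * rng.standard_normal(n)
x_b = w_berkson + np.sqrt(0.5) * rng.standard_normal(n)
y_b = beta * x_b + 0.5 * rng.standard_normal(n)
b_berkson = slope(w_berkson, y_b)         # approximately unbiased
```

The abstract's observation that smoothing induces Berkson-type error with a nondiagonal structure means the simple independence assumptions above do not hold exactly in the spatial setting, which is why the modified approaches it studies are needed.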