Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains
Existing approaches for multivariate functional principal component analysis
are restricted to data on the same one-dimensional interval. The presented
approach focuses on multivariate functional data on different domains that may
differ in dimension, e.g. functions and images. The theoretical basis for
multivariate functional principal component analysis is given in terms of a
Karhunen-Loève Theorem. For the practically relevant case of a finite
Karhunen-Loève representation, a relationship between univariate and
multivariate functional principal component analysis is established. This
offers an estimation strategy to calculate multivariate functional principal
components and scores based on their univariate counterparts. For the resulting
estimators, asymptotic results are derived. The approach can be extended to
finite univariate expansions in general, not necessarily orthonormal bases. It
is also applicable for sparse functional data or data with measurement error. A
flexible R-implementation is available on CRAN. The new method is shown to be
competitive with existing approaches for data observed on a common
one-dimensional domain. The motivating application is a neuroimaging study,
where the goal is to explore how longitudinal trajectories of a
neuropsychological test score covary with FDG-PET brain scans at baseline.
Supplementary material, including detailed proofs, additional simulation
results and software, is available online.
Comment: Revised Version. R-Code for the online appendix is available in the
.zip file associated with this article in subdirectory "/Software". The
software associated with this article is available on CRAN (packages funData
and MFPCA
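The estimation strategy described in this abstract, building multivariate components from univariate ones, can be sketched as follows. This is a minimal illustration and not the MFPCA package's implementation: it assumes densely observed data on regular grids, a single quadrature weight per element, the same number of univariate components for every element, and an eigendecomposition of the covariance matrix of the stacked univariate scores. All function and variable names are hypothetical.

```python
import numpy as np

def univariate_fpca(X, weight, n_comp):
    """FPCA for one element; X is (N, P) on a regular grid with quadrature weight `weight`."""
    Xc = X - X.mean(axis=0)
    # SVD of the weighted, centred data yields eigenfunctions orthonormal in L2.
    _, _, Vt = np.linalg.svd(Xc * np.sqrt(weight), full_matrices=False)
    phi = Vt[:n_comp] / np.sqrt(weight)   # eigenfunctions, shape (n_comp, P)
    scores = (Xc * weight) @ phi.T        # univariate scores, shape (N, n_comp)
    return phi, scores

def multivariate_fpca(elements, weights, n_uni, n_multi):
    """Combine univariate FPCA results into multivariate components (assumed scheme)."""
    fits = [univariate_fpca(X, w, n_uni) for X, w in zip(elements, weights)]
    Xi = np.hstack([scores for _, scores in fits])   # stacked univariate scores
    Z = Xi.T @ Xi / (Xi.shape[0] - 1)                # covariance of the scores
    nu, C = np.linalg.eigh(Z)
    top = np.argsort(nu)[::-1][:n_multi]             # largest eigenvalues first
    nu, C = nu[top], C[:, top]
    blocks = np.split(C, len(elements), axis=0)      # one block of rows per element
    psi = [blk.T @ phi for blk, (phi, _) in zip(blocks, fits)]
    rho = Xi @ C                                     # multivariate scores
    return nu, psi, rho

# Simulated example: curves on [0, 1] and 8 x 8 "images" sharing a random effect.
rng = np.random.default_rng(0)
N = 40
t = np.linspace(0, 1, 50)
gx, gy = np.meshgrid(np.linspace(0, 1, 8), np.linspace(0, 1, 8))
a = rng.normal(size=(N, 2))
curves = (a[:, :1] * np.sin(2 * np.pi * t) + a[:, 1:] * np.cos(2 * np.pi * t)
          + 0.1 * rng.normal(size=(N, 50)))
images = a[:, :1] * (gx * gy).ravel() + 0.1 * rng.normal(size=(N, 64))
weights = [1 / 50, 1 / 64]
nu, psi, rho = multivariate_fpca([curves, images], weights, n_uni=3, n_multi=2)
```

In this sketch each multivariate eigenfunction is a list with one array per element, so the curve part and the image part keep their own grids.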
Functional linear regression analysis for longitudinal data
We propose nonparametric methods for functional linear regression which are
designed for sparse longitudinal data, where both the predictor and response
are functions of a covariate such as time. Predictor and response processes
have smooth random trajectories, and the data consist of a small number of
noisy repeated measurements made at irregular times for a sample of subjects.
In longitudinal studies, the number of repeated measurements per subject is
often small and may be modeled as a discrete random number and, accordingly,
only a finite and asymptotically nonincreasing number of measurements are
available for each subject or experimental unit. We propose a functional
regression approach for this situation, using functional principal component
analysis, where we estimate the functional principal component scores through
conditional expectations. This allows the prediction of an unobserved response
trajectory from sparse measurements of a predictor trajectory. The resulting
technique is flexible and allows for different patterns regarding the timing of
the measurements obtained for predictor and response trajectories. Asymptotic
properties for a sample of $n$ subjects are investigated under mild conditions,
as $n \to \infty$, and we obtain consistent estimation for the regression
function. Besides convergence results for the components of functional linear
regression, such as the regression parameter function, we construct asymptotic
pointwise confidence bands for the predicted trajectories. A functional
coefficient of determination as a measure of the variance explained by the
functional regression model is introduced, extending the standard $R^2$ to the
functional case. The proposed methods are illustrated with a simulation study,
longitudinal primary biliary liver cirrhosis data and an analysis of the
longitudinal relationship between blood pressure and body mass index.
Comment: Published at http://dx.doi.org/10.1214/009053605000000660 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
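Estimating principal component scores through conditional expectations can be illustrated with a small sketch. It assumes jointly Gaussian scores and measurement errors, so that the conditional expectation of the scores given one subject's sparse observations is linear in the centred observations. The mean function, eigenfunctions, eigenvalues and error variance are taken as known here, whereas in practice they would be estimated from the pooled data. All names are hypothetical.

```python
import numpy as np

def conditional_scores(y, t, mean_fn, eigfuns, eigvals, sigma2):
    """E[scores | y] for one subject observed at times t, under a Gaussian assumption."""
    Phi = np.column_stack([phi(t) for phi in eigfuns])   # (n_obs, K) basis at obs. times
    Lam = np.diag(eigvals)
    # Covariance of the noisy observations: Phi Lam Phi^T + sigma2 * I.
    Sigma_y = Phi @ Lam @ Phi.T + sigma2 * np.eye(len(t))
    return Lam @ Phi.T @ np.linalg.solve(Sigma_y, y - mean_fn(t))

# One subject with five irregular observations and two components.
eigfuns = [lambda s: np.sqrt(2) * np.sin(np.pi * s),
           lambda s: np.sqrt(2) * np.sin(2 * np.pi * s)]
eigvals = np.array([2.0, 0.5])
mean_fn = lambda s: np.zeros_like(s)
t_obs = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
true_scores = np.array([1.5, -0.5])
y_obs = true_scores[0] * eigfuns[0](t_obs) + true_scores[1] * eigfuns[1](t_obs)

nearly_exact = conditional_scores(y_obs, t_obs, mean_fn, eigfuns, eigvals, sigma2=1e-6)
shrunk = conditional_scores(y_obs, t_obs, mean_fn, eigfuns, eigvals, sigma2=1.0)
```

With a negligible error variance the scores are recovered almost exactly. A larger error variance shrinks the estimates toward zero, which is how this approach borrows strength when each subject contributes only a few measurements.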
Functional principal component analysis of spatially correlated data
This paper focuses on the analysis of spatially correlated functional data. We propose a parametric model for spatial correlation, in which the between-curve correlation is modeled by correlating the functional principal component scores of the functional data. Additionally, in the sparse observation framework, we propose a novel approach of spatial principal analysis by conditional expectation to explicitly estimate spatial correlations and reconstruct individual curves. Assuming spatial stationarity, empirical spatial correlations are calculated as the ratio of eigenvalues of the smoothed covariance surface $\mathrm{Cov}(X_i(s), X_i(t))$ and cross-covariance surface $\mathrm{Cov}(X_i(s), X_j(t))$ at locations indexed by $i$ and $j$. An anisotropic Matérn spatial correlation model is then fitted to the empirical correlations. Finally, principal component scores are estimated to reconstruct the sparsely observed curves. This framework can naturally accommodate arbitrary covariance structures, but there is an enormous reduction in computation if one can assume the separability of temporal and spatial components. We demonstrate the consistency of our estimates and propose hypothesis tests to examine the separability as well as the isotropy effect of spatial correlation. Using simulation studies, we show that these methods have some clear advantages over existing methods of curve reconstruction and estimation of model parameters.
Detecting and handling outlying trajectories in irregularly sampled functional datasets
Outlying curves often occur in functional or longitudinal datasets, and can
be very influential on parameter estimators and very hard to detect visually.
In this article we introduce estimators of the mean and the principal
components that are resistant to, and then can be used for detection of,
outlying sample trajectories. The estimators are based on reduced-rank t-models
and are specifically aimed at sparse and irregularly sampled functional data.
The outlier-resistance properties of the estimators and their relative
efficiency for noncontaminated data are studied theoretically and by
simulation. Applications to the analysis of Internet traffic data and glycated
hemoglobin levels in diabetic children are presented.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS257 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Multi-Rank Sparse and Functional PCA: Manifold Optimization and Iterative Deflation Techniques
We consider the problem of estimating multiple principal components using the
recently-proposed Sparse and Functional Principal Components Analysis (SFPCA)
estimator. We first propose an extension of SFPCA which estimates several
principal components simultaneously using manifold optimization techniques to
enforce orthogonality constraints. While effective, this approach is
computationally burdensome so we also consider iterative deflation approaches
which take advantage of existing fast algorithms for rank-one SFPCA. We show
that alternative deflation schemes can more efficiently extract signal from the
data, in turn improving estimation of subsequent components. Finally, we
compare the performance of our manifold optimization and deflation techniques
in a scenario where orthogonality does not hold and find that they still lead
to significantly improved performance.
Comment: To appear in IEEE CAMSAP 201
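The idea of iterative deflation can be sketched with ordinary PCA standing in for SFPCA: a rank-one routine extracts one component, the extracted signal is removed, and the routine is run again. The sketch uses plain power iteration as the rank-one step and compares two standard deflation schemes, Hotelling and projection deflation. Whether these match the schemes studied in the paper is an assumption, and all names are hypothetical.

```python
import numpy as np

def leading_component(S, n_iter=500, seed=0):
    """Rank-one step: power iteration for the leading eigenvector of S."""
    v = np.random.default_rng(seed).normal(size=S.shape[0])
    for _ in range(n_iter):
        v = S @ v
        v /= np.linalg.norm(v)
    return v

def hotelling_deflate(S, v):
    """Subtract the variance explained along v."""
    return S - (v @ S @ v) * np.outer(v, v)

def projection_deflate(S, v):
    """Project S onto the orthogonal complement of v on both sides."""
    P = np.eye(len(v)) - np.outer(v, v)
    return P @ S @ P

def deflated_pca(S, k, deflate):
    """Extract k components by alternating the rank-one step and deflation."""
    comps = []
    for _ in range(k):
        v = leading_component(S)
        comps.append(v)
        S = deflate(S, v)
    return np.column_stack(comps)

# Covariance matrix with known eigenvectors (columns of Q) and eigenvalues.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
S = Q @ np.diag([5.0, 3.0, 1.0, 0.5]) @ Q.T
V_hot = deflated_pca(S, 2, hotelling_deflate)
V_proj = deflated_pca(S, 2, projection_deflate)
```

For exact eigenvectors the two schemes coincide. They differ when the rank-one step returns a regularized vector that is not an eigenvector, and in that case projection deflation still guarantees that the deflated matrix is orthogonal to the extracted direction.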
Covariance Estimation and Principal Component Analysis for Mixed-Type Functional Data with application to mHealth in Mood Disorders
Mobile digital health (mHealth) studies often collect multiple within-day
self-reported assessments of participants' behaviour and health. Indexed by
time of day, these assessments can be treated as functional observations of
continuous, truncated, ordinal, and binary type. We develop covariance
estimation and principal component analysis for such mixed-type functional
data. We propose a semiparametric Gaussian copula model that assumes a
generalized latent non-paranormal process generating observed mixed-type
functional data and defining temporal dependence via a latent covariance. The
smooth estimate of latent covariance is constructed via Kendall's Tau bridging
method that incorporates smoothness within the bridging step. The approach is
then extended with methods for handling both dense and sparse sampling designs,
calculating subject-specific latent representations of observed data, latent
principal components and principal component scores. Importantly, the proposed
framework handles all four mixed types in a unified way. Simulation studies
show a competitive performance of the proposed method under both dense and
sparse sampling designs. The method is applied to data from 497 participants of
the National Institute of Mental Health Family Study of the Mood Disorder Spectrum
to characterize the differences in within-day temporal patterns of mood in
individuals with the major mood disorder subtypes including Major Depressive
Disorder, and Type 1 and 2 Bipolar Disorder
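A Kendall's tau bridge can be illustrated for the simplest case only, a pair of continuous variables under a Gaussian copula, where the latent correlation r and Kendall's tau satisfy r = sin(pi * tau / 2). The bridge functions for truncated, ordinal and binary variables and the smoothing over time described in the abstract are not reproduced here. The sketch and its names are illustrative assumptions.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau for samples without ties, via pairwise sign concordance."""
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (sx * sy).sum() / (n * (n - 1))

def latent_correlation(x, y):
    """Bridge for a continuous pair under a Gaussian copula: r = sin(pi/2 * tau)."""
    return np.sin(np.pi / 2 * kendall_tau(x, y))

# Latent Gaussian pair with correlation 0.6, observed through monotone transforms.
rng = np.random.default_rng(0)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=800)
x, y = np.exp(z[:, 0]), z[:, 1] ** 3
r_hat = latent_correlation(x, y)
r_latent = latent_correlation(z[:, 0], z[:, 1])
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by the monotone transformations applied to the latent variables, which is what makes a rank-based bridge suitable for a copula model.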
Covariance Function Estimation for High-Dimensional Functional Time Series with Dual Factor Structures
We propose a flexible dual functional factor model for modelling
high-dimensional functional time series. In this model, a high-dimensional
fully functional factor parametrisation is imposed on the observed functional
processes, whereas a low-dimensional version (via series approximation) is
assumed for the latent functional factors. We extend the classic principal
component analysis technique for the estimation of a low-rank structure to the
estimation of a large covariance matrix of random functions that satisfies a
notion of (approximate) functional "low-rank plus sparse" structure; and
generalise the matrix shrinkage method to functional shrinkage in order to
estimate the sparse structure of functional idiosyncratic components. Under
appropriate regularity conditions, we derive the large sample theory of the
developed estimators, including the consistency of the estimated factors and
functional factor loadings and the convergence rates of the estimated matrices
of covariance functions measured by various (functional) matrix norms.
Consistent selection of the number of factors and a data-driven rule to choose
the shrinkage parameter are discussed. Simulation and empirical studies are
provided to demonstrate the finite-sample performance of the developed model
and estimation methodology