Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
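
As a rough illustration of the scalar-on-function case [1] described above, the following Python sketch expands the coefficient function in a small Fourier basis and regularizes with a ridge penalty on the basis coefficients. The basis choice, the ridge penalty (a simple stand-in for a roughness penalty), the function names, and the simulated data are all assumptions made for illustration; none of them is taken from the article.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate a Fourier basis on grid t in [0, 1); returns shape (len(t), n_basis)."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def fit_scalar_on_function(X, y, t, n_basis=7, ridge=1e-3):
    """Fit y_i = alpha + integral of X_i(t) * beta(t) dt + error, with beta(t) = B(t) @ c."""
    B = fourier_basis(t, n_basis)
    dt = t[1] - t[0]                        # assumes an equally spaced grid
    Z = X @ B * dt                          # Riemann approximation of the integrals
    D = np.column_stack([np.ones(len(y)), Z])
    P = ridge * np.eye(D.shape[1])
    P[0, 0] = 0.0                           # the intercept is left unpenalized
    coef = np.linalg.solve(D.T @ D + P, D.T @ y)
    return coef[0], B @ coef[1:]            # intercept and beta evaluated on the grid

# Simulated example (hypothetical data): curves observed on a common grid.
rng = np.random.default_rng(0)
t = np.arange(100) / 100
B = fourier_basis(t, 7)
X = rng.normal(size=(200, 7)) @ B.T         # 200 random smooth predictor curves
beta_true = np.sqrt(2) * np.sin(2 * np.pi * t)
y = 1.0 + (X @ beta_true) * 0.01 + rng.normal(scale=0.05, size=200)

alpha_hat, beta_hat = fit_scalar_on_function(X, y, t)
print(alpha_hat, np.max(np.abs(beta_hat - beta_true)))
```

Here replication enters through the 200 curves that share one coefficient function, and regularization enters through the truncated basis and the penalty.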
Optimal Bayes Classifiers for Functional Data and Density Ratios
Bayes classifiers for functional data pose a challenge. This is because
probability density functions do not exist for functional data. As a
consequence, the classical Bayes classifier using density quotients needs to be
modified. We propose to use density ratios of projections on a sequence of
eigenfunctions that are common to the groups to be classified. The density
ratios can then be factored into density ratios of individual functional
principal components whence the classification problem is reduced to a sequence
of nonparametric one-dimensional density estimates. This is an extension to
functional data of some of the very earliest nonparametric Bayes classifiers
that were based on simple density ratios in the one-dimensional case. By means
of the factorization of the density quotients the curse of dimensionality that
would otherwise severely affect Bayes classifiers for functional data can be
avoided. We demonstrate that in the case of Gaussian functional data, the
proposed functional Bayes classifier reduces to a functional version of the
classical quadratic discriminant. A study of the asymptotic behavior of the
proposed classifiers in the large sample limit shows that under certain
conditions the misclassification rate converges to zero, a phenomenon that has
been referred to as "perfect classification". The proposed classifiers also
perform favorably in finite sample applications, as we demonstrate in
comparisons with other functional classifiers in simulations and various data
applications, including wine spectral data, functional magnetic resonance
imaging (fMRI) data for attention deficit hyperactivity disorder (ADHD)
patients, and yeast gene expression data.
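
The factorization idea in this abstract can be sketched in Python as follows: curves are projected onto a few common eigenvectors, and the log density ratio is accumulated one component at a time from one-dimensional kernel density estimates. This is only a minimal sketch under stated assumptions, not the authors' implementation. The Gaussian kernel, the rule-of-thumb bandwidth, the pooled within-group covariance, the function names, and the simulated two-group data are all illustrative choices.

```python
import numpy as np

def kde_logpdf(x, sample, bw):
    """Log of a Gaussian kernel density estimate built from `sample`, evaluated at x."""
    z = (x[:, None] - sample[None, :]) / bw
    dens = np.exp(-0.5 * z**2).mean(axis=1) / (bw * np.sqrt(2.0 * np.pi))
    return np.log(dens + 1e-300)            # guard against log(0) far in the tails

def rule_of_thumb_bw(sample):
    """Normal-reference bandwidth (an assumption; any 1D bandwidth selector could be used)."""
    return 1.06 * sample.std(ddof=1) * len(sample) ** (-0.2)

def fit_classifier(X0, X1, n_comp=2):
    """Estimate common eigenvectors from the pooled within-group covariance and store scores."""
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    cov = centered.T @ centered / len(centered)
    _, vecs = np.linalg.eigh(cov)            # eigenvalues come back in ascending order
    phi = vecs[:, ::-1][:, :n_comp]          # keep the leading n_comp eigenvectors
    center = np.vstack([X0, X1]).mean(axis=0)
    return {"phi": phi, "center": center,
            "s0": (X0 - center) @ phi, "s1": (X1 - center) @ phi,
            "log_prior": np.log(len(X1) / len(X0))}

def predict(model, X):
    """Classify by the sign of the summed one-dimensional log density ratios."""
    s = (X - model["center"]) @ model["phi"]
    log_ratio = np.full(len(X), model["log_prior"])
    for j in range(s.shape[1]):
        s0, s1 = model["s0"][:, j], model["s1"][:, j]
        log_ratio += kde_logpdf(s[:, j], s1, rule_of_thumb_bw(s1))
        log_ratio -= kde_logpdf(s[:, j], s0, rule_of_thumb_bw(s0))
    return (log_ratio > 0).astype(int)

def simulate(n, shift, rng, t):
    """Hypothetical curves: two random components plus noise; `shift` moves the group mean."""
    a = rng.normal(size=(n, 1)) + shift
    b = 0.5 * rng.normal(size=(n, 1))
    noise = 0.1 * rng.normal(size=(n, len(t)))
    return a * np.sin(np.pi * t) + b * np.sin(2 * np.pi * t) + noise

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)
model = fit_classifier(simulate(100, 0.0, rng, t), simulate(100, 4.0, rng, t))
X_test = np.vstack([simulate(200, 0.0, rng, t), simulate(200, 4.0, rng, t)])
labels = np.repeat([0, 1], 200)
accuracy = (predict(model, X_test) == labels).mean()
print(accuracy)
```

Because each density is estimated in one dimension, the number of components can grow without requiring a joint multivariate density estimate.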
Joint modeling of longitudinal drug using pattern and time to first relapse in cocaine dependence treatment data
An important endpoint variable in a cocaine rehabilitation study is the time
to first relapse of a patient after the treatment. We propose a joint modeling
approach based on functional data analysis to study the relationship between
the baseline longitudinal cocaine-use pattern and the interval censored time to
first relapse. For the baseline cocaine-use pattern, we consider both
self-reported cocaine-use amount trajectories and dichotomized use
trajectories. Variations within the generalized longitudinal trajectories are
modeled through a latent Gaussian process, which is characterized by a few
leading functional principal components. The association between the baseline
longitudinal trajectories and the time to first relapse is built upon the
latent principal component scores. The mean and the eigenfunctions of the
latent Gaussian process as well as the hazard function of time to first relapse
are modeled nonparametrically using penalized splines, and the parameters in
the joint model are estimated by a Monte Carlo EM algorithm based on
Metropolis-Hastings steps. An Akaike information criterion (AIC) based on
effective degrees of freedom is proposed to choose the tuning parameters, and a
modified empirical information is proposed to estimate the variance-covariance
matrix of the estimators.
Comment: Published at http://dx.doi.org/10.1214/15-AOAS852 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Nonparametric Bayes modeling of count processes
Data on count processes arise in a variety of applications, including
longitudinal, spatial and imaging studies measuring count responses. The
literature on statistical models for dependent count data is dominated by
models built from hierarchical Poisson components. The Poisson assumption is
not warranted in many applications, and hierarchical Poisson models make
restrictive assumptions about over-dispersion in marginal distributions. This
article proposes a class of nonparametric Bayes count process models, which are
constructed through rounding real-valued underlying processes. The proposed
class of models accommodates applications in which one observes separate
count-valued functional data for each subject under study. Theoretical results
on large support and posterior consistency are established, and computational
algorithms are developed using Markov chain Monte Carlo. The methods are
evaluated via simulation studies and illustrated through application to
longitudinal tumor counts and asthma inhaler usage.
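
To make the rounding construction concrete, the following Python sketch simulates a real-valued Gaussian process for each subject and maps it to counts. It only illustrates the data-generating idea and does no posterior inference. The thresholds (negative values map to 0, other values are rounded down), the squared-exponential covariance, and all parameter values are assumptions chosen for illustration and may differ from those in the article.

```python
import numpy as np

def round_to_counts(z):
    """Map real values to counts: values below 0 become 0, others are rounded down."""
    return np.floor(np.clip(z, 0, None)).astype(int)

def simulate_count_curves(n_subjects, t, mean=2.0, scale=1.5, length=0.2, seed=0):
    """Draw latent Gaussian-process curves on grid t and round them to count-valued curves."""
    rng = np.random.default_rng(seed)
    K = scale**2 * np.exp(-0.5 * ((t[:, None] - t[None, :]) / length) ** 2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(len(t)))   # jitter for numerical stability
    latent = mean + rng.normal(size=(n_subjects, len(t))) @ L.T
    return latent, round_to_counts(latent)

t = np.linspace(0, 1, 30)
latent, counts = simulate_count_curves(5, t)
print(counts[0])
```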
Functional linear regression analysis for longitudinal data
We propose nonparametric methods for functional linear regression which are
designed for sparse longitudinal data, where both the predictor and response
are functions of a covariate such as time. Predictor and response processes
have smooth random trajectories, and the data consist of a small number of
noisy repeated measurements made at irregular times for a sample of subjects.
In longitudinal studies, the number of repeated measurements per subject is
often small and may be modeled as a discrete random number and, accordingly,
only a finite and asymptotically nonincreasing number of measurements are
available for each subject or experimental unit. We propose a functional
regression approach for this situation, using functional principal component
analysis, where we estimate the functional principal component scores through
conditional expectations. This allows the prediction of an unobserved response
trajectory from sparse measurements of a predictor trajectory. The resulting
technique is flexible and allows for different patterns regarding the timing of
the measurements obtained for predictor and response trajectories. Asymptotic
properties for a sample of $n$ subjects are investigated under mild conditions,
as $n \to \infty$, and we obtain consistent estimation for the regression
function. Besides convergence results for the components of functional linear
regression, such as the regression parameter function, we construct asymptotic
pointwise confidence bands for the predicted trajectories. A functional
coefficient of determination as a measure of the variance explained by the
functional regression model is introduced, extending the standard $R^2$ to the
functional case. The proposed methods are illustrated with a simulation study,
longitudinal primary biliary liver cirrhosis data and an analysis of the
longitudinal relationship between blood pressure and body mass index.
Comment: Published at http://dx.doi.org/10.1214/009053605000000660 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
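
The step of estimating functional principal component scores through conditional expectations can be sketched in Python. Assuming the scores and measurement errors are jointly Gaussian, the conditional expectation of the scores given one subject's sparse observations is the linear predictor Lambda Phi^T Sigma^{-1} (y - mu), with Sigma = Phi Lambda Phi^T + sigma^2 I. In this sketch the mean, eigenfunctions, eigenvalues, and noise variance are treated as known. The function name, the sine eigenfunctions, and the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def conditional_scores(y, mu, Phi, lam, sigma2):
    """Conditional expectation of FPC scores given one subject's sparse observations.

    y: (m,) observations; mu: (m,) mean function at the observation times;
    Phi: (m, K) eigenfunctions at the observation times; lam: (K,) eigenvalues;
    sigma2: measurement-error variance.
    """
    Sigma = Phi @ np.diag(lam) @ Phi.T + sigma2 * np.eye(len(y))
    return lam * (Phi.T @ np.linalg.solve(Sigma, y - mu))

# Hypothetical subject observed at four irregular times, with two components.
t_obs = np.array([0.10, 0.35, 0.60, 0.90])
Phi = np.column_stack([np.sqrt(2) * np.sin(np.pi * t_obs),
                       np.sqrt(2) * np.sin(2 * np.pi * t_obs)])
lam = np.array([4.0, 1.0])
xi_true = np.array([2.0, -1.0])
y = Phi @ xi_true                            # noise-free observations, zero mean function

xi_hat = conditional_scores(y, np.zeros(4), Phi, lam, sigma2=1e-6)
xi_shrunk = conditional_scores(y, np.zeros(4), Phi, lam, sigma2=1e6)
print(xi_hat, xi_shrunk)
```

With very small noise variance the scores are recovered almost exactly, while a very large noise variance shrinks them toward zero. This shrinkage is how the predictor borrows strength across subjects when each one contributes only a few measurements.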