Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
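As a rough illustration of the scalar-on-function case described in this abstract, the following sketch expands the coefficient function in a small Fourier basis and fits the basis coefficients by ridge-penalized least squares. It is not taken from the article: the function names (`fourier_basis`, `fit_scalar_on_function`), the choice of a Fourier basis, the ridge penalty, and the simulated data are all assumptions made for this example. Information is combined across curves by the least-squares fit (replication) and within curves by the truncated basis and penalty (regularization).

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate a constant plus sine/cosine pairs on a grid t in [0, 1]."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape (len(t), n_basis)

def fit_scalar_on_function(X, y, t, n_basis=5, ridge=1e-3):
    """Fit y_i = alpha + integral of x_i(t) * beta(t) dt + error.

    beta(t) is written as a combination of n_basis Fourier functions, so the
    functional model reduces to an ordinary penalized linear regression.
    """
    B = fourier_basis(t, n_basis)
    dt = t[1] - t[0]                      # assumes an equally spaced grid
    Z = X @ B * dt                        # Riemann sum for integral x_i(t) b_k(t) dt
    Z1 = np.column_stack([np.ones(len(y)), Z])
    P = ridge * np.eye(n_basis + 1)
    P[0, 0] = 0.0                         # leave the intercept unpenalized
    coef = np.linalg.solve(Z1.T @ Z1 + P, Z1.T @ y)
    return coef[0], B @ coef[1:]          # intercept, beta evaluated on the grid

# Simulated example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
n = 200
X = rng.normal(size=(n, 5)) @ fourier_basis(t, 5).T   # smooth random curves
beta_true = np.sin(2 * np.pi * t)
y = 1.0 + X @ beta_true * (t[1] - t[0]) + rng.normal(scale=0.01, size=n)

alpha_hat, beta_hat = fit_scalar_on_function(X, y, t)
```

In practice the number of basis functions and the penalty weight would be chosen by cross-validation or a similar criterion; the fixed values here are placeholders.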
Recommended from our members
Robust variable selection in partially varying coefficient single-index model
By combining basis function approximations and the smoothly clipped absolute deviation (SCAD) penalty, this paper proposes a robust variable selection procedure for a partially varying coefficient single-index model based on modal regression. The proposed procedure simultaneously selects significant variables in the parametric components and the nonparametric components. With appropriate selection of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. We also discuss bandwidth selection and propose a modified expectation-maximization (EM)-type algorithm for the proposed estimation procedure. The finite sample properties of the proposed estimators are illustrated by simulation examples. The research of Zhu is partially supported by the National Natural Science Foundation of China (NNSFC) under Grants 71171075, 71221001 and 71031004. The research of Yu is supported by NNSFC under Grant 11261048.
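For readers unfamiliar with the SCAD penalty named in this abstract, a minimal sketch of its standard piecewise form follows. The function name `scad_penalty` and the default `a=3.7` (a commonly used value) are choices made for this example; the sketch shows only the penalty itself and does not reproduce the paper's modal-regression estimator or its EM-type algorithm.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated elementwise at theta, for lam > 0 and a > 2.

    Piecewise definition in |theta|:
      lam * |theta|                                          if |theta| <= lam
      (2*a*lam*|theta| - theta^2 - lam^2) / (2*(a - 1))      if lam < |theta| <= a*lam
      lam^2 * (a + 1) / 2                                    if |theta| > a*lam
    """
    theta = np.abs(np.asarray(theta, dtype=float))
    linear = lam * theta
    quad = (2 * a * lam * theta - theta**2 - lam**2) / (2 * (a - 1))
    const = lam**2 * (a + 1) / 2
    return np.where(theta <= lam, linear,
                    np.where(theta <= a * lam, quad, const))
```

The penalty behaves like the lasso near zero, which shrinks small coefficients to exactly zero, and becomes constant for large coefficients, so large effects are not shrunk further.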
Variable Selection and Model Choice in Geoadditive Regression Models
Model choice and variable selection are issues of major concern in practical regression analyses. We propose a boosting procedure that facilitates both tasks in a class of complex geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, random effects, and varying coefficient terms. The major modelling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a remaining smooth component with one degree of freedom to obtain a fair comparison between all model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that implements automatic model choice and variable selection. We demonstrate the versatility of our approach with two examples: a geoadditive Poisson regression model for species counts in habitat suitability analyses and a geoadditive logit model for the analysis of forest health.
A Unified Framework of Constrained Regression
Generalized additive models (GAMs) play an important role in modeling and
understanding complex relationships in modern applied statistics. They allow
for flexible, data-driven estimation of covariate effects. Yet researchers
often have a priori knowledge of certain effects, which might be monotonic or
periodic (cyclic) or should fulfill boundary conditions. We propose a unified
framework to incorporate these constraints for both univariate and bivariate
effect estimates and for varying coefficients. As the framework is based on
component-wise boosting methods, variables can be selected intrinsically, and
effects can be estimated for a wide range of different distributional
assumptions. Bootstrap confidence intervals for the effect estimates are
derived to assess the models. We present three case studies from environmental
sciences to illustrate the proposed seamless modeling framework. All discussed
constrained effect estimates are implemented in the comprehensive R package
mboost for model-based boosting. Comment: This is a preliminary version of the manuscript. The final
publication is available at
http://link.springer.com/article/10.1007/s11222-014-9520-
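The intrinsic variable selection that this abstract attributes to component-wise boosting can be sketched in a few lines. The example below is a plain NumPy illustration with simple linear base learners and squared-error loss. It is not the mboost implementation and does not include the constrained (monotonic, cyclic) base learners the paper proposes; the function name `componentwise_l2_boost`, the step length `nu`, and the simulated data are assumptions made for this sketch.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise L2 boosting with one linear base learner per covariate.

    At each step every covariate is fitted separately to the current
    residuals; only the best-fitting one is updated, by a fraction nu of its
    least-squares slope. Covariates that are never chosen keep coefficient 0.
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                       # centred covariates
    offset = y.mean()
    coef = np.zeros(p)
    resid = y - offset
    col_ss = (Xc ** 2).sum(axis=0)
    for _ in range(n_steps):
        slopes = Xc.T @ resid / col_ss    # single-covariate least-squares slopes
        rss_drop = slopes ** 2 * col_ss   # reduction in residual sum of squares
        j = int(np.argmax(rss_drop))      # best-fitting component
        coef[j] += nu * slopes[j]
        resid = resid - nu * slopes[j] * Xc[:, j]
    intercept = offset - x_mean @ coef    # undo the centring
    return intercept, coef

# Simulated example (hypothetical data): only covariates 0 and 3 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=300)

intercept, coef = componentwise_l2_boost(X, y)
```

Stopping the loop early keeps more coefficients at exactly zero, which is why the number of boosting steps acts as the main tuning parameter in this kind of procedure.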
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows single coefficients, as well as blocks of coefficients
representing specific model terms, to be included or excluded. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
Estimation of Single-Index Models Based on Boosting Techniques
In single-index models the link or response function is not treated as fixed; the data determine the form of the unknown link function. To obtain a flexible form, we specify the link function as an expansion in basis functions and propose to estimate the parameters as well as the link function by weak learners within a boosting framework. It is shown that the method is a strong competitor to existing methods. The method is investigated in simulation studies and applied to real data.