An RKHS Approach for Variable Selection in High-dimensional Functional Linear Models
High-dimensional functional data have become increasingly prevalent in modern
applications such as high-frequency financial data and neuroimaging data
analysis. We investigate a class of high-dimensional linear regression models
in which each predictor is a random element of an infinite-dimensional
function space and the number of functional predictors p can be much
greater than the sample size n. Assuming that each of the unknown coefficient
functions belongs to some reproducing kernel Hilbert space (RKHS), we
regularize the fit of the model by imposing a group elastic-net penalty on
the RKHS norms of the coefficient functions. We show that the loss function
is Gâteaux sub-differentiable and that the functional elastic-net estimator
exists and is unique in the product RKHS. Under suitable sparsity
assumptions and a functional version of the irrepresentable condition, we
derive a non-asymptotic tail bound establishing the variable selection
consistency of our method. The proposed method is illustrated through
simulation studies and a real-data application from the Human Connectome
Project.
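
As a rough illustration of the penalty at the heart of this method, the sketch below applies a group elastic-net proximal step to the basis coefficients of each functional predictor. It is a minimal example, not the authors' implementation: it assumes each coefficient function has been expanded in a truncated basis so that its RKHS norm reduces to the Euclidean norm of a coefficient block, and the names group_enet_prox, lam, and alpha are illustrative.

    import numpy as np

    def group_enet_prox(blocks, step, lam, alpha):
        # Proximal step for the group elastic-net penalty
        #   lam * (alpha * ||b_j|| + 0.5 * (1 - alpha) * ||b_j||^2),
        # applied block-wise, with one block of basis coefficients per
        # functional predictor. Whole blocks can shrink exactly to zero,
        # which is what drives variable selection.
        out = []
        for b in blocks:
            norm = np.linalg.norm(b)
            scale = max(0.0, 1.0 - step * lam * alpha / max(norm, 1e-12))
            out.append(scale * b / (1.0 + step * lam * (1.0 - alpha)))
        return out

    blocks = [np.array([0.3, -0.2]), np.array([2.0, 1.0])]
    print(group_enet_prox(blocks, step=0.1, lam=1.0, alpha=0.5))

Iterating this step inside a proximal gradient loop on the least-squares loss gives a simple solver for penalties of this form.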
Generalized Linear Models for Geometrical Current predictors. An application to predict garment fit
The aim of this paper is to model an ordinal response variable in terms
of vector-valued functional data lying in a vector-valued RKHS. In particular,
we focus on the vector-valued RKHS obtained when a geometrical object (body) is
characterized by a current and on the ordinal regression model. A common way to
solve this problem in functional data analysis is to express the data in the orthonormal
basis given by the decomposition of the covariance operator. However, our data differ from the usual functional data setting in two important ways. On the one
hand, they are vector-valued functions, and on the other, they are functions in an
RKHS with a previously defined norm. We propose to use three different bases: the
orthonormal basis given by the kernel that defines the RKHS, a basis obtained from
decomposition of the integral operator defined using the covariance function, and a
third basis that combines the previous two. The three approaches are compared and
applied to an interesting problem: building a model to predict the fit of children’s
garment sizes, based on a 3D database of the Spanish child population. Our proposal
has been compared with alternative methods based on other classifiers
(Support Vector Machine and k-NN), and with the classification method
proposed in this work applied to different characterizations of the objects
(landmarks and multivariate anthropometric measurements instead of
currents); in all these cases the alternatives yield worse results.
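
As a concrete sketch of the first basis construction (diagonalizing the empirical covariance operator through the Gram matrix of RKHS inner products), the example below uses synthetic placeholder data and scikit-learn's multinomial logistic regression as a stand-in for the ordinal model used in the paper; G, y, and n_components are assumptions for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def rkhs_basis_scores(G, n_components):
        # Scores of each observation in the orthonormal basis obtained by
        # diagonalizing the centered Gram matrix, the empirical analogue
        # of decomposing the covariance operator.
        n = G.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n      # centering in feature space
        vals, vecs = np.linalg.eigh(J @ G @ J)
        idx = np.argsort(vals)[::-1][:n_components]
        return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 5))                 # placeholder raw features
    G = X @ X.T                                  # stand-in for <x_i, x_j>_H
    y = rng.integers(0, 3, size=60)              # placeholder ordinal labels
    scores = rkhs_basis_scores(G, n_components=5)
    model = LogisticRegression(max_iter=1000).fit(scores, y)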
Bayesian Approximate Kernel Regression with Variable Selection
Nonlinear kernel regression models are often used in statistics and machine
learning because they are more accurate than linear models. Variable selection
for kernel regression models is a challenge partly because, unlike the linear
regression setting, there is no clear concept of an effect size for regression
coefficients. In this paper, we propose a novel framework that provides an
effect size analog of each explanatory variable for Bayesian kernel regression
models when the kernel is shift-invariant, for example the Gaussian kernel.
We use function-analytic properties of shift-invariant reproducing kernel
Hilbert spaces (RKHS) to define a linear vector space that (i) captures
nonlinear structure and (ii) can be projected onto the original explanatory
variables; this projection serves as an analog of effect sizes. The specific
function-analytic property we use is that shift-invariant kernel functions
can be approximated via random Fourier bases. Based on this random Fourier
expansion, we propose a computationally efficient
class of Bayesian approximate kernel regression (BAKR) models for both
nonlinear regression and binary classification for which one can compute an
analog of effect sizes. We illustrate the utility of BAKR by examining two
important problems in statistical genetics: genomic selection (i.e. phenotypic
prediction) and association mapping (i.e. inference of significant variants or
loci). State-of-the-art methods for genomic selection and association mapping
are based on kernel regression and linear models, respectively. BAKR is the
first method that is competitive in both settings.
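
A compressed sketch of the BAKR pipeline on synthetic placeholder data, with ridge regression standing in for the Bayesian posterior mean (the conjugate-Gaussian case); the number of features D and the unit lengthscale are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, D = 200, 10, 500
    X = rng.normal(size=(n, p))                  # placeholder covariates
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

    # Random Fourier features: z(x) @ z(x') approximates a Gaussian kernel.
    W = rng.normal(size=(p, D))                  # random frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)

    # Ridge fit in feature space as a stand-in for the posterior mean.
    theta = np.linalg.solve(Z.T @ Z + np.eye(D), Z.T @ y)
    f_hat = Z @ theta                            # fitted nonlinear function

    # Effect-size analogs: project the fitted function back onto the span
    # of the original explanatory variables.
    beta_tilde = np.linalg.pinv(X) @ f_hat
    print(beta_tilde.round(2))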
On the use of reproducing kernel Hilbert spaces in functional classification
The Hájek-Feldman dichotomy establishes that two Gaussian measures are
either mutually absolutely continuous (and hence each admits a Radon-Nikodym
density with respect to the other) or mutually singular. Unlike the case of
finite-dimensional Gaussian
measures, there are non-trivial examples of both situations when dealing with
Gaussian stochastic processes. This paper provides:
(a) Explicit expressions for the optimal (Bayes) rule and the minimal
classification error probability in several relevant problems of supervised
binary classification of mutually absolutely continuous Gaussian processes. The
approach relies on some classical results in the theory of Reproducing Kernel
Hilbert Spaces (RKHS).
(b) An interpretation, in terms of mutual singularity, for the "near perfect
classification" phenomenon described by Delaigle and Hall (2012). We show that
the asymptotically optimal rule proposed by these authors can be identified
with the sequence of optimal rules for an approximating sequence of
classification problems in the absolutely continuous case.
(c) A new model-based method for variable selection in binary classification
problems, which arises very naturally from explicit knowledge of the
Radon-Nikodym derivatives and the underlying RKHS structure. Different
classifiers can then be built from the selected variables. In particular,
the classical linear finite-dimensional Fisher rule turns out to be
consistent under some standard conditions on the underlying functional
model.
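
As intuition for point (a), the sketch below implements a finite-grid approximation of the optimal linear rule for two Gaussian processes sharing a covariance kernel, using the discretized RKHS inner product f @ solve(K, g); it is an illustrative approximation under these assumptions, not the paper's exact construction.

    import numpy as np

    def rkhs_linear_rule(K, m0, m1, x, ridge=1e-8):
        # Classify x by the sign of <x - (m0 + m1)/2, m1 - m0>_K, the
        # optimal linear rule when both classes share the covariance
        # kernel K (evaluated here on a finite grid of time points);
        # the ridge term stabilizes the solve on fine grids.
        a = np.linalg.solve(K + ridge * np.eye(K.shape[0]), m1 - m0)
        return int(a @ (x - 0.5 * (m0 + m1)) > 0)    # 1 -> class with mean m1

    # Toy example: Brownian-motion covariance with shifted mean functions.
    t = np.linspace(0.01, 1.0, 50)
    K = np.minimum.outer(t, t)
    m0, m1 = np.zeros_like(t), 0.5 * t
    rng = np.random.default_rng(1)
    x = m1 + np.linalg.cholesky(K + 1e-10 * np.eye(50)) @ rng.normal(size=50)
    print(rkhs_linear_rule(K, m0, m1, x))            # likely 1: x drawn near m1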