On the use of reproducing kernel Hilbert spaces in functional classification
The Hájek–Feldman dichotomy establishes that two Gaussian measures are
either mutually absolutely continuous (and hence there is a Radon–Nikodym
density for each measure with respect to the other one) or mutually
singular. Unlike the case of finite-dimensional Gaussian measures, there
are non-trivial examples of both situations when dealing with Gaussian
stochastic processes. This paper provides:
(a) Explicit expressions for the optimal (Bayes) rule and the minimal
classification error probability in several relevant problems of supervised
binary classification of mutually absolutely continuous Gaussian processes. The
approach relies on some classical results in the theory of Reproducing Kernel
Hilbert Spaces (RKHS).
(b) An interpretation, in terms of mutual singularity, for the "near perfect
classification" phenomenon described by Delaigle and Hall (2012). We show that
the asymptotically optimal rule proposed by these authors can be identified
with the sequence of optimal rules for an approximating sequence of
classification problems in the absolutely continuous case.
(c) A new model-based method for variable selection in binary classification
problems, which arises in a very natural way from the explicit knowledge of the
Radon–Nikodym derivatives and the underlying RKHS structure. Different
classifiers might then be used on the selected variables. In particular, the
classical, finite-dimensional linear Fisher rule turns out to be consistent
under some standard conditions on the underlying functional model.
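As a concrete illustration (standard theory, not notation quoted from the abstract), the optimal rule in the absolutely continuous case is a likelihood-ratio threshold on the Radon–Nikodym derivative; in the mean-shift Gaussian case, the Cameron–Martin theorem makes this ratio explicit in RKHS terms:

```latex
% Bayes rule for X ~ P_0 (prior 1-p) vs X ~ P_1 (prior p),
% assuming P_0 and P_1 are mutually absolutely continuous:
g^*(x) = \mathbb{1}\left\{ \frac{dP_1}{dP_0}(x) > \frac{1-p}{p} \right\}.
% Mean-shift case: P_0 = GP(0, K), P_1 = GP(m, K) with m in the RKHS H(K):
\log \frac{dP_1}{dP_0}(x) = \langle x, m \rangle_K - \tfrac{1}{2}\,\|m\|_K^2 ,
% so the Bayes rule is a linear rule in the RKHS inner product.
```

Here ⟨x, m⟩_K must be understood through a suitable extension of the RKHS inner product (the Loève isometry), since the trajectories of the process do not in general belong to H(K).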
On the maximum bias functions of MM-estimates and constrained M-estimates of regression
We derive the maximum bias functions of the MM-estimates and the constrained
M-estimates or CM-estimates of regression and compare them to the maximum bias
functions of the S-estimates and the τ-estimates of regression. In these
comparisons, the CM-estimates tend to exhibit the most favorable
bias-robustness properties. Also, under the Gaussian model, it is shown how one
can construct a CM-estimate which has a smaller maximum bias function than a
given S-estimate, that is, the resulting CM-estimate dominates the S-estimate
in terms of maxbias and, at the same time, is considerably more efficient.
Comment: Published at http://dx.doi.org/10.1214/009053606000000975 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
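For context (a standard definition, not quoted from the abstract), the maximum bias function of a regression estimate T at a central model F_0 is defined over ε-contamination neighborhoods:

```latex
% epsilon-contamination neighborhood of the central model F_0:
\mathcal{V}_\varepsilon(F_0) = \{\, F = (1-\varepsilon)F_0 + \varepsilon H \;:\; H \text{ arbitrary} \,\},
% maximum (asymptotic) bias function of an estimate T with target T(F_0):
B_T(\varepsilon) = \sup_{F \in \mathcal{V}_\varepsilon(F_0)} \bigl\| T(F) - T(F_0) \bigr\| .
```

An estimate dominating another in terms of maxbias means its function B_T(ε) is smaller for every contamination fraction ε.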
On a general definition of the functional linear model
A general formulation of the linear model with a functional (random)
explanatory variable X and a scalar response Y is proposed.
It includes the standard functional linear model, based on the inner product in
the space L^2, as a particular case. It also includes all models in
which Y is assumed to be (up to an additive noise) a linear combination of a
finite or countable collection of marginal variables X(t_j), evaluated at given
points t_j, or a linear combination of a finite number of linear projections of
X. This general formulation can be interpreted in terms of the RKHS generated
by the covariance function of the process X(t). Some consistency results are
proved. A few experimental results are given in order to show the practical
interest of considering, in a unified framework, linear models based on a
finite number of marginals of the process.
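The "finite number of marginals" case can be sketched in a few lines (a hypothetical simulation, not code from the paper: X is a Brownian motion observed on a grid, and Y depends linearly on the two marginals X(0.3) and X(0.7)):

```python
import numpy as np

rng = np.random.default_rng(0)
n, grid = 500, np.linspace(0.0, 1.0, 101)  # time grid on [0, 1]

# Simulate n Brownian-motion trajectories X_i(t) on the grid.
increments = rng.normal(scale=np.sqrt(np.diff(grid)), size=(n, len(grid) - 1))
X = np.hstack([np.zeros((n, 1)), np.cumsum(increments, axis=1)])

# True model: Y = 2*X(0.3) - X(0.7) + noise, a linear model on two marginals.
i1, i2 = 30, 70  # grid indices of t = 0.3 and t = 0.7
Y = 2.0 * X[:, i1] - 1.0 * X[:, i2] + rng.normal(scale=0.1, size=n)

# Least-squares fit using only the two marginal variables as predictors.
design = X[:, [i1, i2]]
beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(beta_hat)  # close to the true coefficients (2, -1)
```

With the marginals identified, the problem reduces to an ordinary finite-dimensional linear regression, which is the practical appeal of this particular case.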
A geometrically motivated parametric model in manifold estimation
The general aim of manifold estimation is reconstructing, by statistical
methods, a compact lower-dimensional manifold embedded in a Euclidean space,
or estimating some relevant quantities related to the geometric properties of
that manifold. We will assume that the sample data are given by the distances
to the manifold from points randomly chosen on a band surrounding it. The
point in this paper is to show that, if the manifold belongs to a wide class
of compact sets (which we call "sets with polynomial volume"), the proposed
statistical model leads to a relatively simple parametric formulation. In this
setup, standard methodologies (method of moments, maximum likelihood) can be
used to estimate some interesting geometric parameters, including curvatures
and the Euler characteristic. We will particularly focus on the estimation of
the boundary measure (in Minkowski's sense) of the manifold.
It turns out, however, that the estimation problem is not straightforward
since the standard estimators show a remarkably pathological behavior: while
they are consistent and asymptotically normal, their expectations are infinite.
The theoretical and practical consequences of this fact are discussed in some
detail.
Comment: Statistics: A Journal of Theoretical and Applied Statistics, 201
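As background (a standard definition, not taken from the abstract), the Minkowski boundary measure of a compact set S ⊂ ℝ^d is defined through the volume of its ε-neighborhoods:

```latex
% B(S, \varepsilon) is the closed \varepsilon-neighborhood of S; \mu is Lebesgue measure.
% Outer Minkowski content (boundary measure) of a compact S \subset \mathbb{R}^d:
L_0(S) = \lim_{\varepsilon \downarrow 0}
         \frac{\mu\bigl(B(S, \varepsilon)\bigr) - \mu(S)}{\varepsilon},
% which, for sufficiently regular S, coincides with the (d-1)-dimensional
% surface measure of the boundary \partial S.
```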
On functional logistic regression: some conceptual issues
The main ideas behind the classic multivariate logistic regression model make sense when translated to the functional setting, where the explanatory variable X is a function and the response Y is binary. However, some important technical issues appear (or are aggravated with respect to those of the multivariate case) due to the functional nature of the explanatory variable. First, the mere definition of the model can be questioned: while most approaches proposed so far rely on the L2-based model, we explore an alternative (in some sense, more general) approach based on the theory of reproducing kernel Hilbert spaces (RKHS). The validity conditions of such an RKHS-based model, and their relation with the L2-based one, are investigated and made explicit in two formal results. Some relevant particular cases are considered as well. Second, we show that, under very general conditions, the maximum likelihood estimator of the logistic model parameters fails to exist in the functional case, although some restricted versions can be considered. Third, we check (in the framework of binary classification) the practical performance of some RKHS-based procedures well-suited to our model: they are compared to several competing methods via Monte Carlo experiments and the analysis of real data sets.
This work has been partially supported by Spanish Grant PID2019-109387GB-I0
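The two formulations discussed above are usually written as follows (a standard sketch, assuming for concreteness trajectories on [0, 1]; the symbols β, β₀ and m are generic parameters, not notation quoted from the paper):

```latex
% L^2-based functional logistic model, with slope function \beta \in L^2[0,1]:
\mathbb{P}(Y = 1 \mid X = x)
  = \frac{1}{1 + \exp\bigl(-\beta_0 - \int_0^1 \beta(t)\, x(t)\, dt\bigr)},
% RKHS-based alternative, with m in the RKHS H(K) generated by the covariance K of X:
\mathbb{P}(Y = 1 \mid X = x)
  = \frac{1}{1 + \exp\bigl(-\beta_0 - \langle x, m \rangle_K\bigr)}.
```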
Uniform strong consistency of robust estimators
In the robustness framework, the distribution underlying the data is not totally specified and, therefore, it is convenient to use estimators whose properties hold uniformly over the whole set of possible distributions. In this paper, we give two general results on uniform strong consistency and apply them to study the uniform consistency of some classes of robust estimators over contamination neighborhoods. Some instances covered by our results are Huber's M-estimators, quantiles, and generalized S-estimators.
Keywords: Uniform strong consistency, Robustness, M-estimators, GS-estimators
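As a small illustration of one of the estimators mentioned (a generic sketch, not code from the paper), Huber's M-estimate of location can be computed by iterative reweighting; on a contaminated sample it stays near the true center while the sample mean is dragged toward the outliers:

```python
import numpy as np

def huber_location(x, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    mu = np.median(x)  # robust starting point
    for _ in range(max_iter):
        r = x - mu
        # Huber weights: 1 for small residuals, k/|r| for large ones.
        w = np.minimum(1.0, k / np.maximum(np.abs(r), 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(0)
# 95% clean N(0, 1) data plus 5% gross outliers at 50.
x = np.concatenate([rng.normal(size=95), np.full(5, 50.0)])
print(np.mean(x), huber_location(x))  # the mean is pulled far from 0; the M-estimate is not
```

The contamination-neighborhood setting of the abstract asks that such good behavior hold uniformly over all distributions in the neighborhood, not just for one contaminated sample.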