A tutorial on estimator averaging in spatial point process models
Assume that several competing methods are available to estimate a parameter
in a given statistical model. The aim of estimator averaging is to provide a
new estimator, built as a linear combination of the initial estimators, that
achieves better properties, under the quadratic loss, than each individual
initial estimator. This contribution provides an accessible and clear overview
of the method, and investigates its performance on standard spatial point
process models. It is demonstrated that the average estimator clearly improves
on standard procedures for the considered models. For each example, the code to
implement the method with the R software (which consists of only a few lines) is
provided.
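The linear-combination idea in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the R code the tutorial provides, and it assumes that the weights are constrained to sum to one and that the mean-squared-error matrix of the initial estimators is known. In practice that matrix would have to be estimated. The function name `averaging_weights` and all numbers are hypothetical.

```python
import numpy as np

def averaging_weights(mse_matrix):
    """Weights minimising w' Sigma w subject to sum(w) = 1 (assumed constraint)."""
    sigma = np.asarray(mse_matrix, dtype=float)
    ones = np.ones(sigma.shape[0])
    solved = np.linalg.solve(sigma, ones)  # Sigma^{-1} 1
    return solved / (ones @ solved)        # normalise so the weights sum to one

# Hypothetical setting: two unbiased, uncorrelated estimators with variances 1 and 4.
sigma = np.array([[1.0, 0.0],
                  [0.0, 4.0]])
weights = averaging_weights(sigma)

# Made-up point estimates of the same parameter from the two methods.
estimates = np.array([2.1, 1.7])
averaged = float(weights @ estimates)

# Quadratic risk of the averaged estimator under the assumed MSE matrix.
averaged_risk = float(weights @ sigma @ weights)
```

Under these assumptions the averaged estimator has quadratic risk 0.8, below the risk of either initial estimator (1 and 4), which is the kind of improvement the abstract describes.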
Sharp Oracle Inequalities for Aggregation of Affine Estimators
We consider the problem of combining a (possibly uncountably infinite) set of
affine estimators in a non-parametric regression model with heteroscedastic
Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a
PAC-Bayesian type inequality that leads to sharp oracle inequalities in
discrete but also in continuous settings. The framework is general enough to
cover the combinations of various procedures such as least-squares regression,
kernel ridge regression, shrinking estimators and many other estimators used in
the literature on statistical inverse problems. As a consequence, we show that
the proposed aggregate provides an adaptive estimator in the exact minimax
sense without either discretizing the range of tuning parameters or splitting
the set of observations. We also illustrate numerically the good performance
achieved by the exponentially weighted aggregate.
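As a rough illustration of exponential weighting over a finite family, the sketch below assigns each candidate estimator a weight proportional to its prior mass times the exponential of minus its estimated risk divided by a temperature. The function name `exponential_weights`, the temperature value, the uniform prior, and the risk estimates are all assumptions for illustration. The paper's construction (including the continuous case and the exact risk estimate) is not reproduced here.

```python
import numpy as np

def exponential_weights(risk_estimates, temperature, prior=None):
    """Weights proportional to prior * exp(-risk / temperature)."""
    risks = np.asarray(risk_estimates, dtype=float)
    if prior is None:
        prior = np.full(risks.shape, 1.0 / risks.size)  # uniform prior (assumption)
    # Subtracting the minimum risk leaves the weights unchanged but avoids underflow.
    logits = np.log(prior) - (risks - risks.min()) / temperature
    unnormalised = np.exp(logits)
    return unnormalised / unnormalised.sum()

# Hypothetical risk estimates for three candidate estimators.
risks = np.array([1.0, 1.0, 3.0])
weights = exponential_weights(risks, temperature=2.0)

# Made-up predictions of each candidate at two design points (one row per candidate).
predictions = np.array([[0.0, 1.0],
                        [2.0, 1.0],
                        [10.0, 10.0]])
aggregate = weights @ predictions  # convex combination of the candidates
```

Candidates with equal estimated risk receive equal weight, and the candidate with the larger risk is down-weighted by a factor of exp(-1) relative to the others for this choice of temperature.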
Minimax Optimal Transfer Learning for Kernel-based Nonparametric Regression
In recent years, transfer learning has garnered significant attention in the
machine learning community. Its ability to leverage knowledge from related
studies to improve generalization performance in a target study has made it
highly appealing. This paper focuses on investigating the transfer learning
problem within the context of nonparametric regression over a reproducing
kernel Hilbert space. The aim is to bridge the gap between practical
effectiveness and theoretical guarantees. We specifically consider two
scenarios: one where the transferable sources are known and another where they
are unknown. For the known transferable source case, we propose a two-step
kernel-based estimator that uses only kernel ridge regression. For the unknown
case, we develop a novel method based on an efficient aggregation algorithm,
which can automatically detect and alleviate the effects of negative sources.
This paper provides the statistical properties of the desired estimators and
establishes the minimax optimal rate. Through extensive numerical experiments
on synthetic data and real examples, we validate our theoretical findings and
demonstrate the effectiveness of our proposed method.
On Hypothesis Transfer Learning of Functional Linear Models
We study transfer learning (TL) for functional linear regression
(FLR) under the Reproducing Kernel Hilbert Space (RKHS) framework, observing
that the TL techniques in existing high-dimensional linear regression are not
compatible with the truncation-based FLR methods as functional data are
intrinsically infinite-dimensional and generated by smooth underlying
processes. We measure the similarity across tasks using RKHS distance, allowing
the type of information being transferred to be tied to the properties of the imposed
RKHS. Building on the hypothesis offset transfer learning paradigm, two
algorithms are proposed: one conducts the transfer when positive sources are
known, while the other leverages aggregation techniques to achieve robust
transfer without prior information about the sources. We establish lower bounds
for this learning problem and show the proposed algorithms enjoy a matching
asymptotic upper bound. These analyses provide statistical insights into
factors that contribute to the dynamics of the transfer. We also extend the
results to functional generalized linear models. The effectiveness of the
proposed algorithms is demonstrated on extensive synthetic data as well as a
financial data application.
Low rank estimation of smooth kernels on graphs
Let (V,A) be a weighted graph with a finite vertex set V, with a symmetric
matrix of nonnegative weights A and with Laplacian $\Delta$. Let
$S_*: V \times V \mapsto \mathbb{R}$ be a symmetric kernel defined on the vertex set V.
Consider n i.i.d. observations $(X_j, X_j', Y_j)$, $j = 1, \ldots, n$, where $X_j, X_j'$
are independent random vertices sampled from the uniform distribution in V and
$Y_j \in \mathbb{R}$ is a real valued response variable such that
$\mathbb{E}(Y_j \mid X_j, X_j') = S_*(X_j, X_j')$, $j = 1, \ldots, n$. The goal is to
estimate the kernel $S_*$ based on the data
$(X_1, X_1', Y_1), \ldots, (X_n, X_n', Y_n)$ and under the assumption that $S_*$ is
low rank and, at the same time, smooth on the graph (the smoothness being
characterized by discrete Sobolev norms defined in terms of the graph
Laplacian). We obtain several results for such problems including minimax lower
bounds on the $L_2$-error and upper bounds for penalized least squares
estimators both with nonconvex and with convex penalties.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1088 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
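To make "smooth on the graph" concrete, here is a small sketch that builds the graph Laplacian from a weight matrix and evaluates a Laplacian-based quadratic form on a kernel. The specific form tr(S L^p S) is an assumption chosen for illustration. The abstract does not state the exact discrete Sobolev norm used in the paper, and the helper names and the toy path graph are hypothetical.

```python
import numpy as np

def graph_laplacian(weights):
    """Combinatorial Laplacian D - A of a symmetric nonnegative weight matrix A."""
    a = np.asarray(weights, dtype=float)
    return np.diag(a.sum(axis=1)) - a

def sobolev_seminorm_sq(kernel, laplacian, order=1):
    """Assumed smoothness measure: trace(S L^order S) for a symmetric kernel S."""
    lap_power = np.linalg.matrix_power(laplacian, order)
    return float(np.trace(kernel @ lap_power @ kernel))

# Toy path graph on three vertices with unit edge weights.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = graph_laplacian(A)

# Two rank-one kernels: one constant over the graph, one oscillating between neighbours.
smooth_kernel = np.ones((3, 3))
rough_vector = np.array([1.0, -1.0, 1.0])
rough_kernel = np.outer(rough_vector, rough_vector)

smooth_value = sobolev_seminorm_sq(smooth_kernel, L)
rough_value = sobolev_seminorm_sq(rough_kernel, L)
```

Both kernels have rank one, but only the constant kernel has zero Laplacian penalty. This shows that low rank and smoothness on the graph are separate constraints, which is why the abstract imposes both.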
A non-asymptotic study of low-rank estimation of smooth kernels on graphs
This dissertation investigates the problem of estimating a kernel over a large graph based on a sample of noisy observations of linear measurements of the kernel. We are interested in solving this estimation problem in the case when the sample size is much smaller than the ambient dimension of the kernel. As is typical in high-dimensional statistics, we are able to design a suitable estimator based on a small number of samples only when the target kernel belongs to a subset of restricted complexity. In our study, we restrict the complexity by considering scenarios where the target kernel is both low-rank and smooth over a graph. Using standard tools of non-parametric estimation, we derive a minimax lower bound on the least squares error in terms of the rank and the degree of smoothness of the target kernel. To prove the optimality of our lower bound, we proceed to develop upper bounds on the error for a least-squares estimator based on a non-convex penalty. The proof of these upper bounds depends on bounds for estimators over uniformly bounded function classes in terms of Rademacher complexities. We also propose a computationally tractable estimator based on least squares with a convex penalty. We derive an upper bound for the computationally tractable estimator in terms of a coherence function introduced in this work. Finally, we present some scenarios wherein this upper bound achieves a near-optimal rate. The motivations for studying such problems come from various real-world applications like recommender systems and social network analysis.
Inégalités d'oracle et mélanges
This manuscript focuses on two functional estimation problems. For each problem, a non-asymptotic guarantee of the proposed estimator's performance is provided through an oracle inequality. In the conditional density estimation setting, mixtures of Gaussian regressions with exponential weights depending on the covariate are used. The principle of model selection through penalized maximum likelihood estimation is applied and a condition on the penalty is derived. If the chosen penalty is proportional to the model dimension, then the condition is satisfied. This procedure is accompanied by an algorithm combining the EM and Newton algorithms, tested on synthetic and real data sets. In the framework of regression with sub-Gaussian noise, aggregating linear estimators using exponential weights makes it possible to obtain an oracle inequality in deviation, by means of PAC-Bayesian techniques. The main advantage of the proposed estimator is that it is easy to compute. Furthermore, taking the infinity norm of the regression function into account makes it possible to establish a continuum between sharp and weak oracle inequalities.