
    A tutorial on estimator averaging in spatial point process models

    Assume that several competing methods are available to estimate a parameter in a given statistical model. The aim of estimator averaging is to provide a new estimator, built as a linear combination of the initial estimators, that achieves better properties under the quadratic loss than each individual initial estimator. This contribution provides an accessible and clear overview of the method and investigates its performance on standard spatial point process models. The averaged estimator is shown to clearly improve on standard procedures for the considered models. For each example, the code to implement the method in the R software (which consists of only a few lines) is provided.
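    To make the averaging step concrete, here is a minimal sketch (in Python rather than the paper's R code) that combines K competing estimates of a scalar parameter. It assumes an estimated covariance matrix of the initial estimators is available, e.g. from a parametric bootstrap; the function name and interface are illustrative, not taken from the paper.

    ```python
    import numpy as np

    def average_estimator(estimates, sigma):
        """Average K competing estimates of the same scalar parameter.

        estimates : (K,) array of initial estimates.
        sigma     : (K, K) estimated covariance matrix of the estimates.

        The weights minimize the (estimated) variance of the linear
        combination subject to summing to one:
            w = sigma^{-1} 1 / (1' sigma^{-1} 1).
        """
        ones = np.ones(len(estimates))
        w = np.linalg.solve(sigma, ones)
        w /= ones @ w
        return w @ estimates, w
    ```

    In practice sigma is itself estimated (by bootstrap or asymptotic formulas), so the averaged estimator is only approximately optimal under quadratic loss.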

    Sharp Oracle Inequalities for Aggregation of Affine Estimators

    We consider the problem of combining a (possibly uncountably infinite) set of affine estimators in a non-parametric regression model with heteroscedastic Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a PAC-Bayesian type inequality that leads to sharp oracle inequalities in discrete as well as continuous settings. The framework is general enough to cover combinations of various procedures such as least squares regression, kernel ridge regression, shrinkage estimators, and many other estimators used in the literature on statistical inverse problems. As a consequence, we show that the proposed aggregate provides an adaptive estimator in the exact minimax sense without discretizing the range of tuning parameters or splitting the set of observations. We also illustrate numerically the good performance achieved by the exponentially weighted aggregate.
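    Below is a minimal sketch of an exponentially weighted aggregate over a finite family of linear smoothers, assuming homoscedastic noise with known variance and scoring each candidate by Stein's unbiased risk estimate; the temperature parameter and the interface are illustrative, and the paper's construction also covers continuous families of affine estimators.

    ```python
    import numpy as np

    def ewa_linear_smoothers(y, smoothers, sigma2, beta):
        """Exponentially weighted aggregation of linear smoothers.

        y         : (n,) observation vector.
        smoothers : list of (n, n) smoothing matrices A_k (candidate fit A_k y).
        sigma2    : (known) noise variance.
        beta      : temperature of the exponential weights.
        """
        n = len(y)
        fits = [A @ y for A in smoothers]
        # Stein's unbiased risk estimate for each linear smoother.
        sure = np.array([
            np.sum((y - f) ** 2) + 2.0 * sigma2 * np.trace(A) - n * sigma2
            for A, f in zip(smoothers, fits)
        ])
        logw = -(sure - sure.min()) / beta   # shift for numerical stability
        w = np.exp(logw)
        w /= w.sum()
        # The aggregate is the weighted average of the candidate fits.
        return sum(wk * f for wk, f in zip(w, fits)), w
    ```

    With the temperature on the order of the noise variance (e.g. beta = 4 * sigma2 in parts of this literature), such an aggregate satisfies an oracle inequality over the candidate family.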

    Minimax Optimal Transfer Learning for Kernel-based Nonparametric Regression

    In recent years, transfer learning has garnered significant attention in the machine learning community. Its ability to leverage knowledge from related studies to improve generalization performance in a target study has made it highly appealing. This paper focuses on investigating the transfer learning problem within the context of nonparametric regression over a reproducing kernel Hilbert space. The aim is to bridge the gap between practical effectiveness and theoretical guarantees. We specifically consider two scenarios: one where the transferable sources are known and another where they are unknown. For the known transferable source case, we propose a two-step kernel-based estimator by solely using kernel ridge regression. For the unknown case, we develop a novel method based on an efficient aggregation algorithm, which can automatically detect and alleviate the effects of negative sources. This paper provides the statistical properties of the desired estimators and establishes the minimax optimal rate. Through extensive numerical experiments on synthetic data and real examples, we validate our theoretical findings and demonstrate the effectiveness of our proposed method
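    One plausible reading of a two-step kernel-based transfer estimator, sketched with plain kernel ridge regression: fit on the source sample, then fit a KRR correction to the target residuals. The kernel choice, regularization parameters, and function names are assumptions for illustration, not the paper's exact procedure.

    ```python
    import numpy as np

    def krr_fit(X, y, lam, gamma):
        """Kernel ridge regression with a Gaussian kernel; returns a predictor."""
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        K = np.exp(-gamma * sq)
        alpha = np.linalg.solve(K + lam * len(y) * np.eye(len(y)), y)

        def predict(Xnew):
            sq_new = np.sum((Xnew[:, None, :] - X[None, :, :]) ** 2, axis=-1)
            return np.exp(-gamma * sq_new) @ alpha

        return predict

    def two_step_transfer(Xs, ys, Xt, yt, lam1, lam2, gamma):
        """Step 1: fit KRR on the source data. Step 2: fit a KRR
        correction to the target residuals, so the final predictor
        is f_source + offset."""
        f_source = krr_fit(Xs, ys, lam1, gamma)
        offset = krr_fit(Xt, yt - f_source(Xt), lam2, gamma)
        return lambda X: f_source(X) + offset(X)
    ```

    The second regularization parameter lam2 would typically be chosen larger than lam1, reflecting the assumption that the offset is small when the source is truly transferable.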

    On Hypothesis Transfer Learning of Functional Linear Models

    We study transfer learning (TL) for functional linear regression (FLR) under the reproducing kernel Hilbert space (RKHS) framework, observing that the TL techniques developed for high-dimensional linear regression are not compatible with truncation-based FLR methods, since functional data are intrinsically infinite-dimensional and generated by smooth underlying processes. We measure the similarity across tasks using the RKHS distance, which ties the type of information being transferred to the properties of the imposed RKHS. Building on the hypothesis offset transfer learning paradigm, two algorithms are proposed: one conducts the transfer when positive sources are known, while the other leverages aggregation techniques to achieve robust transfer without prior information about the sources. We establish lower bounds for this learning problem and show that the proposed algorithms enjoy a matching asymptotic upper bound. These analyses provide statistical insights into the factors that contribute to the dynamics of the transfer. We also extend the results to functional generalized linear models. The effectiveness of the proposed algorithms is demonstrated on extensive synthetic data as well as a financial data application.
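    In illustrative notation (the symbols below are ours, not necessarily the paper's), the hypothesis offset paradigm for FLR posits that the target slope function differs from the source slope by an offset that is small in the RKHS norm, and the second step estimates only that offset on the target sample:

    ```latex
    % Offset decomposition: target slope = source slope + small RKHS offset.
    \[
      \beta_{\mathrm{target}} = \beta_{\mathrm{source}} + \delta,
      \qquad \|\delta\|_{\mathcal{H}} \ \text{small},
    \]
    % Second step: penalized estimation of the offset on the target sample.
    \[
      \hat{\beta}_{\mathrm{target}}
        = \hat{\beta}_{\mathrm{source}}
        + \operatorname*{arg\,min}_{\delta \in \mathcal{H}}
          \frac{1}{n_T} \sum_{i=1}^{n_T}
            \Bigl( y_i - \int X_i(t)\,
              \bigl[ \hat{\beta}_{\mathrm{source}}(t) + \delta(t) \bigr] \, dt
            \Bigr)^2
          + \lambda \, \|\delta\|_{\mathcal{H}}^2 .
    \]
    ```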

    Low rank estimation of smooth kernels on graphs

    Let $(V,A)$ be a weighted graph with a finite vertex set $V$, a symmetric matrix of nonnegative weights $A$ and Laplacian $\Delta$. Let $S_* : V \times V \mapsto \mathbb{R}$ be a symmetric kernel defined on the vertex set $V$. Consider $n$ i.i.d. observations $(X_j, X_j', Y_j)$, $j = 1, \ldots, n$, where $X_j, X_j'$ are independent random vertices sampled from the uniform distribution on $V$ and $Y_j \in \mathbb{R}$ is a real-valued response variable such that $\mathbb{E}(Y_j \mid X_j, X_j') = S_*(X_j, X_j')$, $j = 1, \ldots, n$. The goal is to estimate the kernel $S_*$ based on the data $(X_1, X_1', Y_1), \ldots, (X_n, X_n', Y_n)$ under the assumption that $S_*$ is low rank and, at the same time, smooth on the graph (the smoothness being characterized by discrete Sobolev norms defined in terms of the graph Laplacian). We obtain several results for such problems, including minimax lower bounds on the $L_2$-error and upper bounds for penalized least squares estimators with both nonconvex and convex penalties. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org), http://dx.doi.org/10.1214/13-AOS1088.
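    The convex-penalized least squares estimator described above can be approached with a proximal gradient scheme: a gradient step on the data-fit and smoothness terms, followed by singular value soft-thresholding for the nuclear norm. The sketch below is a minimal illustration under an assumed combined penalty $\lambda_1 \|S\|_* + \lambda_2 \langle S, \Delta S \rangle$; the step size, penalty form, and names are ours, not the paper's exact formulation.

    ```python
    import numpy as np

    def svd_soft_threshold(M, tau):
        """Prox of tau * nuclear norm: soft-threshold the singular values."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    def estimate_kernel(I, J, y, m, laplacian, lam1, lam2, step=0.1, iters=500):
        """Proximal gradient for a symmetric m x m kernel S minimizing
        (1/n) sum_j (y_j - S[I_j, J_j])^2 + lam1 ||S||_* + lam2 <S, L S>."""
        n = len(y)
        S = np.zeros((m, m))
        for _ in range(iters):
            resid = y - S[I, J]
            grad = np.zeros((m, m))
            np.add.at(grad, (I, J), -2.0 * resid / n)   # data-fit gradient
            grad += 2.0 * lam2 * laplacian @ S          # smoothness gradient
            grad = (grad + grad.T) / 2.0                # stay symmetric
            S = svd_soft_threshold(S - step * grad, step * lam1)
        return S
    ```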

    A non-asymptotic study of low-rank estimation of smooth kernels on graphs

    This dissertation investigates the problem of estimating a kernel over a large graph based on a sample of noisy observations of linear measurements of the kernel. We are interested in solving this estimation problem when the sample size is much smaller than the ambient dimension of the kernel. As is typical in high-dimensional statistics, a suitable estimator based on a small number of samples can only be designed when the target kernel belongs to a subset of restricted complexity. In our study, we restrict the complexity by considering scenarios where the target kernel is both low rank and smooth over a graph. Using standard tools of non-parametric estimation, we derive a minimax lower bound on the least squares error in terms of the rank and the degree of smoothness of the target kernel. To show that this lower bound is sharp, we develop matching upper bounds on the error of a least squares estimator based on a nonconvex penalty. The proof of these upper bounds relies on bounds for estimators over uniformly bounded function classes in terms of Rademacher complexities. We also propose a computationally tractable estimator based on least squares with a convex penalty, and derive an upper bound for it in terms of a coherence function introduced in this work. Finally, we present scenarios in which this upper bound achieves a near-optimal rate. The motivation for studying such problems comes from real-world applications such as recommender systems and social network analysis.

    Inégalités d'oracle et mélanges (Oracle inequalities and mixtures)

    This manuscript focuses on two functional estimation problems. For each, a non-asymptotic guarantee of the proposed estimator's performance is provided through an oracle inequality. In the conditional density estimation setting, mixtures of Gaussian regressions with exponential weights depending on the covariate are used. The model selection principle, via penalized maximum likelihood estimation, is applied and a condition on the penalty is derived; this condition is satisfied when the chosen penalty is proportional to the model dimension. The procedure is accompanied by an algorithm mixing EM and Newton steps, tested on synthetic and real data sets. In the regression framework with sub-Gaussian noise, aggregating linear estimators with exponential weights yields an oracle inequality in deviation, by means of PAC-Bayesian techniques. The main advantage of the proposed estimator is that it is easy to compute. Furthermore, taking the supremum norm of the regression function into account establishes a continuum between sharp and weak oracle inequalities.
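    As a concrete illustration of the first procedure, here is a minimal EM sketch for a mixture of Gaussian regressions. It simplifies the manuscript's model by keeping the mixing weights constant instead of covariate-dependent (the manuscript updates covariate-dependent exponential weights with Newton steps); all names and parameters are illustrative.

    ```python
    import numpy as np

    def em_mixture_regressions(X, y, K, iters=100, seed=0):
        """EM for a K-component mixture of Gaussian linear regressions
        with constant mixing weights (a simplification; see lead-in)."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        beta = rng.normal(size=(K, d))
        sigma2 = np.ones(K)
        pi = np.full(K, 1.0 / K)
        for _ in range(iters):
            # E-step: responsibility of each component for each point.
            mu = X @ beta.T                                # (n, K) means
            logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2)
                    - (y[:, None] - mu) ** 2 / (2 * sigma2))
            logp -= logp.max(axis=1, keepdims=True)        # stability shift
            r = np.exp(logp)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: weighted least squares per component.
            for k in range(K):
                W = r[:, k]
                XtW = X.T * W
                beta[k] = np.linalg.solve(XtW @ X + 1e-8 * np.eye(d), XtW @ y)
                sigma2[k] = np.sum(W * (y - X @ beta[k]) ** 2) / W.sum() + 1e-8
            pi = r.mean(axis=0)
        return pi, beta, sigma2
    ```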