
    Kernel estimation of the intensity of Cox processes

    Full text link
    Counting processes, often written $N=(N_t)_{t\in\mathbb{R}^+}$, are used in several applications of biostatistics, notably in the study of chronic diseases. In the case of respiratory illness, it is natural to assume that the number of visits of a patient can be described by such a process whose intensity depends on environmental covariates. Cox processes (also called doubly stochastic Poisson processes) allow one to model such situations. The random intensity is then written $\lambda(t)=\theta(t,Z_t)$, where $\theta$ is a non-random function, $t\in\mathbb{R}^+$ is the time variable and $(Z_t)_{t\in\mathbb{R}^+}$ is the $d$-dimensional covariate process. For a longitudinal study over $n$ patients, we observe $(N_t^k,Z_t^k)_{t\in\mathbb{R}^+}$ for $k=1,\ldots,n$. The aim is to estimate the intensity of the process from these observations and to study the properties of this estimator.
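
    As a rough illustration of how such an estimator can be built, the sketch below forms a Nadaraya-Watson-type ratio of kernel-smoothed visit counts to kernel-smoothed exposure time around $(t,z)$. The data layout, the Gaussian kernels and the function names are assumptions made for the example, not the construction studied in the paper.

```python
import numpy as np

def k1(u, h):
    """One-dimensional Gaussian kernel K_h(u), evaluated elementwise."""
    return np.exp(-0.5 * (u / h) ** 2) / (np.sqrt(2 * np.pi) * h)

def kd(U, h):
    """Product Gaussian kernel applied to the rows of a (m, d) array U."""
    return np.prod(k1(U, h), axis=1)

def intensity_estimate(t, z, patients, h_time, h_cov):
    """Kernel estimate of theta(t, z) from n independently observed patients.

    Each element of `patients` is a dict with keys:
      'events' : visit times of the counting process N^k,
      'grid'   : regular time grid on which the covariates are recorded,
      'Z'      : covariate values Z^k on that grid, shape (n_grid, d).
    """
    num, den = 0.0, 0.0
    for p in patients:
        grid, Z = np.asarray(p['grid']), np.asarray(p['Z'])
        events = np.asarray(p['events'])
        dt = grid[1] - grid[0]
        # covariate value recorded closest to each visit time
        idx = np.searchsorted(grid, events).clip(0, len(grid) - 1)
        # numerator: smoothed number of visits near time t with covariates near z
        num += np.sum(k1(events - t, h_time) * kd(Z[idx] - z, h_cov))
        # denominator: smoothed time spent near (t, z), i.e. the "exposure"
        den += np.sum(k1(grid - t, h_time) * kd(Z - z, h_cov)) * dt
    return num / den if den > 0 else 0.0
```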

    Adaptive estimation on anisotropic Hölder spaces Part I. Fully adaptive case

    No full text
    In this paper we consider the following problem: we observe a noisy signal and want to estimate it. The main difficulty is to find a "good" estimator. To do so, we propose a new criterion for choosing, among all possible estimators, the best one. This criterion makes it possible to define the best family of rates of convergence for any adaptive problem (in a minimax sense). We then construct an adaptive estimator, over a family of anisotropic Hölder spaces, for the pointwise loss and prove its optimality in our sense. This estimator is similar to Lepski's procedure.
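
    Since the estimator is said to resemble Lepski's procedure, the sketch below shows a generic Lepski-type bandwidth selection for a pointwise kernel estimate: keep the largest bandwidth whose estimate stays within a noise-level threshold of every estimate built with a smaller bandwidth. The Gaussian kernel, the constant `kappa` and the noise proxy are illustrative choices, not the paper's.

```python
import numpy as np

def kernel_estimate(x0, X, Y, h):
    """Nadaraya-Watson estimate of the signal at x0 with bandwidth h."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    return np.sum(w * Y) / np.sum(w)

def lepski_select(x0, X, Y, sigma, bandwidths, kappa=2.0):
    """Lepski-type rule: the largest admissible bandwidth at the point x0."""
    bandwidths = np.sort(np.asarray(bandwidths, dtype=float))
    est = {h: kernel_estimate(x0, X, Y, h) for h in bandwidths}
    # crude proxy for the stochastic error of the estimate with bandwidth h
    noise = {h: kappa * sigma / np.sqrt(max(1, np.sum(np.abs(X - x0) <= h)))
             for h in bandwidths}
    selected = bandwidths[0]
    for h in bandwidths:
        # h stays admissible if it is compatible with every smaller bandwidth
        if all(abs(est[h] - est[g]) <= noise[g] + noise[h]
               for g in bandwidths if g < h):
            selected = h
    return selected, est[selected]
```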

    On clustering procedures and nonparametric mixture estimation

    Full text link
    This paper deals with nonparametric estimation of conditional densities in mixture models when additional covariates are available. The proposed approach consists of running a preliminary clustering algorithm on the additional covariates to guess the mixture component of each observation. The conditional densities of the mixture model are then estimated using kernel density estimates applied separately to each cluster. We investigate the expected $L^1$-error of the resulting estimates and derive optimal rates of convergence over classical nonparametric density classes, provided the clustering method is accurate. The performance of a clustering algorithm is measured by its maximal misclassification error, and we obtain upper bounds on this quantity for a single-linkage hierarchical clustering algorithm. Lastly, applications of the proposed method to mixture models involving electricity distribution data and simulated data are presented.
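
    A bare-bones version of this two-step scheme, assuming scikit-learn's single-linkage agglomerative clustering for the first step and a Gaussian kernel density estimate per cluster for the second (the function name and tuning choices are placeholders, not the paper's):

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import AgglomerativeClustering

def cluster_then_estimate(X, W, n_components):
    """Two-step procedure: cluster on the covariates W to guess the mixture
    component of each observation, then fit one KDE of X per cluster.

    X : (n,) or (d, n) array of observations whose conditional densities are wanted.
    W : (n, q) array of additional covariates, used only for the clustering step.
    """
    labels = AgglomerativeClustering(n_clusters=n_components,
                                     linkage="single").fit_predict(W)
    densities = []
    for j in range(n_components):
        Xj = np.atleast_2d(X)[:, labels == j]   # observations assigned to component j
        densities.append(gaussian_kde(Xj))      # kernel estimate of the j-th conditional density
    return labels, densities
```

    Each fitted `gaussian_kde` object can then be evaluated on a grid to visualize the estimated conditional density of its component.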

    A new adaptive local polynomial density estimation procedure on complicated domains

    Full text link
    This paper presents a novel approach for pointwise estimation of multivariate density functions on known domains of arbitrary dimension using nonparametric local polynomial estimators. Our method is highly flexible, as it applies both to simple domains, such as open connected sets, and to more complicated domains that are not star-shaped around the point of estimation. This enables us to handle domains with sharp concavities, holes, and local pinches, such as polynomial sectors. Additionally, we introduce a data-driven selection rule based on the general ideas of Goldenshluger and Lepski. Our results demonstrate that the local polynomial estimators are minimax under an $L^2$ risk across a wide range of Hölder-type functional classes. In the adaptive case, we provide oracle inequalities and explicitly determine the convergence rate of our statistical procedure. Simulations on polynomial sectors show that our oracle estimates outperform those of the most popular alternative method, found in the sparr package for the R software. Our statistical procedure is implemented in an online R package which is readily accessible. (35 pages, 4 figures.)
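
    The sketch below gives one simplified reading of a local polynomial (here, local linear) density estimate at a point of a known planar domain: the normal-equation matrix is integrated numerically only over the part of the kernel support that lies inside the domain, which is what allows non-star-shaped domains to be handled. The kernel choice, the grid integration and the absence of the Goldenshluger-Lepski bandwidth selection are all simplifications for the example, not the paper's procedure (which is provided in its R package).

```python
import numpy as np

def local_linear_density(x0, X, h, in_domain, grid_size=101):
    """Local linear density estimate at x0, a point of a known domain in R^2.

    x0        : (2,) point of estimation, assumed to lie inside the domain.
    X         : (n, 2) sample drawn from the unknown density.
    h         : bandwidth.
    in_domain : vectorized boolean function marking the points of the domain.
    """
    # compactly supported product Epanechnikov kernel on [-1, 1]^2
    kernel = lambda U: np.prod(0.75 * np.clip(1.0 - U ** 2, 0.0, None), axis=-1)
    # local linear basis in the rescaled variable v = (u - x0) / h
    basis = lambda V: np.column_stack([np.ones(len(V)), V[:, 0], V[:, 1]])

    # matrix A = integral over (domain ∩ kernel support) of phi(v) phi(v)^T K(v) dv
    s = np.linspace(-1.0, 1.0, grid_size)
    V = np.array(np.meshgrid(s, s)).reshape(2, -1).T    # grid in the rescaled variable
    keep = in_domain(x0 + h * V)                        # discard points outside the domain
    P, w = basis(V[keep]), kernel(V[keep])
    A = (P * w[:, None]).T @ P * (s[1] - s[0]) ** 2

    # vector b = (1/n) sum_i phi((X_i - x0)/h) K((X_i - x0)/h) / h^2
    Vx = (X - x0) / h
    b = basis(Vx).T @ kernel(Vx) / (len(X) * h ** 2)

    theta = np.linalg.solve(A, b)   # normal equations of the local least-squares problem
    return theta[0]                 # the constant term is the density estimate at x0
```

    For a polynomial sector, the domain indicator could be, for instance, `in_domain = lambda P: (P[:, 0] > 0) & (P[:, 1] > 0) & (P[:, 1] < P[:, 0] ** 2)`.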

    Learning the regularity of multivariate functional data

    Full text link
    Combining information both within and between sample realizations, we propose a simple estimator of the local regularity of surfaces in the functional data framework. The independently generated surfaces are measured with error at possibly random discrete times. Non-asymptotic exponential bounds for the concentration of the regularity estimators are derived. An indicator of anisotropy is proposed and an exponential bound on its risk is derived. Two applications are considered. We first consider the class of multi-fractional, two-dimensional Brownian sheets with domain deformation, and study the nonparametric estimation of the deformation. As a second application, we build minimax-optimal bivariate kernel estimators for the reconstruction of the surfaces.
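
    A common way to estimate a local regularity (Hölder) exponent from replicated surfaces compares mean squared increments at two scales, since for a locally self-similar surface these scale roughly like $\Delta^{2H}$. The directional estimator below is such a generic proxy, assuming surfaces observed (or presmoothed) on a common grid; it is not necessarily the exact estimator analyzed in the paper.

```python
import numpy as np

def directional_regularity(Y, i, j, axis=2, lag=1):
    """Local Hoelder exponent of surfaces at grid node (i, j), in one direction.

    Y : (n, G1, G2) array of n surfaces on a common grid, ideally presmoothed
        so that the measurement error does not dominate the small increments.
    """
    if axis == 2:    # increments along the second coordinate
        d1 = Y[:, i, j + lag] - Y[:, i, j]
        d2 = Y[:, i, j + 2 * lag] - Y[:, i, j]
    else:            # increments along the first coordinate
        d1 = Y[:, i + lag, j] - Y[:, i, j]
        d2 = Y[:, i + 2 * lag, j] - Y[:, i, j]
    theta1, theta2 = np.mean(d1 ** 2), np.mean(d2 ** 2)  # averages over the n replications
    # theta(scale) ~ C * scale^(2H)  =>  H ~ log(theta2 / theta1) / (2 log 2)
    H = np.log(theta2 / theta1) / (2.0 * np.log(2.0))
    return float(np.clip(H, 0.05, 1.0))

def anisotropy_indicator(Y, i, j, lag=1):
    """Gap between the two directional regularity estimates; values far from
    zero point to anisotropic behaviour at (i, j)."""
    return (directional_regularity(Y, i, j, axis=1, lag=lag)
            - directional_regularity(Y, i, j, axis=2, lag=lag))
```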

    Adaptive estimation of irregular mean and covariance functions

    Full text link
    Nonparametric estimators for the mean and the covariance functions of functional data are proposed. The setup covers a wide range of practical situations. The random trajectories are not necessarily differentiable, have unknown regularity, and are measured with error at discrete design points. The measurement error may be heteroscedastic, and the design points can be either randomly drawn or common to all curves. The estimators depend on the local regularity of the stochastic process generating the functional data. We consider a simple estimator of this local regularity which exploits the replication and regularization features of functional data. Next, we use the "smoothing first, then estimate" approach for the mean and the covariance functions. The estimators can be applied to both sparsely and densely sampled curves, are easy to calculate and to update, and perform well in simulations. Simulations built upon an example of a real data set illustrate the effectiveness of the new approach.
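
    A minimal sketch of the "smoothing first, then estimate" idea: each noisy curve is kernel-smoothed onto a common output grid, and the mean and covariance are then obtained from the smoothed curves. In the paper the bandwidth would be driven by the estimated local regularity; here it is simply a user-supplied argument, and refinements such as a diagonal correction of the covariance for the residual smoothing error are omitted.

```python
import numpy as np

def smooth_curve(t_obs, y_obs, t_out, h):
    """Nadaraya-Watson smoothing of one noisy curve onto the grid t_out."""
    W = np.exp(-0.5 * ((t_out[:, None] - t_obs[None, :]) / h) ** 2)
    return (W @ y_obs) / W.sum(axis=1)

def mean_and_covariance(curves, t_out, h):
    """'Smoothing first, then estimate' for the mean and covariance functions.

    curves : list of (t_k, y_k) pairs; the design points t_k may differ
             across curves (random design) or be identical (common design).
    """
    S = np.array([smooth_curve(t, y, t_out, h) for t, y in curves])  # (n, G)
    mu = S.mean(axis=0)                               # estimated mean function on t_out
    centered = S - mu
    cov = centered.T @ centered / (len(curves) - 1)   # estimated covariance function
    return mu, cov
```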

    Minimax properties of Dirichlet kernel density estimators

    Full text link
    This paper is concerned with the asymptotic behavior, in $\beta$-Hölder spaces and under $L^p$ losses, of a Dirichlet kernel density estimator proposed by Aitchison & Lauder (1985) for the analysis of compositional data. In recent work, Ouimet & Tolosana-Delgado (2022) established the uniform strong consistency and asymptotic normality of this nonparametric estimator. As a complement, it is shown here that for $p \in [1, 3)$ and $\beta \in (0, 2]$, the Aitchison-Lauder estimator can achieve the minimax rate asymptotically for a suitable choice of bandwidth, but that this estimator cannot be minimax when either $p \in [4, \infty)$ or $\beta \in (2, \infty)$. These results extend to the multivariate case, and also rectify, in a minor way, earlier findings of Bertin & Klutchnikoff (2011) concerning the minimax properties of Beta kernel estimators. (15 pages, 1 figure.)
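
    As a point of reference, the Aitchison-Lauder estimator smooths each compositional observation with a Dirichlet density whose parameters are tied to the evaluation point and a bandwidth $b$. The sketch below uses the common parametrization $\alpha = s/b + 1$; the exact parametrization and the function name are assumptions for the example.

```python
import numpy as np
from scipy.stats import dirichlet

def dirichlet_kde(s, sample, b):
    """Dirichlet kernel density estimate at a composition s.

    s      : (K,) evaluation point in the interior of the simplex (sums to 1).
    sample : (n, K) compositional observations; rows sum to 1 and lie in the
             open simplex (exact zeros would make the Dirichlet density blow up).
    b      : bandwidth, shrinking to 0 as the sample size grows.
    """
    alpha = np.asarray(s) / b + 1.0   # Dirichlet parameters tied to s and b
    return float(np.mean([dirichlet.pdf(x, alpha) for x in sample]))
```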

    Adaptive functional principal components analysis

    Full text link
    Functional data analysis (FDA) almost always involves smoothing discrete observations into curves, because curves are never observed in continuous time and rarely without error. Although smoothing parameters affect the subsequent inference, data-driven methods for selecting them are not well developed, frustrated by the difficulty of using all the information shared by the curves while remaining computationally efficient. On the one hand, smoothing individual curves in an isolated, albeit sophisticated, way ignores useful signals present in the other curves. On the other hand, bandwidth selection by automatic procedures such as cross-validation after pooling all the curves together quickly becomes computationally infeasible because of the large number of data points. In this paper we propose a new data-driven, adaptive kernel smoothing approach, specifically tailored to functional principal components analysis (FPCA), through the derivation of sharp, explicit risk bounds for the eigen-elements. The minimization of these quadratic risk bounds provides refined, yet computationally efficient, bandwidth rules for each eigen-element separately. Both common and independent design cases are allowed. Rates of convergence for the adaptive eigen-element estimators are derived. An extensive simulation study, designed in a versatile manner to closely mimic the characteristics of real data sets, supports our methodological contribution, which is available for use in the R package FDAdapt.
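
    For orientation, a generic smoothing-based FPCA pipeline is sketched below: smooth each curve onto a common grid, form the empirical covariance, and take the leading eigen-elements of the discretized covariance operator. A single bandwidth `h` is used here for simplicity; the point of the paper is precisely to replace it with a separate, risk-bound-driven bandwidth per eigen-element, so this sketch is only the non-adaptive baseline, not the FDAdapt procedure.

```python
import numpy as np

def fpca(curves, t_out, h, n_components=3):
    """Smoothing-based functional PCA on a common output grid.

    curves : list of (t_k, y_k) observation pairs, one per curve.
    Returns the leading eigenvalues and L2-normalized eigenfunctions (columns).
    """
    def smooth(t, y):
        # Nadaraya-Watson smoothing of one curve onto t_out
        W = np.exp(-0.5 * ((t_out[:, None] - t[None, :]) / h) ** 2)
        return (W @ y) / W.sum(axis=1)

    S = np.array([smooth(t, y) for t, y in curves])   # (n, G) smoothed curves
    centered = S - S.mean(axis=0)
    cov = centered.T @ centered / len(curves)         # covariance on the grid
    dt = t_out[1] - t_out[0]
    vals, vecs = np.linalg.eigh(cov * dt)             # discretized integral operator
    order = np.argsort(vals)[::-1][:n_components]
    return vals[order], vecs[:, order] / np.sqrt(dt)
```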

    On adaptive estimation of anisotropic functions (Sur l'estimation adaptative de fonctions anisotropes)

    No full text
    This thesis is devoted to the study of statistical problems of nonparametric estimation. A noisy multidimensional signal is observed (for example an image when the dimension equals two) and our goal is to reconstruct it as well as possible. To achieve this goal, we work within the well-known theory of adaptation in the minimax sense: we want to construct a single estimator which simultaneously achieves, on each functional space of a given collection, the "best possible rate". We introduce a new criterion for choosing an optimal family of normalizations. This criterion is more sophisticated than the criteria given by Lepski (1991) and Tsybakov (1998) and is better suited to the multidimensional case. We then prove two adaptation results, for pointwise estimation, with respect to two different collections of anisotropic Hölder spaces.