35 research outputs found
Kernel estimation of the intensity of Cox processes
Counting processes are used in
several applications of biostatistics, notably for the study of chronic
diseases. In the case of respiratory illness it is natural to suppose that the
count of the visits of a patient can be described by such a process, whose
intensity depends on environmental covariates. Cox processes (also called
doubly stochastic Poisson processes) make it possible to model such
situations. The random intensity is then written as a non-random function of
the time variable and of a multidimensional covariate process. For a
longitudinal study over several patients, we observe these processes for each
patient. The intention is to
estimate the intensity of the process using these observations and to study the
properties of this estimator.
Adaptive estimation on anisotropic Hölder spaces Part I. Fully adaptive case
In this paper, we consider the problem of estimating a noisy signal. The main difficulty is to find a "good" estimator. To that end, we propose a new criterion to choose the best one among all possible estimators. This criterion is useful to define the best family of rates of convergence for any adaptive problem (in a minimax sense). We then construct an adaptive estimator (over a family of anisotropic Hölder spaces) for the pointwise loss and prove its optimality in our sense. This estimator resembles Lepski's procedure.
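As an illustration of the kind of rule the abstract alludes to, here is a minimal sketch of a generic Lepski-type bandwidth selection at a single point. It is not the estimator of the paper: the local-mean estimator, the function names `local_mean` and `lepski_select`, the known noise level `sigma`, and the constant `kappa` are all assumptions made for the example.

```python
import math

def local_mean(xs, ys, x0, h):
    """Average of the observations whose design point lies within h of x0.

    Returns the estimate and the number of points used.
    """
    window = [y for x, y in zip(xs, ys) if abs(x - x0) <= h]
    return sum(window) / len(window), len(window)

def lepski_select(xs, ys, x0, bandwidths, sigma, kappa=2.0):
    """Generic Lepski-type rule (illustrative, not the paper's procedure).

    Bandwidths are scanned in increasing order. A bandwidth is accepted while
    its estimate stays within kappa * sigma / sqrt(n_small) of every estimate
    built with a smaller bandwidth; the scan stops at the first failure.
    """
    hs = sorted(bandwidths)
    fits = [local_mean(xs, ys, x0, h) for h in hs]
    chosen = 0
    for j in range(1, len(hs)):
        compatible = all(
            abs(fits[j][0] - fits[i][0]) <= kappa * sigma / math.sqrt(fits[i][1])
            for i in range(j)
        )
        if not compatible:
            break
        chosen = j
    return hs[chosen], fits[chosen][0]
```

The design choice is the usual bias-variance comparison: a larger bandwidth is kept only while the change it induces is no larger than the stochastic error of the smaller-bandwidth estimates.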
On clustering procedures and nonparametric mixture estimation
This paper deals with nonparametric estimation of conditional densities in
mixture models in the case when additional covariates are available. The
proposed approach consists of performing a preliminary clustering algorithm on
the additional covariates to guess the mixture component of each observation.
Conditional densities of the mixture model are then estimated using kernel
density estimates applied separately to each cluster. We investigate the
expected $L_1$-error of the resulting estimates and derive optimal rates of
convergence over classical nonparametric density classes provided the
clustering method is accurate. Performances of clustering algorithms are
measured by the maximal misclassification error. We obtain upper bounds of this
quantity for a single linkage hierarchical clustering algorithm. Lastly,
applications of the proposed method to mixture models involving electricity
distribution data and simulated data are presented.
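The two-step approach described above (cluster the covariates, then estimate a density within each cluster) can be sketched as follows. This is a toy version under stated assumptions: one-dimensional covariates and responses, a naive single-linkage implementation, a Gaussian kernel, and a fixed bandwidth `h`. The function names are hypothetical and are not taken from the paper.

```python
import math

def single_linkage(points, n_clusters):
    """Naive agglomerative single-linkage clustering of 1-D points.

    Returns one integer label per point. Quadratic-time toy implementation.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Single linkage: distance between the two closest members.
        return min(abs(points[i] - points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: gap(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters.pop(q)  # q > p, so index p remains valid
        clusters[p].extend(merged)

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def gaussian_kde(sample, h):
    """Return the Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * h * math.sqrt(2 * math.pi)

    def density(y):
        return sum(math.exp(-0.5 * ((y - s) / h) ** 2) for s in sample) / norm

    return density

def cluster_then_kde(covariates, responses, n_clusters, h):
    """Cluster on the covariates, then fit one KDE of the responses per cluster."""
    labels = single_linkage(covariates, n_clusters)
    groups = {}
    for label, y in zip(labels, responses):
        groups.setdefault(label, []).append(y)
    return labels, {label: gaussian_kde(ys, h) for label, ys in groups.items()}
```

The quality of the per-cluster density estimates depends on how well the clustering step recovers the true components, which is why the abstract tracks the misclassification error.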
A new adaptive local polynomial density estimation procedure on complicated domains
This paper presents a novel approach for pointwise estimation of multivariate
density functions on known domains of arbitrary dimensions using nonparametric
local polynomial estimators. Our method is highly flexible, as it applies to
both simple domains, such as open connected sets, and more complicated domains
that are not star-shaped around the point of estimation. This enables us to
handle domains with sharp concavities, holes, and local pinches, such as
polynomial sectors. Additionally, we introduce a data-driven selection rule
based on the general ideas of Goldenshluger and Lepski. Our results demonstrate
that the local polynomial estimators are minimax under a risk across a
wide range of Hölder-type functional classes. In the adaptive case, we
provide oracle inequalities and explicitly determine the convergence rate of
our statistical procedure. Simulations on polynomial sectors show that our
oracle estimates outperform those of the most popular alternative method, found
in the sparr package for the R software. Our statistical procedure is
implemented in an online R package which is readily accessible. Comment: 35 pages, 4 figures
Learning the regularity of multivariate functional data
Combining information both within and between sample realizations, we propose
a simple estimator for the local regularity of surfaces in the functional data
framework. The independently generated surfaces are measured with errors at
possibly random discrete times. Non-asymptotic exponential bounds for the
concentration of the regularity estimators are derived. An indicator for
anisotropy is proposed and an exponential bound of its risk is derived. Two
applications are proposed. We first consider the class of multi-fractional,
bi-dimensional, Brownian sheets with domain deformation, and study the
nonparametric estimation of the deformation. As a second application, we build
minimax optimal, bivariate kernel estimators for the reconstruction of the
surfaces
Adaptive estimation of irregular mean and covariance functions
Nonparametric estimators for the mean and the covariance functions of
functional data are proposed. The setup covers a wide range of practical
situations. The random trajectories are not necessarily differentiable, have
unknown regularity, and are measured with error at discrete design points. The
measurement error could be heteroscedastic. The design points could be either
randomly drawn or common for all curves. The estimators depend on the local
regularity of the stochastic process generating the functional data. We
consider a simple estimator of this local regularity which exploits the
replication and regularization features of functional data. Next, we use the
"smoothing first, then estimate" approach for the mean and the covariance
functions. The resulting estimators can be applied to either sparsely or
densely sampled curves, are easy to calculate and to update, and perform well
in simulations. Simulations built upon a real data set illustrate the
effectiveness of the new approach.
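A minimal sketch of the "smoothing first, then estimate" idea for the mean function: each noisy curve is smoothed separately, and the smoothed curves are then averaged. The Nadaraya-Watson smoother, the Gaussian kernel, the fixed bandwidth `h`, and the function names are assumptions for illustration; in the paper the smoothing is tuned through the estimated local regularity, which this sketch does not attempt.

```python
import math

def nw_smooth(times, values, t, h):
    """Nadaraya-Watson estimate at t (Gaussian kernel, bandwidth h)."""
    weights = [math.exp(-0.5 * ((t - s) / h) ** 2) for s in times]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def mean_function(curves, grid, h):
    """Smooth each curve on the grid, then average across curves.

    `curves` is a list of (times, values) pairs, one per trajectory; the
    design points may differ from one curve to the next.
    """
    return [
        sum(nw_smooth(times, values, t, h) for times, values in curves) / len(curves)
        for t in grid
    ]
```

Because each curve is smoothed on its own design points, the same code handles both a common design and independent random designs.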
Minimax properties of Dirichlet kernel density estimators
This paper is concerned with the asymptotic behavior, in $\beta$-Hölder
spaces and under $L^p$ losses, of a Dirichlet kernel density estimator proposed
by Aitchison & Lauder (1985) for the analysis of compositional data. In recent
work, Ouimet & Tolosana-Delgado (2022) established the uniform strong
consistency and asymptotic normality of this nonparametric estimator. As a
complement, it is shown here that, for certain ranges of $p$ and $\beta$,
the Aitchison-Lauder estimator can achieve the minimax rate asymptotically for
a suitable choice of bandwidth, but that this estimator cannot be minimax for
other values of $p$ or $\beta$. These results extend to
the multivariate case, and also rectify, in a minor way, earlier findings of
Bertin & Klutchnikoff (2011) concerning the minimax properties of Beta kernel
estimators. Comment: 15 pages, 1 figure
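For orientation, the one-dimensional case of a Dirichlet kernel estimator is a Beta kernel estimator. The sketch below assumes the common form in which the estimate at `x` averages Beta densities with parameters `x / b + 1` and `(1 - x) / b + 1` evaluated at the observations; this parameterization and the function names are assumptions for the example and are not stated in the abstract.

```python
import math

def beta_pdf(u, a, b):
    """Density of the Beta(a, b) distribution at u, for 0 < u < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))

def beta_kernel_density(sample, x, bandwidth):
    """Beta kernel density estimate at x in [0, 1].

    Assumed form: the kernel shape depends on the evaluation point x, and the
    observations (all strictly inside (0, 1)) are plugged into the kernel.
    """
    a = x / bandwidth + 1
    b = (1 - x) / bandwidth + 1
    return sum(beta_pdf(u, a, b) for u in sample) / len(sample)
```

Since the kernel changes shape with the evaluation point and is supported on the unit interval, no mass is placed outside the support, which is the usual motivation for such asymmetric kernels.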
Adaptive functional principal components analysis
Functional data analysis (FDA) almost always involves smoothing discrete
observations into curves, because they are never observed in continuous time
and rarely without error. Although smoothing parameters affect the subsequent
inference, data-driven methods for selecting these parameters are not
well-developed, frustrated by the difficulty of using all the information
shared by curves while being computationally efficient. On the one hand,
smoothing individual curves in an isolated, albeit sophisticated way, ignores
useful signals present in other curves. On the other hand, bandwidth selection
by automatic procedures such as cross-validation after pooling all the curves
together quickly becomes computationally unfeasible due to the large number of
data points. In this paper we propose a new data-driven, adaptive kernel
smoothing, specifically tailored for functional principal components analysis
(FPCA) through the derivation of sharp, explicit risk bounds for the
eigen-elements. The minimization of these quadratic risk bounds provides
refined, yet computationally efficient bandwidth rules for each eigen-element
separately. Both common and independent design cases are allowed. Rates of
convergence for the adaptive eigen-elements estimators are derived. An
extensive simulation study, designed in a versatile manner to closely mimic
characteristics of real data sets, supports our methodological contribution,
which is available for use in the R package FDAdapt
Sur l'estimation adaptative de fonctions anisotropes
This thesis is devoted to the study of statistical problems in nonparametric estimation. A noisy multidimensional signal is observed (for example an image when the dimension equals two) and our goal is to reconstruct it as well as possible. To achieve this goal, we work within the well-known theory of adaptation in the minimax sense: we want to construct a single estimator that achieves the "best possible rate" simultaneously on each functional space of a given collection. We introduce a new criterion to choose an optimal family of normalizations. This criterion is more sophisticated than the criteria given by Lepski (1991) and Tsybakov (1998) and is better suited to the multidimensional case. Then, we prove two adaptation results (for pointwise estimation) with respect to two different collections of anisotropic Hölder spaces