Density estimation on an unknown submanifold

Abstract

We investigate density estimation from an $n$-sample in the Euclidean space $\mathbb R^D$, when the data are supported by an unknown submanifold $M$ of possibly unknown dimension $d < D$, under a reach condition. We study nonparametric kernel methods for pointwise and integrated loss, with data-driven bandwidths that incorporate some learning of the geometry via a local dimension estimator. When the density $f$ has Hölder smoothness $\beta$ and $M$ has regularity $\alpha$ in a sense to be defined, our estimator achieves the rate $n^{-(\alpha \wedge \beta)/(2(\alpha \wedge \beta)+d)}$, which does not depend on the ambient dimension $D$ and is asymptotically minimax for $\alpha \geq \beta$. Following Lepski's principle, a bandwidth selection rule is shown to achieve smoothness adaptation. We also investigate the case $\alpha \leq \beta$: by estimating, in some sense, the underlying geometry of $M$, we establish in dimension $d=1$ that the minimax rate is $n^{-\beta/(2\beta+1)}$, proving in particular that it does not depend on the regularity of $M$. Finally, a numerical implementation is conducted on some case studies in order to confirm the practical feasibility of our estimators.
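As an illustration of why the intrinsic dimension $d$, rather than the ambient dimension $D$, drives the normalisation of a kernel estimator, here is a minimal sketch. It is not the estimator of the paper: it assumes $d$ is known, uses a fixed bandwidth `h` instead of a data-driven one, and takes a Gaussian kernel evaluated at the ambient Euclidean distance. The function names `sample_circle` and `kde_on_submanifold` are hypothetical, and the test case (uniform density on the unit circle in $\mathbb R^2$, so $d=1$, $D=2$) is chosen only because its true density with respect to arc length, $1/(2\pi)$, is known.

```python
import math
import random


def sample_circle(n, rng):
    """Draw n points uniformly (w.r.t. arc length) on the unit circle in R^2."""
    pts = []
    for _ in range(n):
        t = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((math.cos(t), math.sin(t)))
    return pts


def kde_on_submanifold(x, sample, h, d):
    """Kernel estimate at x, normalised by h**d (intrinsic), not h**D (ambient).

    Assumption: Gaussian profile applied to the ambient Euclidean distance,
    with the normalising constant of a d-dimensional Gaussian.
    """
    norm = (2.0 * math.pi) ** (-d / 2.0)
    total = 0.0
    for p in sample:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, p))
        total += norm * math.exp(-dist2 / (2.0 * h * h))
    return total / (len(sample) * h ** d)


rng = random.Random(0)                      # fixed seed for reproducibility
sample = sample_circle(5000, rng)           # data supported by a 1-d submanifold of R^2
estimate = kde_on_submanifold((1.0, 0.0), sample, h=0.1, d=1)
true_density = 1.0 / (2.0 * math.pi)        # uniform density w.r.t. arc length
print(estimate, true_density)
```

Calling the same function with `d=2` on this sample would rescale the estimate by a factor of order $1/h$, which is why a wrong dimension in the normalisation gives an inconsistent estimate as $h \to 0$.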
