    Optimal estimation of high-order missing masses, and the rare-type match problem

    Consider a random sample $(X_{1},\ldots,X_{n})$ from an unknown discrete distribution $P=\sum_{j\geq1}p_{j}\delta_{s_{j}}$ on a countable alphabet $\mathbb{S}$, and let $(Y_{n,j})_{j\geq1}$ be the empirical frequencies of the distinct symbols $s_{j}$ in the sample. We consider the problem of estimating the $r$-order missing mass, which is a discrete functional of $P$ defined as $\theta_{r}(P;\mathbf{X}_{n})=\sum_{j\geq1}p^{r}_{j}I(Y_{n,j}=0)$. This is a generalization of the missing mass, whose estimation is a classical problem in statistics and the subject of numerous studies in both theory and methods. First, we introduce a nonparametric estimator of $\theta_{r}(P;\mathbf{X}_{n})$ and a corresponding non-asymptotic confidence interval through concentration properties of $\theta_{r}(P;\mathbf{X}_{n})$. Then, we investigate minimax estimation of $\theta_{r}(P;\mathbf{X}_{n})$, which is the main contribution of our work. We show that minimax estimation is not feasible over the class of all discrete distributions on $\mathbb{S}$, and not even for distributions with regularly varying tails, which only guarantee that our estimator is consistent for $\theta_{r}(P;\mathbf{X}_{n})$. This leads us to introduce the stronger assumption of second-order regular variation for the tail behaviour of $P$, which is proved to be sufficient for minimax estimation of $\theta_{r}(P;\mathbf{X}_{n})$, making the proposed estimator an optimal minimax estimator of $\theta_{r}(P;\mathbf{X}_{n})$. Our interest in the $r$-order missing mass arises from forensic statistics, where the estimation of the $2$-order missing mass appears in connection with the estimation of the likelihood ratio $T(P,\mathbf{X}_{n})=\theta_{1}(P;\mathbf{X}_{n})/\theta_{2}(P;\mathbf{X}_{n})$, known as the "fundamental problem of forensic mathematics". We present theoretical guarantees for nonparametric estimation of $T(P,\mathbf{X}_{n})$.
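To make the definition of the r-order missing mass concrete, the following minimal sketch evaluates the functional directly for a known, finite distribution and a given sample. It is not the estimator proposed in the paper, which the abstract does not specify; the function names `missing_mass` and `likelihood_ratio`, the toy probabilities, and the toy sample are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def missing_mass(p, sample, r=1):
    """Evaluate theta_r(P; X_n) = sum_j p_j^r * I(Y_{n,j} = 0).

    p      : list of probabilities, p[j] is the mass of symbol j (assumed known here)
    sample : list of observed symbol indices
    r      : order of the missing mass (r = 1 gives the classical missing mass)
    """
    counts = Counter(sample)  # Y_{n,j}; symbols never seen have count 0
    return sum(pj ** r for j, pj in enumerate(p) if counts[j] == 0)


def likelihood_ratio(p, sample):
    """T(P, X_n) = theta_1 / theta_2, as written in the abstract."""
    theta2 = missing_mass(p, sample, r=2)
    if theta2 == 0:
        raise ValueError("every symbol was observed, so the ratio is undefined")
    return missing_mass(p, sample, r=1) / theta2


# Hypothetical toy example: symbols 2 and 3 do not appear in the sample.
p = [0.5, 0.25, 0.125, 0.125]
sample = [0, 0, 1, 0]

theta1 = missing_mass(p, sample, r=1)  # 0.125 + 0.125
theta2 = missing_mass(p, sample, r=2)  # 0.125**2 + 0.125**2
ratio = likelihood_ratio(p, sample)
```

In practice $P$ is unknown, so these quantities cannot be computed this way; the sketch only shows what an estimator of $\theta_{r}(P;\mathbf{X}_{n})$ is trying to recover.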

    Minimax Estimation of Kernel Mean Embeddings

    In this paper, we study the minimax estimation of the Bochner integral $\mu_k(P):=\int_{\mathcal{X}} k(\cdot,x)\,dP(x)$, also called the kernel mean embedding, based on random samples drawn i.i.d. from $P$, where $k:\mathcal{X}\times\mathcal{X}\rightarrow\mathbb{R}$ is a positive definite kernel. Various estimators $\hat{\theta}_n$ of $\mu_k(P)$ (including the empirical estimator) are studied in the literature, all of which satisfy $\bigl\| \hat{\theta}_n-\mu_k(P)\bigr\|_{\mathcal{H}_k}=O_P(n^{-1/2})$, with $\mathcal{H}_k$ being the reproducing kernel Hilbert space induced by $k$. The main contribution of the paper is in showing that the above-mentioned rate of $n^{-1/2}$ is minimax in the $\|\cdot\|_{\mathcal{H}_k}$ and $\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and the class of measures that have an infinitely differentiable density, with $k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and the density of $P$ (if it exists). This result has practical consequences in statistical applications, as the mean embedding has been widely employed in non-parametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance).
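As a concrete illustration of the empirical estimator mentioned above, the sketch below builds $\hat{\mu}(t)=\frac{1}{n}\sum_{i}k(t,x_i)$ and computes the RKHS distance between two empirical embeddings using the standard identity $\|\hat{\mu}_X-\hat{\mu}_Y\|^2_{\mathcal{H}_k}=\overline{k(X,X)}-2\,\overline{k(X,Y)}+\overline{k(Y,Y)}$, where bars denote averages over all pairs. The Gaussian kernel, the bandwidth, and all function names are assumptions made for this example; the abstract only requires a continuous translation-invariant kernel on $\mathbb{R}^d$.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on R^d (an assumed choice of translation-invariant kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def empirical_embedding(sample, kernel=gaussian_kernel):
    """Return the function t -> (1/n) * sum_i k(t, x_i)."""
    n = len(sample)

    def mu_hat(t):
        return sum(kernel(t, x) for x in sample) / n

    return mu_hat


def mean_kernel(xs, ys, kernel):
    """Average of k(x, y) over all pairs (x in xs, y in ys)."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def rkhs_distance(xs, ys, kernel=gaussian_kernel):
    """RKHS norm of the difference between the two empirical embeddings."""
    sq = (mean_kernel(xs, xs, kernel)
          - 2.0 * mean_kernel(xs, ys, kernel)
          + mean_kernel(ys, ys, kernel))
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors


# Hypothetical one-dimensional samples, written as 1-tuples.
xs = [(0.0,), (1.0,)]
ys = [(0.5,), (2.0,)]
mu_x = empirical_embedding(xs)
value_at_origin = mu_x((0.0,))   # (k(0,0) + k(0,1)) / 2
distance = rkhs_distance(xs, ys)
```

The sketch only evaluates the empirical estimator; it does not demonstrate the minimax lower bound, which is a statement about all estimators.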