
    Minimax Estimation of the $L_1$ Distance

    We consider the problem of estimating the $L_1$ distance between two discrete probability measures $P$ and $Q$ from empirical data in a nonasymptotic and large alphabet setting. When $Q$ is known and one obtains $n$ samples from $P$, we show that for every $Q$, the minimax rate-optimal estimator with $n$ samples achieves performance comparable to that of the maximum likelihood estimator (MLE) with $n \ln n$ samples. When both $P$ and $Q$ are unknown, we construct minimax rate-optimal estimators whose worst case performance is essentially that of the known $Q$ case with $Q$ being uniform, implying that $Q$ being uniform is essentially the most difficult case. The \emph{effective sample size enlargement} phenomenon, identified in Jiao \emph{et al.} (2015), holds both in the known $Q$ case for every $Q$ and the $Q$ unknown case. However, the construction of optimal estimators for $\|P-Q\|_1$ requires new techniques and insights beyond the approximation-based method of functional estimation in Jiao \emph{et al.} (2015). Comment: to appear on IEEE Transactions on Information Theor
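    For orientation, the MLE baseline mentioned in the abstract can be sketched as a plug-in estimator: replace $P$ by the empirical distribution of the samples and evaluate $\|\hat{P}-Q\|_1$ against the known $Q$. The sketch below is only that baseline, not the minimax rate-optimal estimator constructed in the paper; the function name `plugin_l1_distance` and the encoding of symbols as integers `0..S-1` are assumptions made for illustration.

```python
from collections import Counter


def plugin_l1_distance(samples, q):
    """Plug-in (MLE) estimate of ||P - Q||_1 for a known distribution q.

    Assumes (for this sketch) that symbols are integers in range(len(q))
    and that `samples` is a non-empty sequence of such symbols.
    """
    n = len(samples)
    counts = Counter(samples)
    # Empirical frequency of symbol i is counts[i] / n; unseen symbols get 0.
    return sum(abs(counts.get(i, 0) / n - q_i) for i, q_i in enumerate(q))


if __name__ == "__main__":
    # Known Q: uniform on an alphabet of size 8.
    q = [1.0 / 8] * 8
    # Four samples that could well come from P = Q (true distance 0);
    # the plug-in estimate is nevertheless far from 0 because n < S.
    samples = [0, 1, 2, 3]
    print(plugin_l1_distance(samples, q))
```

    With only 4 samples on an alphabet of size 8, half of the symbols are necessarily unseen, so the plug-in estimate stays large even when the samples are consistent with $P = Q$. This is the large alphabet regime in which the abstract compares the MLE with rate-optimal estimators.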