Minimax rates of entropy estimation on large alphabets via best polynomial approximation
Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $\left(\frac{k}{n \log k}\right)^2 + \frac{\log^2 k}{n}$ if $n$ exceeds a constant factor of $k/\log k$; otherwise there exists no consistent estimator. This
refines the recent result of Valiant--Valiant \cite{VV11} that the minimal
sample size for consistent entropy estimation scales according to
$\Theta(k/\log k)$. The apparatus of best polynomial approximation
plays a key role in both the construction of optimal estimators and, via a
duality argument, the minimax lower bound.
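For intuition, here is a minimal Python sketch of the naive plug-in (MLE) entropy estimator that the polynomial-approximation estimators improve upon; the setup (uniform distribution, alphabet much larger than the sample size) illustrates the regime where the plug-in estimate is badly biased. All names are illustrative, not from the paper.

```python
import numpy as np

def entropy_mle(samples, k):
    """Plug-in (MLE) entropy estimate in nats from n i.i.d. samples
    over an alphabet of size k."""
    n = len(samples)
    counts = np.bincount(samples, minlength=k)
    p_hat = counts / n
    nz = p_hat > 0
    return float(-np.sum(p_hat[nz] * np.log(p_hat[nz])))

rng = np.random.default_rng(0)
k, n = 1000, 200            # alphabet much larger than the sample size
p = np.full(k, 1.0 / k)     # uniform distribution; true entropy is log(k)
samples = rng.choice(k, size=n, p=p)
h_hat = entropy_mle(samples, k)
print(h_hat, np.log(k))     # plug-in estimate is biased downward when n << k
```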
Nonparametric density estimation by histogram trend filtering
We propose a novel approach for density estimation called histogram trend
filtering. Our estimator arises from a surrogate Poisson model for the
counts of observations in a partition of the support of the data. We begin by
showing consistency of a variational estimator for this density estimation
problem. We then study a discrete estimator that can be found efficiently via
convex optimization. We show that the estimator enjoys strong statistical
guarantees, yet is much more practical and computationally efficient than other
estimators that enjoy similar guarantees. Finally, in our simulation study the
proposed method showed smaller average mean squared error than competing
methods. This favorable blend of properties makes histogram trend filtering an
ideal candidate for use in routine data-analysis applications that call for a
quick, efficient, accurate density estimate.
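As a rough sketch of the kind of convex objective involved, the following Python function evaluates a penalized Poisson negative log-likelihood over bin counts with an l1 trend-filtering penalty on second differences (favoring piecewise-linear log-densities). The exact objective, penalty order, and names are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def htf_objective(theta, counts, lam):
    """Histogram-trend-filtering-style objective (illustrative sketch).
    theta[j] is the log-intensity in bin j; the penalty is the l1 norm
    of second differences of theta (order-1 trend filtering), which
    favors piecewise-linear log-densities."""
    nll = np.sum(np.exp(theta) - counts * theta)   # Poisson NLL up to constants
    penalty = np.sum(np.abs(np.diff(theta, n=2)))  # l1 of second differences
    return float(nll + lam * penalty)

data = np.random.default_rng(1).normal(size=500)
counts, edges = np.histogram(data, bins=30)
theta0 = np.log(counts + 0.5)                      # crude initial log-intensities
print(htf_objective(theta0, counts, lam=1.0))
```

A solver would minimize this objective over theta subject to a normalization constraint; any generic convex-optimization routine applies.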
Minimax Estimation of Functionals of Discrete Distributions
We propose a general methodology for the construction and analysis of minimax
estimators for a wide class of functionals of finite-dimensional parameters,
and elaborate on the case of discrete distributions, where the alphabet size $S$
is unknown and may be comparable with the number of observations $n$. We
treat the respective regions where the functional is "nonsmooth" and "smooth"
separately. In the "nonsmooth" regime, we apply an unbiased estimator for the
best polynomial approximation of the functional, whereas in the "smooth"
regime, we apply a bias-corrected Maximum Likelihood Estimator (MLE). We
illustrate the merit of this approach by thoroughly analyzing two important
cases: the entropy $H(P) = -\sum_{i=1}^S p_i \ln p_i$ and $F_\alpha(P) = \sum_{i=1}^S p_i^\alpha$, $\alpha > 0$. We obtain the minimax rates for
estimating these functionals. In particular, we demonstrate that our estimator
achieves the optimal sample complexity $n \asymp S/\ln S$ for entropy
estimation. We also show that the sample complexity for estimating
$F_\alpha(P)$, $0 < \alpha < 1$, is $S^{1/\alpha}/\ln S$, which can be
achieved by our estimator but not the MLE. For $1 < \alpha < 3/2$, we show the
minimax rate for estimating $F_\alpha(P)$ is $(n \ln n)^{-2(\alpha-1)}$,
regardless of the alphabet size, while the rate for the MLE is
$n^{-2(\alpha-1)}$. For all the above cases, the behavior of the minimax
rate-optimal estimators with $n$ samples is essentially that of the MLE with
$n \ln n$ samples. We highlight the practical advantages of our schemes for
entropy and mutual information estimation. We demonstrate that our approach
reduces running time and boosts accuracy compared to various existing
approaches. Moreover, we show that the mutual information estimator induced by
our methodology leads to significant performance boosts over the Chow--Liu
algorithm in learning graphical models. Comment: To appear in IEEE Transactions on Information Theory.
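To make the "bias-corrected MLE" idea concrete, here is the classical Miller--Madow correction to the plug-in entropy estimate in Python; it is a simple, well-known instance of bias correction in the smooth regime, not the paper's exact estimator.

```python
import numpy as np

def entropy_miller_madow(counts):
    """Miller--Madow bias-corrected plug-in entropy estimate (nats).
    Adds (k_observed - 1) / (2n) to the plug-in estimate, a classical
    first-order correction for the downward bias of the MLE."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts[counts > 0] / n
    h_mle = -np.sum(p * np.log(p))
    k_obs = np.count_nonzero(counts)
    return float(h_mle + (k_obs - 1) / (2 * n))

print(entropy_miller_madow([2, 2]))  # plug-in log(2) plus correction 1/8
```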
Methods for Estimation of Convex Sets
In the framework of shape-constrained estimation, we review methods and work
done in convex set estimation. These methods mostly build on stochastic and
convex geometry, empirical process theory, functional analysis, linear
programming, extreme value theory, etc. The statistical problems that we review
include density support estimation, estimation of the level sets of densities
or depth functions, nonparametric regression, etc. We focus on the estimation
of convex sets under the Nikodym and Hausdorff metrics, which require different
techniques and, quite surprisingly, lead to very different results, in
particular in density support estimation. Finally, we discuss computational
issues in high dimensions. Comment: 29 pages.
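As a concrete instance of convex support estimation, the convex hull of an i.i.d. sample is the classical estimator of a convex density support. The self-contained Python sketch below computes it in two dimensions with Andrew's monotone chain.

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull of 2-D points; returns hull
    vertices in counter-clockwise order. The hull of an i.i.d. sample
    is the classical estimator of a convex density support."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
print(convex_hull(square))  # interior point dropped
```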
Minimax Rate-Optimal Estimation of Divergences between Discrete Distributions
We study the minimax estimation of a family of divergences between discrete
distributions, which includes the Kullback--Leibler
divergence and the $\chi^2$-divergence as special examples. Dropping the usual
theoretical tricks to acquire independence, we construct the first minimax
rate-optimal estimator which does not require any Poissonization, sample
splitting, or explicit construction of approximating polynomials. The estimator
uses a hybrid approach which solves a problem-independent linear program based
on moment matching in the non-smooth regime, and applies a problem-dependent
bias-corrected plug-in estimator in the smooth regime, with a soft decision
boundary between these regimes. Comment: This version has been significantly revised.
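For contrast with the minimax-optimal construction, here is the naive smoothed plug-in estimator of the Kullback--Leibler divergence in Python; the smoothing constant and names are illustrative assumptions, not the paper's hybrid estimator.

```python
import numpy as np

def kl_plugin(x_counts, y_counts, smoothing=0.5):
    """Smoothed plug-in estimate of D(P||Q) in nats from empirical
    counts; additive smoothing keeps the estimated Q strictly positive.
    A naive baseline, not a minimax rate-optimal estimator."""
    p = np.asarray(x_counts, dtype=float)
    q = np.asarray(y_counts, dtype=float)
    p = (p + smoothing) / (p + smoothing).sum()
    q = (q + smoothing) / (q + smoothing).sum()
    return float(np.sum(p * np.log(p / q)))

print(kl_plugin([3, 1], [1, 3]))  # positive: the two samples differ
```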
Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
We present \emph{Local Moment Matching (LMM)}, a unified methodology for
symmetric functional estimation and distribution estimation under Wasserstein
distance. We construct an efficiently computable estimator that achieves the
minimax rates in estimating the distribution up to permutation, and show that
the plug-in approach of our unlabeled distribution estimator is "universal" in
estimating symmetric functionals of discrete distributions. Instead of doing
best polynomial approximation explicitly as in existing literature of
functional estimation, the plug-in approach conducts polynomial approximation
implicitly and attains the optimal sample complexity for the entropy, power sum
and support size functionals.
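In one dimension, the Wasserstein-1 distance between two empirical distributions with the same number of atoms reduces to the mean absolute difference of the sorted samples, as in this small Python sketch (an illustration of the metric itself, not of the LMM estimator):

```python
import numpy as np

def wasserstein_1d(x, y):
    """W1 distance between two empirical distributions with equally many
    atoms: sorting realizes the optimal coupling on the real line, so
    the distance is the mean absolute difference of sorted samples."""
    x, y = np.sort(x), np.sort(y)
    return float(np.mean(np.abs(x - y)))

print(wasserstein_1d([0, 1], [1, 2]))  # each atom moves by exactly 1
```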
Statistical Challenges with High Dimensionality: Feature Selection in Knowledge Discovery
Technological innovations have revolutionized the process of scientific
research and knowledge discovery. The availability of massive data and
challenges from frontiers of research and development have reshaped statistical
thinking, data analysis and theoretical studies. The challenges of
high-dimensionality arise in diverse fields of sciences and the humanities,
ranging from computational biology and health studies to financial engineering
and risk management. In all of these fields, variable selection and feature
extraction are crucial for knowledge discovery. We first give a comprehensive
overview of statistical challenges with high dimensionality in these diverse
disciplines. We then approach the problem of variable selection and feature
extraction using a unified framework: penalized likelihood methods. Issues
relevant to the choice of penalty functions are addressed. We demonstrate that
for a host of statistical problems, as long as the dimensionality is not
excessively large, we can estimate the model parameters as well as if the best
model is known in advance. The persistence property in risk minimization is
also addressed. The applicability of such a theory and method to diverse
statistical problems is demonstrated. Other related problems with
high-dimensionality are also discussed. Comment: 2 figures.
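As a minimal example of the penalized-likelihood machinery, the soft-thresholding operator below is the closed-form solution of the one-dimensional l1-penalized least-squares problem that underlies coordinate-descent lasso solvers; SCAD and other folded-concave penalties replace it with their own thresholding rules.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0):
    the minimizer of 0.5 * (b - z)**2 + lam * |b| over b. Shrinks small
    coefficients exactly to zero, which is what drives variable selection
    under an l1 penalty."""
    return float(np.sign(z) * np.maximum(np.abs(z) - lam, 0.0))

print(soft_threshold(3.0, 1.0), soft_threshold(-0.5, 1.0))
```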
Hypotheses tests in boundary regression models
Consider a nonparametric regression model with one-sided errors and
regression function in a general H\"older class. We estimate the regression
function via minimization of the local integral of a polynomial approximation.
We show uniform rates of convergence for the simple regression estimator as
well as for a smooth version. These rates carry over to mean regression models
with a symmetric and bounded error distribution. In such a setting, one obtains
faster rates for irregular error distributions concentrating sufficient mass
near the endpoints than for the usual regular distributions. The results are
applied to prove an asymptotic equivalence of a residual-based
(sequential) empirical distribution function to the (sequential) empirical
distribution function of unobserved errors in the case of irregular error
distributions. This result is remarkably different from corresponding results
in mean regression with regular errors. It can readily be applied to develop
goodness-of-fit tests for the error distribution. We present some examples and
investigate the small sample performance in a simulation study. We further
discuss asymptotically distribution-free hypothesis tests for independence of
the error distribution from the points of measurement and for monotonicity of
the boundary function as well.
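A crude illustration of boundary estimation with one-sided errors: at each grid point, take the maximum response within a local window. The paper minimizes a local integral of a polynomial approximation; this order-zero local-maximum version (all names assumed) only conveys the flavor.

```python
import numpy as np

def boundary_local_max(x, y, grid, bandwidth):
    """Naive frontier estimator for one-sided errors: at each grid
    point, the boundary estimate is the maximum response among
    observations within the bandwidth window (NaN if the window is
    empty). An order-zero stand-in for local polynomial fitting."""
    x, y = np.asarray(x), np.asarray(y)
    est = []
    for g in grid:
        mask = np.abs(x - g) <= bandwidth
        est.append(y[mask].max() if mask.any() else np.nan)
    return np.array(est, dtype=float)

print(boundary_local_max([0, 0.1, 0.9, 1.0], [1, 2, 3, 4], [0.0, 1.0], 0.2))
```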
Learning Multivariate Log-concave Distributions
We study the problem of estimating multivariate log-concave probability
density functions. We prove the first sample complexity upper bound for
learning log-concave densities on $\mathbb{R}^d$, for all $d \ge 1$. Prior to
our work, no upper bound on the sample complexity of this learning problem was
known for $d > 3$. In more detail, we give an estimator that, for any $d \ge 1$
and $\epsilon > 0$, draws i.i.d. samples from an unknown target log-concave density on $\mathbb{R}^d$,
and outputs a hypothesis that (with high probability) is $\epsilon$-close to
the target in total variation distance. Our upper bound on the sample
complexity comes close to the known lower bound for this problem. Comment: To appear in COLT 201
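A quick way to build intuition for the shape constraint: a positive sequence is log-concave iff $p_i^2 \ge p_{i-1} p_{i+1}$ for all interior $i$. The Python check below applies this to discrete densities; it is a one-dimensional illustration of the constraint only, not part of the paper's estimator.

```python
import numpy as np

def is_log_concave(p, tol=1e-12):
    """Check log-concavity of a positive sequence: p is log-concave iff
    p[i]**2 >= p[i-1] * p[i+1] for every interior index i. Working with
    products rather than logs avoids log(0) issues at the edges."""
    p = np.asarray(p, dtype=float)
    return bool(np.all(p[1:-1] ** 2 >= p[:-2] * p[2:] - tol))

print(is_log_concave([1, 3, 3, 1]))   # binomial-type weights: log-concave
print(is_log_concave([1, 0.1, 1]))    # a dip in the middle: not log-concave
```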
A Spectral Approach for the Design of Experiments: Design, Analysis and Algorithms
This paper proposes a new approach to construct high quality space-filling
sample designs. First, we propose a novel technique to quantify the
space-filling property and optimally trade-off uniformity and randomness in
sample designs in arbitrary dimensions. Second, we connect the proposed metric
(defined in the spatial domain) to the objective measure of the design
performance (defined in the spectral domain). This connection serves as an
analytic framework for evaluating the qualitative properties of space-filling
designs in general. Using the theoretical insights provided by this
spatial-spectral analysis, we derive the notion of optimal space-filling
designs, which we refer to as space-filling spectral designs. Third, we propose
an efficient estimator to evaluate the space-filling properties of sample
designs in arbitrary dimensions and use it to develop an optimization framework
to generate high quality space-filling designs. Finally, we carry out a
detailed performance comparison on two different applications in 2 to 6
dimensions: a) image reconstruction and b) surrogate modeling on several
benchmark optimization functions and an inertial confinement fusion (ICF)
simulation code. We demonstrate that the proposed spectral designs significantly
outperform existing approaches, especially in high dimensions.
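To make "quantifying the space-filling property" concrete, here is the classical maximin criterion (minimum pairwise distance) in Python, comparing a random design with a regular grid; the paper's own metric is spectral, so this criterion is only an illustrative stand-in.

```python
import numpy as np

def maximin_distance(design):
    """Minimum pairwise Euclidean distance of a design: the quantity
    maximized by maximin space-filling designs. Larger values mean the
    points are more evenly spread."""
    d = np.asarray(design, dtype=float)
    diffs = d[:, None, :] - d[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(-1))
    n = len(d)
    return float(dist[np.triu_indices(n, k=1)].min())

rng = np.random.default_rng(0)
random_design = rng.random((16, 2))
g = np.linspace(0.125, 0.875, 4)                      # 4 x 4 grid in [0, 1]^2
grid_design = np.array([(a, b) for a in g for b in g])
print(maximin_distance(random_design), maximin_distance(grid_design))
```

The grid attains a larger minimum separation than a typical random design of the same size, which is the uniformity/randomness trade-off the abstract alludes to.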