Ensemble Estimation of Information Divergence
Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator is derived for general bounded density support sets, where knowledge of the support boundary, and therefore the boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We present extensive simulation results that verify the theoretical results of our paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate of a cell classification problem.
Distributionally Robust Performance Analysis with Applications to Mine Valuation and Risk
We consider several problems motivated by issues faced in the mining industry. In recent years, it has become clear that mines have substantial tail risk in the form of environmental disasters, and this tail risk is not incorporated into common pricing and risk models. However, data sets of the extremal climate behavior that drives this risk are very small and generally inadequate for properly estimating the tail behavior. We propose a data-driven methodology that generates reasonable worst-case scenarios, given the data size constraints, and we incorporate it into a real-options-based model for the valuation of mines. We propose several iterations of the model, so that the end user can choose the degree to which they wish to specify the financial consequences of the disaster scenario. Next, in order to perform a risk analysis on a portfolio of mines, we propose a method of estimating the correlation structure of high-dimensional max-stable processes. Using the techniques of (Liu et al., 2017) to map the relationship between normal correlations and max-stable correlations, we can then use techniques inspired by (Bickel et al., 2008; Liu et al., 2014; Rothman et al., 2009) to estimate the underlying correlation matrix while preserving a sparse, positive-definite structure. The correlation matrices are then used in the calculation of model-robust risk metrics (VaR, CVaR) using the Sample-Out-of-Sample methodology (Blanchet and Kang, 2017). We conclude with several new techniques developed in the field of robust performance analysis that, while not directly applied to mining, were motivated by our study of distributionally robust optimization as a way to address these problems.
- …