
    On accuracy of PDF divergence estimators and their applicability to representative data sampling

    Generalisation error estimation is an important issue in machine learning. Cross-validation, traditionally used for this purpose, requires building multiple models and repeating the whole procedure many times in order to produce reliable error estimates. It is, however, possible to estimate the error accurately using only a single model if the training and test data are chosen appropriately. This paper investigates the possibility of using various probability density function (PDF) divergence measures for representative data sampling. As it turns out, the first difficulty one needs to deal with is estimation of the divergence itself. In contrast to other publications on this subject, the experimental results provided in this study show that in many cases reliable estimation is not possible unless samples consisting of thousands of instances are used. Exhaustive experiments on divergence-guided representative data sampling have been performed using 26 publicly available benchmark datasets and 70 PDF divergence estimators, and their results are analysed and discussed.
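
    The paper benchmarks 70 divergence estimators, which are not listed in the abstract; purely for orientation, one widely used nonparametric family is the k-nearest-neighbour estimator of the Kullback-Leibler divergence between two samples (in the style of Wang, Kulkarni and Verdú). A minimal Python sketch, assuming NumPy/SciPy and an illustrative function name:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(x, y, k=1):
    """k-NN estimate of KL(P || Q) from samples x ~ P (n, d) and y ~ Q (m, d).

    Compares, for each x_i, the distance to its k-th neighbour within x
    (excluding x_i itself) with the distance to its k-th neighbour within y.
    Duplicate points (zero distances) need separate handling in practice.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, d = x.shape
    m = y.shape[0]
    r = cKDTree(x).query(x, k=k + 1)[0][:, -1]   # k-th neighbour in x, self excluded
    s = cKDTree(y).query(x, k=k)[0]              # k-th neighbour in y
    s = s[:, -1] if s.ndim > 1 else s
    return d * np.mean(np.log(s / r)) + np.log(m / (n - 1.0))
```

    Estimators of this kind are known to be biased for small samples in moderate dimension, which is consistent with the paper's observation that thousands of instances may be needed before the estimates become usable.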

    A note on Onicescu's informational energy and correlation coefficient in exponential families

    The informational energy of Onicescu is a positive quantity that, like Shannon's entropy, measures the amount of uncertainty of a random variable. In this note, we report closed-form formulas for Onicescu's informational energy and correlation coefficient when the densities belong to an exponential family. As a byproduct, we also report a closed-form formula for the Cauchy-Schwarz divergence between densities of an exponential family. (Comment: 13 pages)
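
    For a natural exponential family $p_\theta(x)=\exp(\theta^\top t(x)-F(\theta))$ with log-normalizer $F$ and no auxiliary carrier term, the informational energy $I(p_\theta)=\int p_\theta(x)^2\,\mathrm{d}x$ admits a one-line closed form; the derivation below is a sketch of this standard special case (it assumes $2\theta$ stays in the natural parameter space) and is not quoted from the note itself.

```latex
% Informational energy of a natural exponential family (no carrier term),
% assuming 2\theta remains a valid natural parameter:
\begin{aligned}
I(p_\theta) &= \int p_\theta(x)^2 \,\mathrm{d}x
             = \int \exp\!\big(2\,\theta^\top t(x) - 2F(\theta)\big)\,\mathrm{d}x \\
            &= e^{F(2\theta)-2F(\theta)}
               \underbrace{\int \exp\!\big((2\theta)^\top t(x) - F(2\theta)\big)\,\mathrm{d}x}_{=\,1}
             = e^{F(2\theta)-2F(\theta)}.
\end{aligned}
```

    For a univariate Gaussian $\mathcal{N}(\mu,\sigma^2)$ this evaluates to $1/(2\sigma\sqrt{\pi})$, independent of the mean, which can also be checked by direct integration.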

    Convergence of Smoothed Empirical Measures with Applications to Entropy Estimation

    This paper studies convergence of empirical measures smoothed by a Gaussian kernel. Specifically, consider approximating $P\ast\mathcal{N}_\sigma$, for $\mathcal{N}_\sigma\triangleq\mathcal{N}(0,\sigma^2 \mathrm{I}_d)$, by $\hat{P}_n\ast\mathcal{N}_\sigma$, where $\hat{P}_n$ is the empirical measure, under different statistical distances. The convergence is examined in terms of the Wasserstein distance, total variation (TV), Kullback-Leibler (KL) divergence, and $\chi^2$-divergence. We show that the approximation error under the TV distance and 1-Wasserstein distance ($\mathsf{W}_1$) converges at rate $e^{O(d)}n^{-\frac{1}{2}}$, in remarkable contrast to a typical $n^{-\frac{1}{d}}$ rate for unsmoothed $\mathsf{W}_1$ (and $d\ge 3$). For the KL divergence, squared 2-Wasserstein distance ($\mathsf{W}_2^2$), and $\chi^2$-divergence, the convergence rate is $e^{O(d)}n^{-1}$, but only if $P$ achieves finite input-output $\chi^2$ mutual information across the additive white Gaussian noise channel. If the latter condition is not met, the rate changes to $\omega(n^{-1})$ for the KL divergence and $\mathsf{W}_2^2$, while the $\chi^2$-divergence becomes infinite - a curious dichotomy. As a main application we consider estimating the differential entropy $h(P\ast\mathcal{N}_\sigma)$ in the high-dimensional regime. The distribution $P$ is unknown, but $n$ i.i.d. samples from it are available. We first show that any good estimator of $h(P\ast\mathcal{N}_\sigma)$ must have sample complexity that is exponential in $d$. Using the empirical approximation results, we then show that the absolute-error risk of the plug-in estimator converges at the parametric rate $e^{O(d)}n^{-\frac{1}{2}}$, thus establishing the minimax rate-optimality of the plug-in. Numerical results that demonstrate a significant empirical superiority of the plug-in approach over general-purpose differential entropy estimators are provided. (Comment: arXiv admin note: substantial text overlap with arXiv:1810.1158)
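
    The plug-in estimator analysed above is the differential entropy of the mixture $\hat{P}_n\ast\mathcal{N}_\sigma$, i.e. of an $n$-component isotropic Gaussian mixture centred at the samples. That entropy has no closed form, so one practical route is Monte Carlo evaluation; a minimal Python sketch, with an illustrative function name and defaults (not the authors' code):

```python
import numpy as np
from scipy.special import logsumexp

def plugin_smoothed_entropy(samples, sigma, n_mc=5000, rng=None):
    """Plug-in estimate of h(P * N_sigma): the differential entropy of the
    Gaussian mixture P_hat_n * N(0, sigma^2 I), evaluated by Monte Carlo.
    For large sample sizes the (n_mc, n) distance matrix should be chunked."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(samples, float)
    n, d = x.shape
    # Draw from the mixture: pick a sample centre, add isotropic Gaussian noise.
    centres = x[rng.integers(0, n, size=n_mc)]
    z = centres + sigma * rng.standard_normal((n_mc, d))
    # Log-density of the smoothed empirical measure at the Monte Carlo points.
    sq = ((z[:, None, :] - x[None, :, :]) ** 2).sum(-1)          # (n_mc, n)
    log_comp = -sq / (2 * sigma**2) - 0.5 * d * np.log(2 * np.pi * sigma**2)
    log_mix = logsumexp(log_comp, axis=1) - np.log(n)
    return -log_mix.mean()   # h ~= -E[log q(Z)], Z ~ q = P_hat_n * N_sigma
```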

    Multi-modal filtering for non-linear estimation

    Multi-modal densities appear frequently in time series and practical applications. However, they are not well represented by common state estimators, such as the Extended Kalman Filter and the Unscented Kalman Filter, which additionally suffer from the fact that uncertainty is often not captured sufficiently well. This can result in incoherent and divergent tracking performance. In this paper, we address these issues by devising a non-linear filtering algorithm where densities are represented by Gaussian mixture models, whose parameters are estimated in closed form. The resulting method exhibits a superior performance on nonlinear benchmarks. © 2014 IEEE
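
    The paper's own closed-form mixture updates are not reproduced in the abstract; for orientation only, the sketch below shows a generic Gaussian-sum measurement update (in the spirit of Alspach and Sorenson's Gaussian sum filter, not the authors' algorithm) under an assumed linear-Gaussian measurement model. It illustrates how a multi-modal posterior is carried as a weighted mixture rather than collapsed to a single Gaussian:

```python
import numpy as np

def gaussian_sum_update(weights, means, covs, y, H, R):
    """One measurement update of a Gaussian-sum filter.

    Each component (w_i, m_i, P_i) gets a standard Kalman update under
    y = H x + v, v ~ N(0, R); its weight is rescaled by the predictive
    likelihood N(y; H m_i, H P_i H^T + R) and the weights are renormalised.
    """
    new_w, new_m, new_P = [], [], []
    for w, m, P in zip(weights, means, covs):
        S = H @ P @ H.T + R                       # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
        r = y - H @ m                             # innovation
        new_m.append(m + K @ r)
        new_P.append(P - K @ H @ P)
        lik = np.exp(-0.5 * r @ np.linalg.solve(S, r)) / np.sqrt(
            (2 * np.pi) ** len(y) * np.linalg.det(S))
        new_w.append(w * lik)
    new_w = np.asarray(new_w)
    return new_w / new_w.sum(), new_m, new_P
```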

    $k$-MLE: A fast algorithm for learning statistical mixture models

    We describe $k$-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families, such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique, which monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering $k$-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of $k$-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of $k$-MLE can be implemented using any $k$-means heuristic, such as the celebrated Lloyd's batched or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of points in its cluster, and to reiterate the mixture parameter and mixture weight updates until convergence. Hard EM is interpreted as a special case of $k$-MLE in which the component update and the weight update are performed successively in the inner loop. To initialize $k$-MLE, we propose $k$-MLE++, a careful initialization of $k$-MLE that probabilistically guarantees a global bound on the best possible complete likelihood. (Comment: 31 pages; extends a preliminary paper presented at IEEE ICASSP 201)
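
    The abstract fully specifies the outer loop: hard-assign each point to its most likely weighted component, refit each component by its in-cluster MLE, then reset the weights to the cluster proportions. The sketch below implements just that loop for Gaussian components; it omits the Bregman-duality analysis and the $k$-MLE++ initialization, and the function name is illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def k_mle_gaussian(x, weights, means, covs, n_iter=50):
    """Minimal k-MLE-style hard-assignment loop for a Gaussian mixture.

    x: (n, d) data; weights: length-k array; means/covs: lists of k arrays.
    """
    x = np.asarray(x, float)
    n, d = x.shape
    k = len(weights)
    for _ in range(n_iter):
        # (1) Hard assignment to the most likely *weighted* component.
        log_p = np.column_stack([
            np.log(weights[j] + 1e-300)
            + multivariate_normal.logpdf(x, means[j], covs[j])
            for j in range(k)])
        labels = log_p.argmax(axis=1)
        # (2) Per-cluster maximum likelihood estimates of the parameters.
        for j in range(k):
            pts = x[labels == j]
            if len(pts) > d:                       # skip empty/degenerate clusters
                means[j] = pts.mean(axis=0)
                covs[j] = np.cov(pts, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        # (3) Weight update: relative proportion of points per cluster.
        weights = np.bincount(labels, minlength=k) / n
    return weights, means, covs
```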

    Estimation and control of multi-object systems with high-fidelity sensor models: A labelled random finite set approach

    Principled and novel multi-object tracking algorithms are proposed that can optimally process realistic sensor data by accommodating complex observational phenomena such as merged measurements and extended targets. Additionally, a sensor control scheme based on a tractable, information-theoretic objective is proposed, the goal of which is to optimise tracking performance in multi-object scenarios. The concept of labelled random finite sets is adopted in the development of these new techniques.