Information Gains from Cosmological Probes
In light of the growing number of cosmological observations, it is important
to develop versatile tools to quantify the constraining power and consistency
of cosmological probes. Originally motivated from information theory, we use
the relative entropy to compute the information gained by Bayesian updates in
units of bits. This measure quantifies both the improvement in precision and
the 'surprise', i.e. the tension arising from shifts in central values. Our
starting point is a WMAP9 prior which we update with observations of the
distance ladder, supernovae (SNe), baryon acoustic oscillations (BAO), and weak
lensing, as well as the 2015 Planck release. We consider the parameters of the
flat ΛCDM concordance model and some of its extensions, which include
curvature (Ω_k) and the Dark Energy equation of state parameter w. We find that,
relative to WMAP9 and within these model spaces, the probes that have provided
the greatest gains are Planck (10 bits), followed by BAO surveys (5.1 bits) and
SNe experiments (3.1 bits). The other cosmological probes, including weak
lensing (1.7 bits) and H0 measures (1.7 bits), have contributed
information but at a lower level. Furthermore, we do not find any significant
surprise when updating the constraints of WMAP9 with any of the other
experiments, meaning that they are consistent with WMAP9. However, when we
choose Planck15 as the prior, we find that, accounting for the full
multi-dimensionality of the parameter space, the weak lensing measurements of
CFHTLenS produce a large surprise of 4.4 bits which is statistically
significant at the 8σ level. We discuss how the relative entropy
provides a versatile and robust framework to compare cosmological probes in the
context of current and future surveys.
Comment: 26 pages, 5 figures
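The relative entropy used above has a closed form when prior and posterior are approximated as Gaussians, so the information gain in bits can be sketched directly. The function below is illustrative of that general formula only; the function name and the Gaussian approximation are assumptions, not taken from the paper:

```python
import numpy as np

def gaussian_relative_entropy_bits(mu0, cov0, mu1, cov1):
    """KL divergence D(p1 || p0) between two multivariate Gaussians,
    converted from nats to bits.

    mu0/cov0: prior mean and covariance; mu1/cov1: updated posterior.
    """
    mu0, mu1 = np.asarray(mu0, float), np.asarray(mu1, float)
    cov0, cov1 = np.atleast_2d(cov0), np.atleast_2d(cov1)
    d = mu0.size
    cov0_inv = np.linalg.inv(cov0)
    diff = mu1 - mu0
    nats = 0.5 * (np.trace(cov0_inv @ cov1)      # precision gain term
                  + diff @ cov0_inv @ diff       # "surprise" from mean shift
                  - d
                  + np.log(np.linalg.det(cov0) / np.linalg.det(cov1)))
    return nats / np.log(2.0)  # nats -> bits

# Halving a 1-D posterior's sigma with no shift in the mean gains ~0.46 bits
gain = gaussian_relative_entropy_bits([0.0], [[1.0]], [0.0], [[0.25]])
```

Note how the expression separates into a precision-improvement part and a mean-shift ("surprise") part, mirroring the decomposition described in the abstract.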
Asymptotically Unbiased Estimator of the Informational Energy with kNN
Motivated by machine learning applications (e.g., classification, function approximation, feature extraction), in previous work we have introduced a nonparametric estimator of Onicescu's informational energy. Our method was based on the k-th nearest neighbor distances between the n sample points, where k is a fixed positive integer. In the present contribution, we discuss mathematical properties of this estimator. We show that our estimator is asymptotically unbiased and consistent. We provide further experimental results which illustrate the convergence of the estimator for standard distributions.
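Onicescu's informational energy is E(f) = ∫ f(x)² dx, and k-th nearest-neighbor estimators of such functionals typically follow a Leonenko-style construction: a local density estimate from the k-th neighbor distance, averaged over the sample. The sketch below implements that generic construction; it is an assumption that the paper's estimator has exactly this form (the constants may differ):

```python
import numpy as np
from math import gamma, pi
from scipy.spatial import cKDTree

def informational_energy_knn(x, k=5):
    """k-NN estimate of Onicescu's informational energy E(f) = int f(x)^2 dx.

    Generic Leonenko-style construction from k-th nearest-neighbour
    distances; requires k >= 2. A sketch, not necessarily the paper's
    exact estimator.
    """
    x = np.asarray(x, float)
    if x.ndim == 1:
        x = x[:, None]                        # treat 1-D input as (n, 1)
    n, d = x.shape
    v_d = pi ** (d / 2) / gamma(d / 2 + 1)    # volume of the unit d-ball
    # k+1 neighbours because each point is its own neighbour at distance 0
    dist, _ = cKDTree(x).query(x, k=k + 1)
    rho = dist[:, -1]                         # distance to k-th true neighbour
    # (k-1) rather than k in the numerator removes the leading bias
    return np.mean((k - 1) / ((n - 1) * v_d * rho ** d))
```

For Uniform(0, 1) the target value is ∫ f² = 1, which the estimate approaches as n grows, illustrating the consistency claim.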
On accuracy of PDF divergence estimators and their applicability to representative data sampling
Generalisation error estimation is an important issue in machine learning. Cross-validation, traditionally used for this purpose, requires building multiple models and repeating the whole procedure many times in order to produce reliable error estimates. It is however possible to accurately estimate the error using only a single model, if the training and test data are chosen appropriately. This paper investigates the possibility of using various probability density function divergence measures for the purpose of representative data sampling. As it turns out, the first difficulty one needs to deal with is estimation of the divergence itself. In contrast to other publications on this subject, the experimental results provided in this study show that in many cases this is not possible unless samples consisting of thousands of instances are used. Exhaustive experiments on divergence-guided representative data sampling have been performed using 26 publicly available benchmark datasets and 70 PDF divergence estimators, and their results have been analysed and discussed.
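The representative-sampling idea can be illustrated by scoring candidate train/test splits with an estimated divergence and keeping the least divergent one. The sketch below uses a simple histogram-based Jensen–Shannon divergence as a stand-in estimator; the function names and the choice of divergence are illustrative assumptions, not the paper's procedure:

```python
import numpy as np

def histogram_js_divergence(a, b, bins=20):
    """Jensen-Shannon divergence (nats) between two 1-D samples,
    estimated from shared-range histograms."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)                       # mixture; positive wherever p or q is
    def kl(u, v):
        mask = u > 0
        return np.sum(u[mask] * np.log(u[mask] / v[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def most_representative_split(data, test_frac=0.3, n_trials=50, seed=0):
    """Among random splits, return (divergence, train_idx, test_idx) for
    the split whose train/test divergence is smallest."""
    rng = np.random.default_rng(seed)
    n = len(data)
    n_test = int(n * test_frac)
    best = None
    for _ in range(n_trials):
        idx = rng.permutation(n)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        d = histogram_js_divergence(data[train_idx], data[test_idx])
        if best is None or d < best[0]:
            best = (d, train_idx, test_idx)
    return best
```

This captures the intent (a test set that mirrors the training distribution) while sidestepping the estimation difficulty the abstract highlights, which only becomes acute for the more sophisticated divergence estimators on small samples.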
A Nearest-Neighbor Approach to Estimating Divergence between Continuous Random Vectors
A method for divergence estimation between multidimensional distributions based on nearest neighbor distances is proposed. Given i.i.d. samples, both the bias and the variance of this estimator are proven to vanish as sample sizes go to infinity. In experiments on high-dimensional data, the nearest neighbor approach generally exhibits faster convergence than previous algorithms based on partitioning.
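A common instance of this nearest-neighbor approach is the 1-NN Kullback–Leibler divergence estimator, which compares within-sample and cross-sample nearest-neighbor distances. The following is a minimal sketch of that standard construction (Wang–Kulkarni–Verdú form); it illustrates the general technique, not necessarily the exact algorithm of the paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def kl_divergence_nn(x, y):
    """1-nearest-neighbour estimate of D(P || Q) in nats, from samples
    x ~ P and y ~ Q. Sketch of the standard construction; assumes
    continuous distributions (no duplicate points).
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    n, d = x.shape
    m = y.shape[0]
    # rho: NN distance within x (k=2 because each point is its own NN)
    rho = cKDTree(x).query(x, k=2)[0][:, 1]
    # nu: distance from each x_i to its NN in y
    nu = cKDTree(y).query(x, k=1)[0]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1.0))
```

For two unit-variance Gaussians with means 0 and 1 the true divergence is 0.5 nats, which the estimate approaches as the sample sizes grow.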