10 research outputs found

    Computing Entropies With Nested Sampling

    The Shannon entropy, and related quantities such as mutual information, can be used to quantify uncertainty and relevance. However, in practice, it can be difficult to compute these quantities for arbitrary probability distributions, particularly if the probability mass functions or densities cannot be evaluated. This paper introduces a computational approach, based on Nested Sampling, to evaluate entropies of probability distributions that can only be sampled. I demonstrate the method on three examples: (i) a simple Gaussian example where the key quantities are available analytically; (ii) an experimental design example about scheduling observations in order to measure the period of an oscillating signal; and (iii) predicting the future from the past in a heavy-tailed scenario.
    Comment: Accepted for publication in Entropy. 21 pages, 3 figures. Software available at https://github.com/eggplantbren/InfoNes

    Rare Event Simulation and Splitting for Discontinuous Random Variables

    Multilevel Splitting methods, also called Sequential Monte Carlo or Subset Simulation, are widely used for estimating extreme probabilities of the form $P[S(\mathbf{U}) > q]$, where $S$ is a deterministic real-valued function and $\mathbf{U}$ can be a random finite- or infinite-dimensional vector. Very often, $X := S(\mathbf{U})$ is assumed to be a continuous random variable, and many theoretical results on the statistical behaviour of the estimator have been derived under this hypothesis. However, as soon as some threshold effect appears in $S$ and/or $\mathbf{U}$ is discrete or mixed discrete/continuous, this assumption no longer holds and the estimator is not consistent. In this paper, we study the impact of discontinuities in the cdf of $X$ and present three unbiased corrected estimators to handle them. These estimators do not require knowing in advance whether $X$ is actually discontinuous, and they all coincide if $X$ is continuous. In particular, one of them has the same statistical properties in any case. Efficiency is shown on a 2-D diffusive process as well as on the Boolean SATisfiability problem (SAT).
    Comment: 16 pages (12 + Appendix 4 pages), 6 figures

    Unbiased and Consistent Nested Sampling via Sequential Monte Carlo

    We introduce a new class of sequential Monte Carlo methods called Nested Sampling via Sequential Monte Carlo (NS-SMC), which reframes the Nested Sampling method of Skilling (2006) in terms of sequential Monte Carlo techniques. This new framework allows convergence results to be obtained in the setting where Markov chain Monte Carlo (MCMC) is used to produce new samples. An additional benefit is that marginal likelihood estimates are unbiased. In contrast to NS, the analysis of NS-SMC does not require the (unrealistic) assumption that the simulated samples be independent. As the original NS algorithm is a special case of NS-SMC, this provides insights as to why NS seems to produce accurate estimates despite a typical violation of its assumptions. For applications of NS-SMC, we give advice on tuning MCMC kernels in an automated manner via a preliminary pilot run, and present a new method for appropriately choosing the number of MCMC repeats at each iteration. Finally, a numerical study is conducted where the performance of NS-SMC and temperature-annealed SMC is compared on several challenging and realistic problems. MATLAB code for our experiments is made available at https://github.com/LeahPrice/SMC-NS.
    Comment: 45 pages, some minor typographical errors fixed since last version

    A randomized Multi-index sequential Monte Carlo method

    We consider the problem of estimating expectations with respect to a target distribution with an unknown normalizing constant, and where even the unnormalized target needs to be approximated at finite resolution. Under such an assumption, this work builds upon a recently introduced multi-index Sequential Monte Carlo (SMC) ratio estimator, which provably enjoys the complexity improvements of multi-index Monte Carlo (MIMC) and the efficiency of SMC for inference. The present work leverages a randomization strategy to remove bias entirely, which simplifies estimation substantially, particularly in the MIMC context, where the choice of index set is otherwise important. Under reasonable assumptions, the proposed method provably achieves the same canonical complexity of MSE^(-1) as the original method, but without discretization bias. It is illustrated on examples of Bayesian inverse problems.
    Comment: 26 pages, 6 figures

    Nested Sampling for Uncertainty Quantification and Rare Event Estimation

    Nested Sampling is a method for computing the Bayesian evidence, also called the marginal likelihood, which is the integral of the likelihood with respect to the prior. More generally, it is a numerical probabilistic quadrature rule. The main idea of Nested Sampling is to replace a high-dimensional likelihood integral over parameter space with an integral over the unit line by employing a push-forward with respect to a suitable transformation. Practically, a set of active samples ascends the level sets of the integrand function, with the measure contraction of the super-level sets being statistically estimated. We justify the validity of this approach for integrands with non-negligible plateaus, and demonstrate Nested Sampling's practical effectiveness in estimating the (log-)probability of rare events.
    Comment: 24 pages
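    The idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy problem (a uniform prior on [-5, 5] with a standard normal likelihood), the function names, and the default settings are all assumptions chosen for illustration. The enclosed prior mass after i iterations is approximated by its expected value exp(-i / n_live), a common simplification, and the constrained prior is sampled exactly, which is only possible here because the likelihood is symmetric and unimodal in one dimension. For this toy problem the evidence is (1/10) * (Phi(5) - Phi(-5)), which is approximately 0.1.

```python
import math
import random


def log_likelihood(theta):
    """Log-density of a standard normal evaluated at theta (toy likelihood)."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def nested_sampling(n_live=500, n_iter=5000, half_width=5.0, seed=0):
    """Estimate the evidence Z for a Uniform(-half_width, half_width) prior.

    Hypothetical illustration: all names and defaults are assumptions.
    """
    rng = random.Random(seed)
    # Draw the initial live (active) points from the prior.
    live = [rng.uniform(-half_width, half_width) for _ in range(n_live)]
    live_logl = [log_likelihood(t) for t in live]

    evidence = 0.0
    x_prev = 1.0  # enclosed prior mass before any contraction
    for i in range(1, n_iter + 1):
        # The live point with the lowest likelihood defines the current level set.
        worst = min(range(n_live), key=live_logl.__getitem__)
        # Approximate the contracted prior mass by its expected value.
        x_curr = math.exp(-i / n_live)
        # Quadrature on the unit line: likelihood times the shell of prior mass.
        evidence += math.exp(live_logl[worst]) * (x_prev - x_curr)
        x_prev = x_curr
        # Replace the worst point by a prior draw with higher likelihood.
        # Here L(theta) > L(theta_worst) exactly when |theta| < |theta_worst|.
        radius = abs(live[worst])
        live[worst] = rng.uniform(-radius, radius)
        live_logl[worst] = log_likelihood(live[worst])

    # Add the contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return evidence


if __name__ == "__main__":
    print("estimated evidence:", nested_sampling())
```

    In realistic problems the exact constrained draw is not available and is typically replaced by rejection sampling or MCMC within the likelihood constraint, which is where most of the computational cost arises.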

    Nested Sampling Methods

    Nested sampling (NS) computes parameter posterior distributions and makes Bayesian model comparison computationally feasible. Its strengths are the unsupervised navigation of complex, potentially multi-modal posteriors until a well-defined termination point. A systematic literature review of nested sampling algorithms and variants is presented. We focus on complete algorithms, including solutions to likelihood-restricted prior sampling, parallelisation, termination and diagnostics. The relation between number of live points, dimensionality and computational cost is studied for two complete algorithms. A new formulation of NS is presented, which casts the parameter space exploration as a search on a tree. Previously published ways of obtaining robust error estimates and dynamic variations of the number of live points are presented as special cases of this formulation. A new on-line diagnostic test is presented based on previous insertion rank order work. The survey of nested sampling methods concludes with outlooks for future research.
    Comment: Updated version incorporating constructive input from four(!) positive reports (two referees, assistant editor and editor). The open-source UltraNest package and astrostatistics tutorials can be found at https://johannesbuchner.github.io/UltraNest

    Advances in Monte Carlo methodology
