Dirichlet Bayesian Network Scores and the Maximum Relative Entropy Principle
A classic approach for learning Bayesian networks from data is to identify a
maximum a posteriori (MAP) network structure. In the case of discrete Bayesian
networks, MAP networks are selected by maximising one of several possible
Bayesian Dirichlet (BD) scores; the most famous is the Bayesian Dirichlet
equivalent uniform (BDeu) score from Heckerman et al. (1995). The key
properties of BDeu arise from its uniform prior over the parameters of each
local distribution in the network: it makes structure learning computationally
efficient, it requires no elicitation of prior knowledge from experts, and it
satisfies score equivalence.
In this paper we will review the derivation and the properties of BD scores,
and of BDeu in particular, and we will link them to the corresponding entropy
estimates to study them from an information theoretic perspective. To this end,
we will work in the context of the foundational work of Giffin and Caticha
(2007), who showed that Bayesian inference can be framed as a particular case
of the maximum relative entropy principle. We will use this connection to show
that BDeu should not be used for structure learning from sparse data, since it
violates the maximum relative entropy principle; and that it is also
problematic from a more classic Bayesian model selection perspective, because
it produces Bayes factors that are sensitive to the value of its only
hyperparameter. Using a large simulation study, we found in our previous work
(Scutari, 2016) that the Bayesian Dirichlet sparse (BDs) score seems to provide
better accuracy in structure learning; in this paper we further show that BDs
does not suffer from the issues above, and we recommend using it instead of
BDeu for sparse data. Finally, we will show that these issues are in fact
different aspects of the same problem and a consequence of the distributional
assumptions of the prior.
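To make the score concrete, here is a minimal sketch (illustrative code, not
from the paper) of the BDeu local marginal likelihood in its standard closed
form, where a node with r states and q parent configurations receives a
uniform prior mass of alpha/(r*q) per cell:

    import numpy as np
    from scipy.special import gammaln

    def bdeu_local_score(counts, alpha=1.0):
        """Log BDeu marginal likelihood of one node given its parents.

        counts: (q, r) array of counts n_ijk, one row per parent
                configuration j, one column per node state k.
        alpha:  the imaginary sample size, BDeu's only hyperparameter.
        """
        q, r = counts.shape
        a_j = alpha / q                # prior mass per parent configuration
        a_jk = alpha / (q * r)         # uniform prior mass per cell (BDeu)
        n_j = counts.sum(axis=1)       # n_ij = sum_k n_ijk
        score = np.sum(gammaln(a_j) - gammaln(a_j + n_j))
        score += np.sum(gammaln(a_jk + counts) - gammaln(a_jk))
        return score

The BDs score mentioned above differs only in the prior: it spreads alpha over
the parent configurations actually observed in the data rather than over all q
of them, which is what defuses the sparse-data behaviour criticised here.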
Combinatorial Information Theory: I. Philosophical Basis of Cross-Entropy and Entropy
This study critically analyses the information-theoretic, axiomatic and
combinatorial philosophical bases of the entropy and cross-entropy concepts.
The combinatorial basis is shown to be the most fundamental (most primitive) of
these three bases, since it gives (i) a derivation for the Kullback-Leibler
cross-entropy and Shannon entropy functions, as simplified forms of the
multinomial distribution subject to the Stirling approximation; (ii) an
explanation for the need to maximize entropy (or minimize cross-entropy) to
find the most probable realization; and (iii) new, generalized definitions of
entropy and cross-entropy - supersets of the Boltzmann principle - applicable
to non-multinomial systems. The combinatorial basis is therefore of much
broader scope, with far greater power of application, than the
information-theoretic and axiomatic bases. The generalized definitions underpin
a new discipline of "combinatorial information theory", for the
analysis of probabilistic systems of any type.
Jaynes' generic formulation of statistical mechanics for multinomial systems
is re-examined in light of the combinatorial approach. (abbreviated abstract)
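The derivation at the heart of the combinatorial basis fits in a few lines.
For a multinomial system with N trials, source probabilities q_i and observed
frequencies p_i = n_i/N, the log-probability of a realization is, in LaTeX
notation,

    \frac{1}{N}\ln \mathbb{P}
      = \frac{1}{N}\ln\Bigl( N!\,\prod_{i=1}^{s} \frac{q_i^{n_i}}{n_i!} \Bigr)
      \;\approx\; -\sum_{i=1}^{s} p_i \ln\frac{p_i}{q_i}
      \;=\; -D(\mathbf{p}\,\|\,\mathbf{q}),

using the Stirling approximation as N grows large. The most probable
realization is therefore the one minimising the Kullback-Leibler
cross-entropy; for a uniform source (q_i = 1/s) this reduces to maximising the
Shannon entropy -\sum_i p_i \ln p_i, up to a constant. The generalized
definitions proposed in the paper keep the left-hand side but replace the
multinomial weight with the combinatorics of non-multinomial systems.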
How to estimate the differential acceleration in a two-species atom interferometer to test the equivalence principle
We propose a scheme for testing the weak equivalence principle (Universality
of Free Fall) using an atom-interferometric measurement of the local
differential acceleration between two atomic species with a large mass ratio as
test masses. An apparatus in free fall can be used to track atomic free-fall
trajectories over large distances. We show how the differential acceleration
can be extracted from the interferometric signal using Bayesian statistical
estimation, even in the case of a large mass and laser wavelength difference.
We show that this statistical estimation method does not suffer from
acceleration noise of the platform and does not require repeatable experimental
conditions. We specialize our discussion to a dual potassium/rubidium
interferometer and extend our protocol to other atomic mixtures. Finally, we
discuss the performance of the UFF test developed for the free-fall (0-g)
airplane in the ICE project (http://www.ice-space.fr).
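A toy version of the estimation idea can be sketched numerically. Assume (this
signal model and all numbers are ours, for illustration only) that each drop
yields one fringe reading per species, sharing an unknown vibration phase that
changes from shot to shot, while the differential phase phi_d of interest is
constant; a grid posterior then recovers phi_d without repeatable conditions:

    import numpy as np

    # Assumed signal model: y_Rb = cos(phi_vib) + noise,
    #                       y_K  = cos(kappa*phi_vib + phi_d) + noise,
    # with kappa the ratio of the two interferometer scale factors.
    rng = np.random.default_rng(0)
    kappa, phi_d_true, sigma = 3.0, 0.3, 0.05
    n_shots = 200
    phi_vib = rng.uniform(0, 2 * np.pi, n_shots)   # unrepeatable vibrations
    y_rb = np.cos(phi_vib) + sigma * rng.normal(size=n_shots)
    y_k = (np.cos(kappa * phi_vib + phi_d_true)
           + sigma * rng.normal(size=n_shots))

    # Grid-based Bayesian estimation: marginalize the nuisance phase shot by
    # shot, then accumulate the log-posterior of phi_d. Pure cosine fringes
    # leave a phi_d -> -phi_d sign ambiguity, so we search [0, pi] only.
    phi_d_grid = np.linspace(0, np.pi, 361)
    phi_v_grid = np.linspace(0, 2 * np.pi, 720)
    log_post = np.zeros_like(phi_d_grid)
    for y1, y2 in zip(y_rb, y_k):
        r1 = (y1 - np.cos(phi_v_grid)) ** 2
        r2 = (y2 - np.cos(kappa * phi_v_grid + phi_d_grid[:, None])) ** 2
        like = np.exp(-(r1 + r2) / (2 * sigma**2)).mean(axis=1)
        log_post += np.log(like + 1e-300)

    print("posterior mode:", phi_d_grid[np.argmax(log_post)])  # approx 0.3

The point the abstract makes appears directly: the common-mode vibration phase
is integrated out anew on every shot, so platform acceleration noise washes
out of the estimate instead of corrupting it.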
Learning the Irreducible Representations of Commutative Lie Groups
We present a new probabilistic model of compact commutative Lie groups that
produces invariant-equivariant and disentangled representations of data. To
define the notion of disentangling, we borrow a fundamental principle from
physics that is used to derive the elementary particles of a system from its
symmetries. Our model employs a newly derived Bayesian conjugacy relation that
enables fully tractable probabilistic inference over compact commutative Lie
groups -- a class that includes the groups that describe the rotation and
cyclic translation of images. We train the model on pairs of transformed image
patches, and show that the learned invariant representation is highly effective
for classification.
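The group-representation fact the model rests on is easy to check numerically:
for the cyclic translation group Z_N, the discrete Fourier transform
block-diagonalizes the group action into one-dimensional complex irreducible
representations, where a shift by s acts on frequency k as the phase
exp(-2*pi*i*k*s/N). A quick demonstration (ours, not the paper's model):

    import numpy as np

    N, s = 8, 3
    x = np.random.default_rng(1).normal(size=N)  # a signal on Z_N
    x_shift = np.roll(x, s)                      # group action in pixel space

    # The same action in Fourier space is diagonal: each frequency k picks up
    # the phase of the 1-D irreducible representation rho_k(s).
    k = np.arange(N)
    X = np.fft.fft(x)
    assert np.allclose(np.fft.fft(x_shift),
                       np.exp(-2j * np.pi * k * s / N) * X)

    # Invariant ("disentangled") part: magnitudes; equivariant part: phases.
    assert np.allclose(np.abs(np.fft.fft(x_shift)), np.abs(X))

This magnitude/phase split is, roughly, the deterministic skeleton of the
invariant-equivariant representations that the model learns probabilistically.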
Bayesian reconstruction of the cosmological large-scale structure: methodology, inverse algorithms and numerical optimization
We address the inverse problem of cosmic large-scale structure reconstruction
from a Bayesian perspective. For a linear data model, a number of known and
novel reconstruction schemes, which differ in terms of the underlying signal
prior, data likelihood, and numerical inverse extra-regularization schemes are
derived and classified. The Bayesian methodology presented in this paper tries
to unify and extend the following methods: Wiener-filtering, Tikhonov
regularization, Ridge regression, Maximum Entropy, and inverse regularization
techniques. The inverse techniques considered here are the asymptotic
regularization, the Jacobi, Steepest Descent, Newton-Raphson,
Landweber-Fridman, and both linear and non-linear Krylov methods based on
Fletcher-Reeves, Polak-Ribiere, and Hestenes-Stiefel Conjugate Gradients. The
structures of the best-performing algorithms to date are presented, based
on an operator scheme, which permits one to exploit the power of fast Fourier
transforms. Using such an implementation of the generalized Wiener-filter in
the novel ARGO-software package, the different numerical schemes are
benchmarked with 1-, 2-, and 3-dimensional problems including structured white
and Poissonian noise, data windowing and blurring effects. A novel numerical
Krylov scheme is shown to be superior in terms of performance and fidelity.
These fast inverse methods ultimately will enable the application of sampling
techniques to explore complex joint posterior distributions. We outline how the
space of the dark-matter density field, the peculiar velocity field, and the
power spectrum can jointly be investigated by a Gibbs-sampling process. Such a
method can be applied to correct for redshift distortions in the observed
galaxies and for time-reversal reconstructions of the initial density field.
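A minimal operator-style sketch of the Wiener filter at the core of these
schemes (our toy setup with assumed spectra, not the ARGO implementation): for
data d = s + n with signal and noise power spectra P_s and P_n, the posterior
mean is diagonal in Fourier space, s_WF = F^{-1}[ P_s/(P_s + P_n) F d ], which
is where the fast Fourier transforms enter:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 512
    k = np.fft.rfftfreq(n)
    k[0] = k[1]                              # regularize the zero mode

    # Assumed spectra: red signal P_s ~ k^-2, white noise of std sigma.
    P_s = k ** -2.0
    sigma = 4.0
    P_n = np.full_like(P_s, sigma ** 2)

    # Draw a signal from the prior and add noise: d = s + e.
    s = np.fft.irfft(np.sqrt(P_s) * np.fft.rfft(rng.normal(size=n)), n)
    d = s + sigma * rng.normal(size=n)

    # Wiener filter, diagonal in Fourier space.
    s_wf = np.fft.irfft(P_s / (P_s + P_n) * np.fft.rfft(d), n)

    print("rms error, raw data     :", np.sqrt(np.mean((d - s) ** 2)))
    print("rms error, Wiener filter:", np.sqrt(np.mean((s_wf - s) ** 2)))

The Krylov machinery the paper benchmarks becomes necessary precisely when
this diagonalization fails, i.e. when data windowing, blurring, or
inhomogeneous noise leave the signal and noise covariances with no common
eigenbasis, so that S(S+N)^{-1} must be applied iteratively.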
Model Selection Principles in Misspecified Models
Model selection is of fundamental importance to high dimensional modeling
featured in many contemporary applications. Classical principles of model
selection include the Kullback-Leibler divergence principle and the Bayesian
principle, which lead to the Akaike information criterion and Bayesian
information criterion when models are correctly specified. Yet model
misspecification is unavoidable when we have no knowledge of the true model or
when we have the correct family of distributions but miss some true predictor.
In this paper, we propose a family of semi-Bayesian principles for model
selection in misspecified models, which combine the strengths of the two
well-known principles. We derive asymptotic expansions of the semi-Bayesian
principles in misspecified generalized linear models, which give the new
semi-Bayesian information criteria (SIC). A specific form of SIC admits a
natural decomposition into the negative maximum quasi-log-likelihood, a penalty
on model dimensionality, and a penalty on model misspecification directly.
Numerical studies demonstrate the advantage of the newly proposed SIC
methodology for model selection in both correctly specified and misspecified
models.
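The decomposition described admits a compact generic form. As an illustration
(our paraphrase; the paper's exact SIC expressions differ in detail), a
criterion in this spirit for a candidate model M is, in LaTeX notation,

    \mathrm{SIC}(M) \;=\;
      \underbrace{-2\,\ell_n(\hat\beta_M)}_{\text{quasi-log-likelihood fit}}
      \;+\; \underbrace{|M|\log n}_{\text{dimensionality}}
      \;+\; \underbrace{\operatorname{tr}\bigl(\hat H_n^{-1}\hat K_n\bigr)}
            _{\text{misspecification}},

where \hat H_n is the negative Hessian of the quasi-log-likelihood and
\hat K_n the outer-product (sandwich) estimate of the score covariance, both
evaluated at \hat\beta_M. Under correct specification the information-matrix
equality gives H = K, the trace collapses to |M|, and the criterion behaves
like BIC; the excess of the trace over |M| is what penalizes misspecification
directly.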
Robust adaptive beamforming using a Bayesian steering vector error model
We propose a Bayesian approach to robust adaptive beamforming which entails
considering the steering vector of interest as a random variable with some
prior distribution. The latter can be tuned in a simple way to reflect how far
the actual steering vector is from its presumed value. Two different priors
are proposed, namely a Bingham prior distribution and a distribution that
directly reveals and depends upon the angle between the true and presumed
steering vector. In addition, a non-informative prior is assigned to the
interference plus noise covariance matrix R, which can be viewed as a means to
introduce diagonal loading in a Bayesian framework. The minimum mean square
distance estimate of the steering vector as well as the minimum mean square
error estimate of R are derived and implemented using a Gibbs sampling
strategy. Numerical simulations show that the new beamformers possess a very
good rate of convergence even in the presence of steering vector errors.
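The diagonal-loading connection can be made concrete with the simplest
non-Bayesian limit of the approach: a loaded Capon beamformer, where adding a
scaled identity to the sample covariance plays the role of the non-informative
prior on R. A toy sketch (all geometry and numbers assumed by us, and no Gibbs
sampling here):

    import numpy as np

    rng = np.random.default_rng(3)
    m, n_snap = 8, 64                         # sensors, snapshots

    def steering(theta, m):
        # Plane-wave steering vector of a half-wavelength-spaced ULA.
        return np.exp(1j * np.pi * np.arange(m) * np.sin(theta))

    theta_true = np.deg2rad(10.0)             # actual direction of arrival
    a_pres = steering(np.deg2rad(13.0), m)    # presumed (mismatched) steering

    # Snapshots: desired signal + one interferer + complex white noise.
    src = steering(theta_true, m)
    jam = steering(np.deg2rad(-40.0), m)
    X = (np.outer(src, 2.0 * rng.normal(size=n_snap))
         + np.outer(jam, 4.0 * rng.normal(size=n_snap))
         + (rng.normal(size=(m, n_snap))
            + 1j * rng.normal(size=(m, n_snap))) / np.sqrt(2))
    R_hat = X @ X.conj().T / n_snap           # sample covariance

    def capon(R, a):
        w = np.linalg.solve(R, a)
        return w / (a.conj() @ w)             # distortionless toward a

    for name, R in [("plain ", R_hat), ("loaded", R_hat + 10 * np.eye(m))]:
        w = capon(R, a_pres)
        print(name, "gain toward true source:",
              round(abs(w.conj() @ src) ** 2, 3))

With steering mismatch, the plain beamformer tends to cancel the desired
signal, while the loaded one keeps most of its gain; the Bayesian estimators
in the paper obtain the loading level, and the steering correction itself,
from the posterior rather than by hand-tuning.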