Search CORE

1,019 research outputs found

A Comparison of Algorithms for Learning Hidden Variables in Normal Graphs

Author: Palmieri Francesco A. N.
Publication venue
Publication date: 01/01/2013
Field of study

A Bayesian factor graph reduced to normal form consists in the interconnection of diverter units (or equal constraint units) and Single-Input/Single-Output (SISO) blocks. In this framework localized adaptation rules are explicitly derived from a constrained maximum likelihood (ML) formulation and from a minimum KL-divergence criterion using KKT conditions. The learning algorithms are compared with two other updating equations based on a Viterbi-like and on a variational approximation respectively. The performance of the various algorithm is verified on synthetic data sets for various architectures. The objective of this paper is to provide the programmer with explicit algorithms for rapid deployment of Bayesian graphs in the applications.Comment: Submitted for journal publicatio

arXiv.org e-Print Archive

CiteSeerX

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers

Author: Lin Baihan
Publication venue
Publication date: 10/09/2020
Field of study

Inspired by the adaptation phenomenon of neuronal firing, we propose the regularity normalization (RN) as an unsupervised attention mechanism (UAM) which computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, UAM constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrated the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced and non-stationary input distribution in image classification, classic control, procedurally-generated reinforcement learning, generative modeling, handwriting generation and question answering tasks with various neural network architectures. Lastly, UAM tracks dependency and critical learning stages across layers and recurrent time steps of deep networks

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Methods for Bayesian power spectrum inference with galaxy surveys

Author: Jasche Jens
Wandelt Benjamin D.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2013
Field of study

We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves over previous Bayesian methods by performing a joint inference of the three dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate sub samples. The method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal to noise regimes by using a deterministic reversible jump algorithm. We test our method on an artificial mock galaxy survey, emulating characteristic features of the Sloan Digital Sky Survey data release 7, such as its survey geometry and luminosity dependent biases. These tests demonstrate the numerical feasibility of our large scale Bayesian inference frame work when the parameter space has millions of dimensions. The method reveals and correctly treats the anti-correlation between bias amplitudes and power spectrum, which are not taken into account in current approaches to power spectrum estimation, a 20 percent effect across large ranges in k-space. In addition, the method results in constrained realizations of density fields obtained without assuming the power spectrum or bias parameters in advance

arXiv.org e-Print Archive

Crossref

HAL-INSU