A Comparison of Algorithms for Learning Hidden Variables in Normal Graphs
A Bayesian factor graph reduced to normal form consists of the
interconnection of diverter units (or equal constraint units) and
Single-Input/Single-Output (SISO) blocks. In this framework localized
adaptation rules are explicitly derived from a constrained maximum likelihood
(ML) formulation and from a minimum KL-divergence criterion using KKT
conditions. The learning algorithms are compared with two other updating
equations based on a Viterbi-like and on a variational approximation
respectively. The performance of the various algorithms is verified on synthetic
data sets for various architectures. The objective of this paper is to provide
the programmer with explicit algorithms for rapid deployment of Bayesian graphs
in applications.
Comment: Submitted for journal publication
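As a rough illustration of the two building blocks named in the abstract, the sketch below shows standard sum-product message passing for a diverter (equality-constraint) unit and a SISO block over discrete variables. It is not the paper's learning algorithm; the function names and the row-stochastic matrix convention `P[x][y] = P(y|x)` are assumptions made for this example.

```python
from math import prod


def normalize(v):
    """Scale a non-negative vector so that it sums to one."""
    s = sum(v)
    return [x / s for x in v]


def diverter_outgoing(incoming, k):
    """Message leaving branch k of a diverter (equality-constraint) unit.

    It is the normalized elementwise product of the messages arriving
    on all the other branches.
    """
    others = [m for i, m in enumerate(incoming) if i != k]
    n = len(others[0])
    return normalize([prod(m[j] for m in others) for j in range(n)])


def siso_forward(f_x, P):
    """Forward message through a SISO block: f_y = sum_x f_x * P(y|x)."""
    n_y = len(P[0])
    return normalize(
        [sum(f_x[x] * P[x][y] for x in range(len(f_x))) for y in range(n_y)]
    )


def siso_backward(b_y, P):
    """Backward message through a SISO block: b_x = sum_y P(y|x) * b_y."""
    return normalize(
        [sum(P[x][y] * b_y[y] for y in range(len(b_y))) for x in range(len(P))]
    )
```

Under these assumptions, a uniform message on one branch of the diverter leaves the product of the remaining messages unchanged, which is why uninformed branches have no effect on the others.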
Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers
Inspired by the adaptation phenomenon of neuronal firing, we propose the
regularity normalization (RN) as an unsupervised attention mechanism (UAM)
which computes the statistical regularity in the implicit space of neural
networks under the Minimum Description Length (MDL) principle. Treating the
neural network optimization process as a partially observable model selection
problem, UAM constrains the implicit space by a normalization factor, the
universal code length. We compute this universal code incrementally across
neural network layers and demonstrate the flexibility to include data priors
such as top-down attention and other oracle information. Empirically, our
approach outperforms existing normalization methods in tackling limited,
imbalanced and non-stationary input distributions in image classification,
classic control, procedurally-generated reinforcement learning, generative
modeling, handwriting generation and question answering tasks with various
neural network architectures. Lastly, UAM tracks dependency and critical
learning stages across layers and recurrent time steps of deep networks.
Methods for Bayesian power spectrum inference with galaxy surveys
We derive and implement a full Bayesian large scale structure inference
method aiming at precision recovery of the cosmological power spectrum from
galaxy redshift surveys. Our approach improves over previous Bayesian methods
by performing a joint inference of the three dimensional density field, the
cosmological power spectrum, luminosity dependent galaxy biases and
corresponding normalizations. We account for all joint and correlated
uncertainties between all inferred quantities. Classes of galaxies with
different biases are treated as separate subsamples. The method therefore also
allows the combined analysis of more than one galaxy survey.
In particular, it solves the problem of inferring the power spectrum from
galaxy surveys with non-trivial survey geometries by exploring the joint
posterior distribution with efficient implementations of multiple block Markov
chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high
statistical efficiency in low signal-to-noise regimes by using a deterministic
reversible jump algorithm. We test our method on an artificial mock galaxy
survey, emulating characteristic features of the Sloan Digital Sky Survey data
release 7, such as its survey geometry and luminosity dependent biases. These
tests demonstrate the numerical feasibility of our large scale Bayesian
inference framework when the parameter space has millions of dimensions.
The method reveals and correctly treats the anti-correlation between bias
amplitudes and power spectrum, which is not taken into account in current
approaches to power spectrum estimation, a 20 percent effect across large
ranges in k-space. In addition, the method results in constrained realizations
of density fields obtained without assuming the power spectrum or bias
parameters in advance.
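For readers unfamiliar with Hybrid (Hamiltonian) Monte Carlo, the following is a minimal generic sketch of one transition with a leapfrog integrator. It is not the sampler from the paper: the toy standard-Gaussian target, the function names, and the step-size settings are all assumptions chosen only to keep the example self-contained.

```python
import math
import random


def leapfrog(q, p, grad_U, step, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme."""
    q, p = list(q), list(p)
    g = grad_U(q)
    p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]  # initial half kick
    for i in range(n_steps):
        q = [qi + step * pi for qi, pi in zip(q, p)]  # full drift
        g = grad_U(q)
        # Full kick between drifts, half kick after the last drift.
        scale = step if i < n_steps - 1 else 0.5 * step
        p = [pi - scale * gi for pi, gi in zip(p, g)]
    return q, p


def hmc_step(q, U, grad_U, step, n_steps, rng):
    """One HMC transition: draw momenta, integrate, accept or reject."""
    p0 = [rng.gauss(0.0, 1.0) for _ in q]
    q1, p1 = leapfrog(q, p0, grad_U, step, n_steps)
    h0 = U(q) + 0.5 * sum(x * x for x in p0)
    h1 = U(q1) + 0.5 * sum(x * x for x in p1)
    # Metropolis test on the change in total energy.
    if rng.random() < math.exp(min(0.0, h0 - h1)):
        return q1, True
    return list(q), False


# Toy target (assumption): standard Gaussian, U(q) = 0.5 * |q|^2.
def U_gauss(q):
    return 0.5 * sum(x * x for x in q)


def grad_U_gauss(q):
    return list(q)
```

A chain is obtained by calling `hmc_step` repeatedly with a shared `random.Random` instance; the accept/reject step corrects for the integration error of the leapfrog scheme.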