3,646 research outputs found
On Measure Concentration of Random Maximum A-Posteriori Perturbations
The maximum a-posteriori (MAP) perturbation framework has emerged as a useful
approach for inference and learning in high dimensional complex models. By
maximizing a randomly perturbed potential function, MAP perturbations generate
unbiased samples from the Gibbs distribution. Unfortunately, the computational
cost of generating so many high-dimensional random variables can be
prohibitive. More efficient algorithms use sequential sampling strategies based
on the expected value of low dimensional MAP perturbations. This paper develops
new measure concentration inequalities that bound the number of samples needed
to estimate such expected values. Applying the general result to MAP
perturbations can yield a more efficient algorithm to approximate sampling from
the Gibbs distribution. The measure concentration result is of general interest
and may be applicable to other areas involving expected estimations
On Sampling from the Gibbs Distribution with Random Maximum A-Posteriori Perturbations
In this paper we describe how MAP inference can be used to sample efficiently
from Gibbs distributions. Specifically, we provide means for drawing either
approximate or unbiased samples from Gibbs' distributions by introducing low
dimensional perturbations and solving the corresponding MAP assignments. Our
approach also leads to new ways to derive lower bounds on partition functions.
We demonstrate empirically that our method excels in the typical "high signal -
high coupling" regime. The setting results in ragged energy landscapes that are
challenging for alternative approaches to sampling and/or lower bounds
An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids
It has been shown that minimum free energy structure for RNAs and RNA-RNA
interaction is often incorrect due to inaccuracies in the energy parameters and
inherent limitations of the energy model. In contrast, ensemble based
quantities such as melting temperature and equilibrium concentrations can be
more reliably predicted. Even structure prediction by sampling from the
ensemble and clustering those structures by Sfold [7] has proven to be more
reliable than minimum free energy structure prediction. The main obstacle for
ensemble based approaches is the computational complexity of the partition
function and base pairing probabilities. For instance, the space complexity of
the partition function for RNA-RNA interaction is and the time
complexity is which are prohibitively large [4,12]. Our goal in this
paper is to give a fast algorithm, based on sparse folding, to calculate an
upper bound on the partition function. Our work is based on the recent
algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is
the same as that of sparse folding algorithms, and the time complexity of our
algorithm is for single RNA and for RNA-RNA
interaction in practice, in which is the running time of sparse folding
and () is a sequence dependent parameter
Integrated Information in Discrete Dynamical Systems: Motivation and Theoretical Framework
This paper introduces a time- and state-dependent measure of integrated information, φ, which captures the repertoire of causal states available to a system as a whole. Specifically, φ quantifies how much information is generated (uncertainty is reduced) when a system enters a particular state through causal interactions among its elements, above and beyond the information generated independently by its parts. Such mathematical characterization is motivated by the observation that integrated information captures two key phenomenological properties of consciousness: (i) there is a large repertoire of conscious experiences so that, when one particular experience occurs, it generates a large amount of information by ruling out all the others; and (ii) this information is integrated, in that each experience appears as a whole that cannot be decomposed into independent parts. This paper extends previous work on stationary systems and applies integrated information to discrete networks as a function of their dynamics and causal architecture. An analysis of basic examples indicates the following: (i) φ varies depending on the state entered by a network, being higher if active and inactive elements are balanced and lower if the network is inactive or hyperactive. (ii) φ varies for systems with identical or similar surface dynamics depending on the underlying causal architecture, being low for systems that merely copy or replay activity states. (iii) φ varies as a function of network architecture. High φ values can be obtained by architectures that conjoin functional specialization with functional integration. Strictly modular and homogeneous systems cannot generate high φ because the former lack integration, whereas the latter lack information. Feedforward and lattice architectures are capable of generating high φ but are inefficient. (iv) In Hopfield networks, φ is low for attractor states and neutral states, but increases if the networks are optimized to achieve tension between local and global interactions. These basic examples appear to match well against neurobiological evidence concerning the neural substrates of consciousness. More generally, φ appears to be a useful metric to characterize the capacity of any physical system to integrate information
Lost Relatives of the Gumbel Trick
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method re- lies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. We derive an entire family of related methods, of which the Gumbel trick is one member, and show that the new methods have superior properties in several settings with minimal additional computational cost. In particular, for the Gumbel trick to yield computational benefits for discrete graphical models, Gumbel perturbations on all configurations are typically replaced with so- called low-rank perturbations. We show how a subfamily of our new methods adapts to this set- ting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution. Finally, we balance the discussion by showing how the simpler analytical form of the Gumbel trick enables additional theoretical results.Alan Turing Institute under EPSRC grant EP/N510129/1, and by the Leverhulme Trust via the CFI
- …