
    On the Equivalence Between Deep NADE and Generative Stochastic Networks

    Neural Autoregressive Distribution Estimators (NADEs) have recently been shown to be successful alternatives for modeling high-dimensional multimodal distributions. One issue with NADEs is that they rely on a particular order of factorization for $P(\mathbf{x})$. This issue has recently been addressed by a variant of NADE called Orderless NADE and its deeper version, Deep Orderless NADE. Orderless NADEs are trained with a criterion that stochastically maximizes $P(\mathbf{x})$ over all possible orders of factorization. Unfortunately, ancestral sampling from a deep NADE is very expensive, corresponding to running through a neural net that separately predicts each of the visible variables given some of the others. This work makes a connection between this criterion and the training criterion for Generative Stochastic Networks (GSNs). It shows that training a NADE in this way also trains a GSN, which defines a Markov chain associated with the NADE model. Based on this connection, we show an alternative way to sample from a trained Orderless NADE that allows one to trade off computing time against sample quality: a 3- to 10-fold speedup (taking into account the waste due to correlations between consecutive samples of the chain) can be obtained without noticeably reducing the quality of the samples. This is achieved using a novel sampling procedure for GSNs, called annealed GSN sampling, which, similarly to tempering methods, combines fast mixing (obtained thanks to steps at high noise levels) with accurate samples (obtained thanks to steps at low noise levels). Comment: ECML/PKDD 201
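
    As a rough illustration of the idea, here is a minimal NumPy sketch of an annealed GSN-style sampling chain. The conditional resampler cond_sample(x, mask) is a hypothetical placeholder for the trained Orderless NADE's ability to redraw a masked subset of variables given the rest; the annealing schedule and noise levels are assumptions for illustration, not the paper's exact settings.

        import numpy as np

        def annealed_gsn_chain(x0, cond_sample, n_steps, p_hi=0.9, p_lo=0.1, rng=None):
            """One annealed run of a GSN-style Markov chain for an Orderless NADE.

            cond_sample(x, mask) stands in for the model's conditional:
            it returns x with the entries where mask is True redrawn from
            P(x_masked | x_unmasked).
            """
            rng = rng or np.random.default_rng()
            x = x0.copy()
            # Anneal the corruption level from high noise (fast mixing)
            # down to low noise (accurate samples).
            for p in np.linspace(p_hi, p_lo, n_steps):
                mask = rng.random(x.shape) < p  # subset to resample this step
                x = cond_sample(x, mask)
            return x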

    Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines

    This paper introduces the Metric-Free Natural Gradient (MFNG) algorithm for training Boltzmann Machines. Similar in spirit to the Hessian-Free method of Martens [8], our algorithm belongs to the family of truncated Newton methods and exploits an efficient matrix-vector product to avoid explicitly storing the natural gradient metric $L$. This metric is shown to be the expected second derivative of the log-partition function (under the model distribution), or equivalently, the variance of the vector of partial derivatives of the energy function. We evaluate our method on the task of jointly training a 3-layer Deep Boltzmann Machine and show that MFNG does indeed have faster per-epoch convergence than Stochastic Maximum Likelihood with centering, though wall-clock performance is currently not competitive.
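
    To make the matrix-vector trick concrete, here is a hedged NumPy sketch: the metric-vector product $Lv$ is estimated from the sample variance of energy gradients applied to v, without ever forming $L$, and a few conjugate-gradient iterations give the truncated Newton step. The per-sample gradient matrix G (e.g. collected from model samples such as persistent Gibbs chains) and the damping value are assumptions for illustration, not the paper's exact scheme.

        import numpy as np

        def metric_vector_product(G, v):
            """Estimate L v, where L = Cov[dE/dtheta] under the model
            distribution and G is an (n_samples, n_params) array of
            per-sample energy gradients."""
            mu = G.mean(axis=0)
            # L v = E[g (g . v)] - E[g] (E[g] . v); L itself is never stored.
            return G.T @ (G @ v) / G.shape[0] - mu * (mu @ v)

        def conjugate_gradient(matvec, b, n_iters=20, damping=1e-4):
            """Truncated Newton step: approximately solve (L + damping*I) x = b."""
            x = np.zeros_like(b)
            r = b - (matvec(x) + damping * x)
            p, rs = r.copy(), r @ r
            for _ in range(n_iters):
                Ap = matvec(p) + damping * p
                alpha = rs / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < 1e-8:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x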

    Pruning Boltzmann networks and hidden Markov models


    Extreme Quantum Advantage for Rare-Event Sampling

    We introduce a quantum algorithm for efficient biased sampling of the rare events generated by classical memoryful stochastic processes. We show that this quantum algorithm gives an extreme advantage over known classical biased sampling algorithms in terms of the memory resources required. The quantum memory advantage ranges from polynomial to exponential, and when sampling the rare equilibrium configurations of spin systems the quantum advantage diverges. Comment: 11 pages, 9 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/eqafbs.ht
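
    The classical baseline here is biased (importance) sampling of rare events. As a minimal sketch of that idea only, the snippet below estimates a rare tail probability for i.i.d. Bernoulli draws by sampling from an exponentially tilted distribution and reweighting; the memoryful processes the paper treats would require tilting the transition kernel of the process instead, so this example is just to fix ideas, and its parameters are made up.

        import numpy as np

        def rare_event_prob_tilted(p, n, a, n_samples=100_000, rng=None):
            """Estimate P(sum of n Bernoulli(p) draws >= a) for a >> n*p by
            sampling counts from the tilted law Binomial(n, q) with q = a/n
            and reweighting by the likelihood ratio. Plain Monte Carlo would
            need on the order of 1/P samples; tilting concentrates them on
            the rare event."""
            rng = rng or np.random.default_rng()
            q = a / n                               # tilted success probability
            k = rng.binomial(n, q, size=n_samples)  # counts under the biased law
            # Importance weight (p/q)^k * ((1-p)/(1-q))^(n-k), in log space;
            # binomial coefficients cancel in the ratio.
            log_w = k * (np.log(p) - np.log(q)) + (n - k) * (np.log1p(-p) - np.log1p(-q))
            hit = k >= a
            return np.exp(log_w[hit]).sum() / n_samples

        # e.g. rare_event_prob_tilted(0.1, 1000, 200) approximates the tiny tail probability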

    Sparse Nested Markov models with Log-linear Parameters

    Hidden variables are ubiquitous in practical data analysis, so modeling marginal densities and doing inference with the resulting models is an important problem in statistics, machine learning, and causal inference. Recently, a new type of graphical model, called the nested Markov model, was developed; it captures equality constraints found in marginals of directed acyclic graph (DAG) models. Some of these constraints, such as the so-called 'Verma constraint', strictly generalize conditional independence. To make modeling and inference with nested Markov models practical, it is necessary to limit the number of parameters in the model while still correctly capturing the constraints in the marginal of a DAG model. Placing such limits is similar in spirit to sparsity methods for undirected graphical models and regression models. In this paper, we give a log-linear parameterization which allows sparse modeling with nested Markov models. We illustrate the advantages of this parameterization with a simulation study. Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013).
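
    For readers unfamiliar with log-linear parameterizations, the toy sketch below builds a joint distribution over binary variables from a sparse set of log-linear interaction parameters; every higher-order term left out of the dictionary is implicitly zero, which is how sparsity caps the parameter count. The subsets and weights are invented for illustration and do not reproduce the paper's nested-Markov-specific parameterization.

        import numpy as np
        from itertools import product

        def loglinear_joint(n_vars, lambdas):
            """Joint over n binary variables from log-linear parameters:
            log P(x) = sum over subsets A of lambda_A * prod_{i in A} x_i,
            up to normalization. lambdas maps variable-index tuples to
            weights; absent subsets contribute nothing (sparsity)."""
            states = np.array(list(product([0, 1], repeat=n_vars)))
            log_p = np.zeros(len(states))
            for subset, lam in lambdas.items():
                log_p += lam * states[:, list(subset)].prod(axis=1)
            p = np.exp(log_p - log_p.max())   # stabilized normalization
            return states, p / p.sum()

        # Three variables with main effects and one pairwise interaction:
        states, p = loglinear_joint(3, {(0,): 0.5, (1,): -0.3, (2,): 0.2, (0, 1): 1.0})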

    Neural-Network Quantum States, String-Bond States, and Chiral Topological States

    Neural-Network Quantum States have recently been introduced as an Ansatz for describing the wave function of quantum many-body systems. We show that there are strong connections between Neural-Network Quantum States in the form of Restricted Boltzmann Machines and some classes of Tensor-Network states in arbitrary dimensions. In particular, we demonstrate that short-range Restricted Boltzmann Machines are Entangled Plaquette States, while fully connected Restricted Boltzmann Machines are String-Bond States with a nonlocal geometry and low bond dimension. These results shed light on the underlying architecture of Restricted Boltzmann Machines and their efficiency at representing many-body quantum states. String-Bond States also provide a generic way of enhancing the power of Neural-Network Quantum States and a natural generalization to systems with larger local Hilbert spaces. We compare the advantages and drawbacks of these different classes of states and present a method for combining them, which lets us bring the entanglement structure of Tensor Networks and the efficiency of Neural-Network Quantum States together in a single Ansatz capable of targeting the wave function of strongly correlated systems. While it remains a challenge to describe states with chiral topological order using traditional Tensor Networks, we show that Neural-Network Quantum States and their String-Bond State extension can describe a lattice Fractional Quantum Hall state exactly. In addition, we provide numerical evidence that Neural-Network Quantum States can approximate a chiral spin liquid with better accuracy than Entangled Plaquette States and local String-Bond States. Our results demonstrate the efficiency of neural networks at describing complex quantum wave functions and pave the way towards the use of String-Bond States as a tool in more traditional machine-learning applications. Comment: 15 pages, 7 figures
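
    Since the Restricted Boltzmann Machine Ansatz is central here, the following sketch evaluates the standard RBM wave-function amplitude (the Carleo-Troyer form), with the hidden units traced out analytically; complex-valued parameters are what allow the Ansatz to carry the phases needed for chiral states. The system sizes and random parameters below are placeholders, not values from the paper.

        import numpy as np

        def rbm_amplitude(sigma, a, b, W):
            """Unnormalized RBM wave-function amplitude. Summing out the
            hidden units analytically gives
                psi(sigma) = exp(a . sigma) * prod_j 2*cosh(b_j + (W^T sigma)_j)."""
            theta = b + W.T @ sigma
            return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

        # Toy usage: 4 spins (+1/-1), 8 hidden units, random complex parameters
        rng = np.random.default_rng(0)
        n_v, n_h = 4, 8
        a = rng.normal(size=n_v) + 1j * rng.normal(size=n_v)
        b = rng.normal(size=n_h) + 1j * rng.normal(size=n_h)
        W = rng.normal(size=(n_v, n_h)) + 1j * rng.normal(size=(n_v, n_h))
        amp = rbm_amplitude(np.array([1, -1, 1, 1]), a, b, W)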