9,540 research outputs found
An Efficient Learning Procedure for Deep Boltzmann Machines
We present a new learning algorithm for Boltzmann Machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann Machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer "pre-training" phase that initializes the weights sensibly. The pre-training also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB datasets showing that Deep Boltzmann Machines learn very good generative models of hand-written digits and 3-D objects. We also show that the features discovered by Deep Boltzmann Machines are a very effective way to initialize the hidden layers of feed-forward neural nets which are then discriminatively fine-tuned
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Neural-Network Quantum States, String-Bond States, and Chiral Topological States
Neural-Network Quantum States have been recently introduced as an Ansatz for
describing the wave function of quantum many-body systems. We show that there
are strong connections between Neural-Network Quantum States in the form of
Restricted Boltzmann Machines and some classes of Tensor-Network states in
arbitrary dimensions. In particular we demonstrate that short-range Restricted
Boltzmann Machines are Entangled Plaquette States, while fully connected
Restricted Boltzmann Machines are String-Bond States with a nonlocal geometry
and low bond dimension. These results shed light on the underlying architecture
of Restricted Boltzmann Machines and their efficiency at representing many-body
quantum states. String-Bond States also provide a generic way of enhancing the
power of Neural-Network Quantum States and a natural generalization to systems
with larger local Hilbert space. We compare the advantages and drawbacks of
these different classes of states and present a method to combine them
together. This allows us to benefit from both the entanglement structure of
Tensor Networks and the efficiency of Neural-Network Quantum States into a
single Ansatz capable of targeting the wave function of strongly correlated
systems. While it remains a challenge to describe states with chiral
topological order using traditional Tensor Networks, we show that
Neural-Network Quantum States and their String-Bond States extension can
describe a lattice Fractional Quantum Hall state exactly. In addition, we
provide numerical evidence that Neural-Network Quantum States can approximate a
chiral spin liquid with better accuracy than Entangled Plaquette States and
local String-Bond States. Our results demonstrate the efficiency of neural
networks to describe complex quantum wave functions and pave the way towards
the use of String-Bond States as a tool in more traditional machine-learning
applications.Comment: 15 pages, 7 figure
- …