19,820 research outputs found
Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines
This paper introduces the Metric-Free Natural Gradient (MFNG) algorithm for
training Boltzmann Machines. Similar in spirit to the Hessian-Free method of
Martens [8], our algorithm belongs to the family of truncated Newton methods
and exploits an efficient matrix-vector product to avoid explicitly storing
the natural gradient metric. This metric is shown to be the expected second
derivative of the log-partition function (under the model distribution), or
equivalently, the variance of the vector of partial derivatives of the energy
function. We evaluate our method on the task of joint-training a 3-layer Deep
Boltzmann Machine and show that MFNG does indeed have faster per-epoch
convergence compared to Stochastic Maximum Likelihood with centering, though
wall-clock performance is currently not competitive.
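To illustrate the matrix-vector product mentioned in the abstract above, here is a minimal sketch, not the authors' implementation. It assumes the metric is estimated as the sample covariance of energy-gradient vectors drawn from the model, and it computes the product of that covariance with a vector without ever forming the matrix. The function name `metric_vector_product`, the `grads` layout (one row per sample), and the 1/n normalisation are all assumptions made for this example.

```python
from typing import List, Sequence


def metric_vector_product(
    grads: Sequence[Sequence[float]], v: Sequence[float]
) -> List[float]:
    """Return Cov(grads) @ v without building the covariance matrix.

    grads: one row per model sample; each row holds the partial
           derivatives of the energy with respect to the parameters
           (hypothetical layout chosen for this sketch).
    v:     the vector to multiply by the estimated metric.
    """
    n = len(grads)
    dim = len(v)
    # Mean gradient over the samples.
    mean = [sum(g[j] for g in grads) / n for j in range(dim)]
    out = [0.0] * dim
    for g in grads:
        # Centre the sample, project it onto v, then accumulate.
        centered = [g[j] - mean[j] for j in range(dim)]
        proj = sum(c * x for c, x in zip(centered, v))
        for j in range(dim):
            out[j] += centered[j] * proj / n
    return out


if __name__ == "__main__":
    samples = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
    print(metric_vector_product(samples, [1.0, 1.0]))
```

The cost is linear in the number of samples times the number of parameters, whereas storing the matrix would be quadratic in the number of parameters. A truncated Newton method would call such a product repeatedly inside an iterative linear solver.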
Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand for these algorithms is too
high to work in real time, without specialized hardware or software solutions.
In this paper, we propose a framework for adding recurrent processing
capabilities to a feedforward network without sacrificing much
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
the visual occlusion problem, the iterative procedure emulates a feedforward-feedback
loop, filling in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.
Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other author
Large-scale grid-enabled lattice-Boltzmann simulations of complex fluid flow in porous media and under shear
Well-designed lattice-Boltzmann codes exploit the essentially embarrassingly
parallel features of the algorithm and so can be run with considerable
efficiency on modern supercomputers. Such scalable codes permit us to simulate
the behaviour of increasingly large quantities of complex condensed matter
systems. In the present paper, we present some preliminary results on the
large-scale three-dimensional lattice-Boltzmann simulation of binary immiscible fluid
flows through a porous medium derived from digitised x-ray microtomographic
data of Bentheimer sandstone, and from the study of the same fluids under
shear. Simulations on such scales can benefit considerably from the use of
computational steering and we describe our implementation of steering within
the lattice-Boltzmann code, called LB3D, making use of the RealityGrid steering
library. Our large-scale simulations benefit from the new concept of capability
computing, designed to prioritise the execution of big jobs on major
supercomputing resources. The advent of persistent computational grids promises
to provide an optimal environment in which to deploy these mesoscale simulation
methods, which can exploit the distributed nature of compute, visualisation and
storage resources to reach scientific results rapidly; we discuss our work on
the grid-enablement of lattice-Boltzmann methods in this context.
Comment: 17 pages, 6 figures, accepted for publication in Phil.Trans.R.Soc.Lond.
Weighted Contrastive Divergence
Learning algorithms for energy based Boltzmann architectures that rely on
gradient descent are in general computationally prohibitive, typically due to
the exponential number of terms involved in computing the partition function.
As a result, one has to resort to approximation schemes for the evaluation of
the gradient. This is the case for Restricted Boltzmann Machines (RBMs) and their
learning algorithm, Contrastive Divergence (CD). It is well known that CD has a
number of shortcomings, and its approximation to the gradient has several
drawbacks. Overcoming these defects has been the basis of much research and new
algorithms have been devised, such as persistent CD. In this manuscript we
propose a new algorithm that we call Weighted CD (WCD), built from small
modifications of the negative phase in standard CD. However small these
modifications may be, experimental work reported in this paper suggests that WCD
provides a significant improvement over standard CD and persistent CD at a
small additional computational cost.
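For context, the sketch below shows the standard CD baseline with a single Gibbs step (CD-1) for a binary RBM. It is not WCD: the abstract does not say how WCD changes the negative phase, so that part is left out. All names here (`cd1_weight_gradient`, `sample_hidden`, `sample_visible`) are hypothetical helpers written for this example, and only the weight gradient is estimated.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v: Vector, W: Matrix, c: Vector, rng: random.Random) -> Tuple[Vector, Vector]:
    """Return (binary sample, activation probabilities) of hidden units given v."""
    probs = [
        sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(c))
    ]
    return [1.0 if rng.random() < p else 0.0 for p in probs], probs


def sample_visible(h: Vector, W: Matrix, b: Vector, rng: random.Random) -> Tuple[Vector, Vector]:
    """Return (binary sample, activation probabilities) of visible units given h."""
    probs = [
        sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(b))
    ]
    return [1.0 if rng.random() < p else 0.0 for p in probs], probs


def cd1_weight_gradient(
    batch: List[Vector], W: Matrix, b: Vector, c: Vector, rng: random.Random
) -> Matrix:
    """Estimate the log-likelihood gradient with respect to W using CD-1."""
    n_vis, n_hid = len(b), len(c)
    grad = [[0.0] * n_hid for _ in range(n_vis)]
    for v0 in batch:
        h0, p_h0 = sample_hidden(v0, W, c, rng)   # positive phase (data)
        v1, _ = sample_visible(h0, W, b, rng)     # one Gibbs step
        _, p_h1 = sample_hidden(v1, W, c, rng)    # negative phase (reconstruction)
        for i in range(n_vis):
            for j in range(n_hid):
                grad[i][j] += (v0[i] * p_h0[j] - v1[i] * p_h1[j]) / len(batch)
    return grad


if __name__ == "__main__":
    rng = random.Random(0)
    W = [[0.0, 0.0] for _ in range(3)]
    grad = cd1_weight_gradient([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]], W, [0.0] * 3, [0.0] * 2, rng)
    print(grad)
```

The negative-phase term `v1[i] * p_h1[j]` is the part that, according to the abstract, WCD modifies. Persistent CD would instead keep the Gibbs chain state across parameter updates.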