The Forward-Forward Algorithm: Some Preliminary Investigations
The aim of this paper is to introduce a new learning procedure for neural
networks and to demonstrate that it works well enough on a few small problems
to be worth further investigation. The Forward-Forward algorithm replaces the
forward and backward passes of backpropagation with two forward passes: one with
positive (i.e. real) data and one with negative data, which could be
generated by the network itself. Each layer has its own objective function,
which is simply to have high goodness for positive data and low goodness for
negative data. The sum of the squared activities in a layer can be used as the
goodness, but there are many other possibilities, including minus the sum of the
squared activities. If the positive and negative passes could be separated in
time, the negative passes could be done offline, which would make learning in
the positive pass much simpler and allow video to be pipelined through the
network without ever storing activities or stopping to propagate derivatives.
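A minimal sketch of one such layer, assuming PyTorch, goodness defined as the sum of squared activities, and a goodness threshold `theta` with a logistic-style loss; the threshold value, learning rate, and class name are illustrative assumptions, not details taken from the abstract above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    """One Forward-Forward layer with its own local objective."""

    def __init__(self, in_dim, out_dim, theta=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.theta = theta  # goodness threshold separating positive from negative
        self.opt = torch.optim.SGD(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize the input so only the direction of activity is passed on;
        # otherwise the next layer could read goodness straight off the input length.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activities; push it above theta for
        # positive data and below theta for negative data.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = (F.softplus(self.theta - g_pos) +
                F.softplus(g_neg - self.theta)).mean()
        self.opt.zero_grad()
        loss.backward()  # gradients stay inside this layer: no global backward pass
        self.opt.step()
        # Detach so the layer above trains on activities, not on gradients.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Stacking several such layers and feeding each one the detached outputs of the layer below trains the whole network with purely local objectives, which is what removes the need for a backward pass.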
Conditional Restricted Boltzmann Machines for Structured Output Prediction
Conditional Restricted Boltzmann Machines (CRBMs) are rich probabilistic
models that have recently been applied to a wide range of problems, including
collaborative filtering, classification, and modeling motion capture data.
While much progress has been made in training non-conditional RBMs, these
algorithms are not applicable to conditional models, and there has been almost
no work on training and generating predictions from conditional RBMs for
structured output problems. We first argue that standard Contrastive
Divergence-based learning may not be suitable for training CRBMs. We then
identify two distinct types of structured output prediction problems and
propose an improved learning algorithm for each. The first problem type is one
where the output space has arbitrary structure but the set of likely output
configurations is relatively small, such as in multi-label classification. The
second problem type is one where the output space is also arbitrarily structured
but where the output variability is much greater, such as in image denoising
or pixel labeling. We show that the new learning algorithms can work much
better than Contrastive Divergence on both types of problems.
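For context, a minimal CD-1 sketch for a conditional RBM with binary output and hidden units, i.e. the baseline learning rule the abstract argues may be unsuitable. The parameter names (W, A, B, b, c), the single-step Gibbs chain, and the learning rate are illustrative assumptions; the paper's improved algorithms are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

def cd1_step(x, y, W, A, B, b, c, lr=0.01):
    """One CD-1 update for a CRBM, given input x and target output y (1-D arrays)."""
    # Input-dependent biases are what make the model conditional on x.
    b_x = b + A @ x   # effective bias on the output units y
    c_x = c + B @ x   # effective bias on the hidden units h
    # Positive phase: hidden probabilities driven by the real (x, y) pair.
    h_pos = sigmoid(W.T @ y + c_x)
    # Negative phase: a single Gibbs step yields a reconstruction of y.
    h_s = sample(h_pos)
    y_neg = sigmoid(W @ h_s + b_x)
    h_neg = sigmoid(W.T @ y_neg + c_x)
    # Approximate gradient: data statistics minus reconstruction statistics.
    W += lr * (np.outer(y, h_pos) - np.outer(y_neg, h_neg))
    b += lr * (y - y_neg)
    c += lr * (h_pos - h_neg)
    A += lr * np.outer(y - y_neg, x)
    B += lr * np.outer(h_pos - h_neg, x)
```

Because the negative phase starts the chain at the training output, the update can look converged whenever reconstructions stay close to the data, which is one intuition for why CD-based learning can be a poor fit for structured output prediction.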
- …