The Recurrent Temporal Discriminative Restricted Boltzmann Machines
Classification of sequence data is a topic of interest for both dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter are capable of learning representations. Several attempts have been made to improve performance by combining these two approaches or by increasing the processing capability of the hidden units in RNNs. This often results in complex models with a large number of learnable parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state of the art. In addition, the experimental results on optical character recognition, part-of-speech tagging and text chunking demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters.
Generalising the Discriminative Restricted Boltzmann Machine
We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While the DRBM was originally defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This is illustrated here with the Binomial and {-1, +1}-Bernoulli distributions. We evaluate these two DRBM variants and compare them with the original one on three benchmark datasets, namely the MNIST and USPS digit classification datasets, and the 20 Newsgroups document classification dataset. Results show that each of the three compared models outperforms the remaining two on one of the three datasets, thus indicating that the proposed theoretical generalisation of the DRBM may be valuable in practice.
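As a point of reference for the abstract above, the following is a minimal sketch of the class posterior of the original DRBM with {0, 1}-Bernoulli hidden units, in which summing out each hidden unit yields a softplus term. It does not implement the generalisation proposed in the paper. The function name `drbm_class_posterior` and the parameter names `W`, `U`, `c` and `d` are illustrative assumptions, not taken from the abstract.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def drbm_class_posterior(x, W, U, c, d):
    """Return p(y | x) for a DRBM with {0, 1}-Bernoulli hidden units.

    Hypothetical parameter layout (an assumption for this sketch):
      x: input vector, length n_visible
      W: hidden-to-input weights, n_hidden rows of length n_visible
      U: hidden-to-class weights, n_hidden rows of length n_classes
      c: hidden biases, length n_hidden
      d: class biases, length n_classes
    """
    n_hidden, n_classes = len(c), len(d)
    # Input-dependent part of each hidden unit's activation.
    pre = [c[j] + sum(w * xi for w, xi in zip(W[j], x)) for j in range(n_hidden)]
    # Unnormalised log-probability of each class after summing out h.
    scores = [
        d[y] + sum(softplus(pre[j] + U[j][y]) for j in range(n_hidden))
        for y in range(n_classes)
    ]
    # Softmax over classes, shifted by the maximum for stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Swapping the hidden-unit distribution changes the per-unit term that replaces `softplus`, which is the kind of variation the abstract describes; the exact cost functions for the other variants are given in the paper itself and are not reproduced here.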
Sequence Classification Restricted Boltzmann Machines With Gated Units
For the classification of sequential data, dynamic Bayesian networks and recurrent neural networks (RNNs) are the preferred models. The former can explicitly model the temporal dependences between the variables, while the latter have the capability of learning representations. The recurrent temporal restricted Boltzmann machine (RTRBM) is a model that combines these two features. However, learning and inference in RTRBMs can be difficult because of the exponential nature of their gradient computations when maximizing log likelihoods. In this article, first, we address this intractability by optimizing a conditional rather than a joint probability distribution when performing sequence classification. This results in the "sequence classification restricted Boltzmann machine" (SCRBM). Second, we introduce gated SCRBMs (gSCRBMs), which use an information processing gate, as an integration of SCRBMs with long short-term memory (LSTM) models. In the experiments reported in this article, we evaluate the proposed models on optical character recognition, chunking, and multiresident activity recognition in smart homes. The experimental results show that gSCRBMs achieve performance comparable to that of the state of the art in all three tasks. gSCRBMs require far fewer parameters than other recurrent networks with memory gates, in particular LSTMs and gated recurrent units (GRUs).
Structural Restricted Boltzmann Machine for image denoising and classification
Restricted Boltzmann Machines are generative models that consist of a layer
of hidden variables connected to another layer of visible units, and they are
used to model the distribution over visible variables. In order to gain a
higher representability power, many hidden units are commonly used, which, in
combination with a large number of visible units, leads to a high number of
trainable parameters. In this work we introduce the Structural Restricted
Boltzmann Machine model, which, taking advantage of the structure of the data at
hand, constrains connections of hidden units to subsets of visible units in
order to significantly reduce the number of trainable parameters, without
compromising performance. As a possible area of application, we focus on image
modelling. Based on the nature of the images, the structure of the connections
is given in terms of spatial neighbourhoods over the pixels of the image that
constitute the visible variables of the model. We conduct extensive experiments
on various image domains. Image denoising is evaluated with corrupted images
from the MNIST dataset. The generative power of our models is compared with
that of vanilla RBMs, as is their classification performance, which is assessed
on five different image domains. Results show that our proposed model trains
faster and more stably, while also obtaining better results than an RBM with
unconstrained connections between its visible and hidden units.
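The idea of restricting each hidden unit to a spatial neighbourhood of pixels can be sketched as a binary connectivity mask over the weight matrix. This is a minimal illustration only: the square-patch tiling, the `patch` and `stride` parameters, and the function names are assumptions, since the abstract does not specify how the neighbourhoods are chosen.

```python
def neighbourhood_mask(height, width, patch, stride):
    """Build a 0/1 mask with one row per hidden unit.

    Each hidden unit is connected only to a patch x patch window of
    pixels (hypothetical tiling scheme, assumed for illustration).
    Pixels are indexed in row-major order.
    """
    mask = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            row = [0] * (height * width)
            for r in range(top, top + patch):
                for col in range(left, left + patch):
                    row[r * width + col] = 1
            mask.append(row)
    return mask


def count_weights(mask):
    # Number of trainable visible-hidden weights kept by the mask.
    return sum(sum(row) for row in mask)


# Example: 28x28 images (the MNIST image size), non-overlapping 4x4 windows.
mask = neighbourhood_mask(28, 28, patch=4, stride=4)
n_hidden = len(mask)               # 49 hidden units
structured = count_weights(mask)   # 49 * 16 = 784 weights
dense = n_hidden * 28 * 28         # 49 * 784 = 38416 weights
```

Under these assumed settings the masked model keeps 784 visible-to-hidden weights where a fully connected RBM with the same number of hidden units would have 38416. In training, the mask would be multiplied elementwise into the weight matrix and its gradient so that the removed connections stay at zero.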
Distributed Parameter Estimation in Probabilistic Graphical Models
This paper presents foundational theoretical results on distributed parameter
estimation for undirected probabilistic graphical models. It introduces a
general condition on composite likelihood decompositions of these models which
guarantees the global consistency of distributed estimators, provided the local
estimators are consistent.
Semi-supervised training of cell-classifier neural networks
Microscopes used in biological research now produce huge amounts of image data. Manually processing the images is a very time-consuming and resource-heavy task, so the development and implementation of new automatic systems is required. Moreover, because labels are available for only a small subset of this data, these novel methods should be able to process large amounts of unlabeled data with minimal manual supervision. Here, we apply neural networks to classify cells present in biological images, and show that their accuracy can be improved by using semi-supervised training algorithms.