Lifelong Generative Modeling
Lifelong learning is the problem of learning multiple consecutive tasks in a
sequential manner, where knowledge gained from previous tasks is retained and
used to aid future learning over the lifetime of the learner. It is essential
to the development of intelligent machines that can adapt to their
surroundings. In this work we focus on a lifelong learning approach to
unsupervised generative modeling, where we continuously incorporate newly
observed distributions into a learned model. We do so through a student-teacher
Variational Autoencoder architecture which allows us to learn and preserve all
the distributions seen so far, without the need to retain either the past data
or the past models. Through the introduction of a novel cross-model regularizer,
inspired by a Bayesian update rule, the student model leverages the information
learned by the teacher, which acts as a probabilistic knowledge store. The
regularizer reduces the effect of catastrophic interference that appears when
we learn over sequences of distributions. We validate our model's performance
on sequential variants of MNIST, FashionMNIST, PermutedMNIST, SVHN and Celeb-A
and demonstrate that our model mitigates the effects of catastrophic
interference faced by neural networks in sequential learning scenarios.
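The abstract does not state the form of the cross-model regularizer. As a minimal sketch, assuming it is a KL divergence between the student's and the teacher's diagonal-Gaussian latent posteriors for the same input, the penalty can be computed in closed form. The function name `gaussian_kl` and its parameters are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def gaussian_kl(
    mu_student: Sequence[float],
    sigma_student: Sequence[float],
    mu_teacher: Sequence[float],
    sigma_teacher: Sequence[float],
) -> float:
    """KL( N(mu_student, sigma_student^2) || N(mu_teacher, sigma_teacher^2) ).

    Both distributions are assumed to be diagonal Gaussians (an assumption
    of this sketch), so the divergence is a sum over latent dimensions.
    """
    total = 0.0
    for ms, ss, mt, st in zip(mu_student, sigma_student, mu_teacher, sigma_teacher):
        total += math.log(st / ss) + (ss ** 2 + (ms - mt) ** 2) / (2.0 * st ** 2) - 0.5
    return total


# Identical posteriors incur no penalty; a shifted mean is penalized.
same = gaussian_kl([0.0], [1.0], [0.0], [1.0])
shifted = gaussian_kl([0.0], [1.0], [1.0], [1.0])
print(same, shifted)
```

Under this assumption, the penalty would be added to the student's usual VAE objective so that the student's posterior stays close to the teacher's on previously learned distributions.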
Meta-Learning Evolutionary Artificial Neural Networks
In this paper, we present MLEANN (Meta-Learning Evolutionary Artificial
Neural Network), an automatic computational framework for the adaptive
optimization of artificial neural networks wherein the neural network
architecture, activation function, connection weights, learning algorithm and
its parameters are adapted according to the problem. We explored the
performance of MLEANN and conventionally designed artificial neural networks
for function approximation problems. To evaluate the comparative performance,
we used three different well-known chaotic time series. We also present
popular state-of-the-art neural network learning algorithms and some
experimentation results related to convergence speed and generalization
performance. We explored the performance of the backpropagation algorithm,
conjugate gradient algorithm, quasi-Newton algorithm and Levenberg-Marquardt
algorithm for the three chaotic time series. Performances of the different
learning algorithms were evaluated when the activation functions and
architecture were changed. We further present the theoretical background,
algorithm and design strategy, and demonstrate how effective and indispensable
the proposed MLEANN framework is for designing a neural network that is smaller
and faster and has better generalization performance.
Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks
Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well.
Mean Field Methods for a Special Class of Belief Networks
The chief aim of this paper is to propose mean-field approximations for a
broad class of belief networks, of which sigmoid and noisy-or networks can be
seen as special cases. The approximations are based on a powerful mean-field
theory suggested by Plefka. We show that Saul, Jaakkola and Jordan's approach
is the first-order approximation in Plefka's approach, via a variational
derivation. The application of Plefka's theory to belief networks is not
computationally tractable. To tackle this problem we propose new approximations
based on Taylor series. Small-scale experiments show that the proposed schemes
are attractive.
An Artificial Neural Network technique for on-line hotel booking
In this paper the use of Artificial Neural Networks (ANNs) in on-line booking for the hotel industry is investigated. The paper details the description, the modeling and the resolution technique of on-line booking. The latter problem is modeled using the paradigms of machine learning, in place of standard `If-Then-Else' chains of conditional rules. In particular, a supervised three-layer MLP neural network is adopted, which is trained using information from previous customers' reservations. The performance of our ANN is analyzed: it behaves satisfactorily in managing the (simulated) booking service in a hotel. The customer requests single or double rooms, and the system replies with a confirmation of the requested services, if available. Moreover, we highlight that with our approach the system proposes alternative accommodations (from two days before to two days after the requested day) in case rooms or services are not available. Numerical results are given, where the effectiveness of the proposed approach is critically analyzed. Finally, we outline guidelines for future research.
Keywords: on-line booking; hotel reservation; machine learning; supervised multilayer perceptron networks
Cardinality-Minimal Explanations for Monotonic Neural Networks
In recent years, there has been increasing interest in explanation methods
for neural model predictions that offer precise formal guarantees. These
include abductive (respectively, contrastive) methods, which aim to compute
minimal subsets of input features that are sufficient for a given prediction to
hold (respectively, to change a given prediction). The corresponding decision
problems are, however, known to be intractable. In this paper, we investigate
whether tractability can be regained by focusing on neural models implementing
a monotonic function. Although the relevant decision problems remain
intractable, we can show that they become solvable in polynomial time by means
of greedy algorithms if we additionally assume that the activation functions
are continuous everywhere and differentiable almost everywhere. Our experiments
suggest favourable performance of our algorithms.
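To illustrate what a sufficient feature subset means for a monotonic model, here is a minimal sketch of a generic deletion-based greedy search. It is not the paper's algorithm: it returns a subset-minimal explanation, which need not be cardinality-minimal. It assumes a classifier that is non-decreasing in every feature, a positive prediction, and known per-feature lower bounds. The names `minimal_sufficient_features` and `toy_model` are hypothetical.

```python
from typing import Callable, List, Sequence


def minimal_sufficient_features(
    predict: Callable[[Sequence[float]], bool],
    x: Sequence[float],
    lower: Sequence[float],
) -> List[int]:
    """Greedy deletion search for a subset-minimal sufficient feature set.

    Assumes `predict` is non-decreasing in every feature and predict(x) is True.
    A set of kept features is sufficient if the prediction stays True when
    every other feature drops to its lower bound (the worst case under
    monotonicity), so each check needs a single model evaluation.
    """
    if not predict(x):
        raise ValueError("this sketch only handles positive predictions")
    keep = set(range(len(x)))
    for i in range(len(x)):
        trial = keep - {i}
        # Freed features are set to their lower bounds; kept ones stay fixed.
        probe = [x[j] if j in trial else lower[j] for j in range(len(x))]
        if predict(probe):
            keep = trial  # feature i is not needed
    return sorted(keep)


def toy_model(v: Sequence[float]) -> bool:
    # Hypothetical monotone classifier: non-negative weights, fixed threshold.
    return 2.0 * v[0] + 1.0 * v[1] + 0.5 * v[2] >= 2.0


explanation = minimal_sufficient_features(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
print(explanation)
```

In this toy case the first feature alone keeps the score at the threshold, so the search drops the other two. For a negative prediction the same idea applies with upper bounds in place of lower bounds.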