232 research outputs found

    Lifelong Generative Modeling

    Full text link
    Lifelong learning is the problem of learning multiple consecutive tasks in a sequential manner, where knowledge gained from previous tasks is retained and used to aid future learning over the lifetime of the learner. It is essential towards the development of intelligent machines that can adapt to their surroundings. In this work we focus on a lifelong learning approach to unsupervised generative modeling, where we continuously incorporate newly observed distributions into a learned model. We do so through a student-teacher Variational Autoencoder architecture which allows us to learn and preserve all the distributions seen so far, without the need to retain the past data nor the past models. Through the introduction of a novel cross-model regularizer, inspired by a Bayesian update rule, the student model leverages the information learned by the teacher, which acts as a probabilistic knowledge store. The regularizer reduces the effect of catastrophic interference that appears when we learn over sequences of distributions. We validate our model's performance on sequential variants of MNIST, FashionMNIST, PermutedMNIST, SVHN and Celeb-A and demonstrate that our model mitigates the effects of catastrophic interference faced by neural networks in sequential learning scenarios.Comment: 32 page

    Meta-Learning Evolutionary Artificial Neural Networks

    Full text link
    In this paper, we present MLEANN (Meta-Learning Evolutionary Artificial Neural Network), an automatic computational framework for the adaptive optimization of artificial neural networks wherein the neural network architecture, activation function, connection weights; learning algorithm and its parameters are adapted according to the problem. We explored the performance of MLEANN and conventionally designed artificial neural networks for function approximation problems. To evaluate the comparative performance, we used three different well-known chaotic time series. We also present the state of the art popular neural network learning algorithms and some experimentation results related to convergence speed and generalization performance. We explored the performance of backpropagation algorithm; conjugate gradient algorithm, quasi-Newton algorithm and Levenberg-Marquardt algorithm for the three chaotic time series. Performances of the different learning algorithms were evaluated when the activation functions and architecture were changed. We further present the theoretical background, algorithm, design strategy and further demonstrate how effective and inevitable is the proposed MLEANN framework to design a neural network, which is smaller, faster and with a better generalization performance

    Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks

    Get PDF
    Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e, the EM-algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well

    Mean Field Methods for a Special Class of Belief Networks

    Full text link
    The chief aim of this paper is to propose mean-field approximations for a broad class of Belief networks, of which sigmoid and noisy-or networks can be seen as special cases. The approximations are based on a powerful mean-field theory suggested by Plefka. We show that Saul, Jaakkola and Jordan' s approach is the first order approximation in Plefka's approach, via a variational derivation. The application of Plefka's theory to belief networks is not computationally tractable. To tackle this problem we propose new approximations based on Taylor series. Small scale experiments show that the proposed schemes are attractive

    An Artificial Neural Network technique for on-line hotel booking

    Get PDF
    In this paper the use of Artificial Neural Networks (ANNs) in on-line booking for hotel industry is investigated. The paper details the description, the modeling and the resolution technique of on-line booking. The latter problem is modeled using the paradigms of machine learning, in place of standard `If-Then-Else' chains of conditional rules. In particular, a supervised three layers MLP neural network is adopted, which is trained using information from previous customers' reservations. Performance of our ANN is analyzed: it behaves in a quite satisfactory way in managing the (simulated) booking service in a hotel. The customer requires single or double rooms, while the system gives as a reply the confirmation of the required services, if available. Moreover, we highlight that using our approach the system proposes alternative accommodations (from two days in advance to two days later with respect to the requested day), in case rooms or services are not available. Numerical results are given, where the effectiveness of the proposed approach is critically analyzed. Finally, we outline guidelines for future research.On-line booking; hotel reservation; machine learning; supervised multilayer perceptron networks

    Cardinality-Minimal Explanations for Monotonic Neural Networks

    Full text link
    In recent years, there has been increasing interest in explanation methods for neural model predictions that offer precise formal guarantees. These include abductive (respectively, contrastive) methods, which aim to compute minimal subsets of input features that are sufficient for a given prediction to hold (respectively, to change a given prediction). The corresponding decision problems are, however, known to be intractable. In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function. Although the relevant decision problems remain intractable, we can show that they become solvable in polynomial time by means of greedy algorithms if we additionally assume that the activation functions are continuous everywhere and differentiable almost everywhere. Our experiments suggest favourable performance of our algorithms
    corecore