15,593 research outputs found

    Online Structured Laplace Approximations For Overcoming Catastrophic Forgetting

    Get PDF
    We introduce the Kronecker factored online Laplace approximation for overcoming catastrophic forgetting in neural networks. The method is grounded in a Bayesian online learning framework, where we recursively approximate the posterior after every task with a Gaussian, leading to a quadratic penalty on changes to the weights. The Laplace approximation requires calculating the Hessian around a mode, which is typically intractable for modern architectures. In order to make our method scalable, we leverage recent block-diagonal Kronecker factored approximations to the curvature. Our algorithm achieves over 90% test accuracy across a sequence of 50 instantiations of the permuted MNIST dataset, substantially outperforming related methods for overcoming catastrophic forgetting.Comment: 13 pages, 6 figure

    Knowledge Spaces and the Completeness of Learning Strategies

    Get PDF
    We propose a theory of learning aimed to formalize some ideas underlying Coquand's game semantics and Krivine's realizability of classical logic. We introduce a notion of knowledge state together with a new topology, capturing finite positive and negative information that guides a learning strategy. We use a leading example to illustrate how non-constructive proofs lead to continuous and effective learning strategies over knowledge spaces, and prove that our learning semantics is sound and complete w.r.t. classical truth, as it is the case for Coquand's and Krivine's approaches

    Derivative-free online learning of inverse dynamics models

    Full text link
    This paper discusses online algorithms for inverse dynamics modelling in robotics. Several model classes including rigid body dynamics (RBD) models, data-driven models and semiparametric models (which are a combination of the previous two classes) are placed in a common framework. While model classes used in the literature typically exploit joint velocities and accelerations, which need to be approximated resorting to numerical differentiation schemes, in this paper a new `derivative-free' framework is proposed that does not require this preprocessing step. An extensive experimental study with real data from the right arm of the iCub robot is presented, comparing different model classes and estimation procedures, showing that the proposed `derivative-free' methods outperform existing methodologies.Comment: 14 pages, 11 figure

    Epistemic virtues, metavirtues, and computational complexity

    Get PDF
    I argue that considerations about computational complexity show that all finite agents need characteristics like those that have been called epistemic virtues. The necessity of these virtues follows in part from the nonexistence of shortcuts, or efficient ways of finding shortcuts, to cognitively expensive routines. It follows that agents must possess the capacities – metavirtues –of developing in advance the cognitive virtues they will need when time and memory are at a premium

    Recursive Neural Networks Can Learn Logical Semantics

    Full text link
    Tree-structured recursive neural networks (TreeRNNs) for sentence meaning have been successful for many applications, but it remains an open question whether the fixed-length representations that they learn can support tasks as demanding as logical deduction. We pursue this question by evaluating whether two such models---plain TreeRNNs and tree-structured neural tensor networks (TreeRNTNs)---can correctly learn to identify logical relationships such as entailment and contradiction using these representations. In our first set of experiments, we generate artificial data from a logical grammar and use it to evaluate the models' ability to learn to handle basic relational reasoning, recursive structures, and quantification. We then evaluate the models on the more natural SICK challenge data. Both models perform competitively on the SICK data and generalize well in all three experiments on simulated data, suggesting that they can learn suitable representations for logical inference in natural language

    Recurrent kernel machines : computing with infinite echo state networks

    Get PDF
    Echo state networks (ESNs) are large, random recurrent neural networks with a single trained linear readout layer. Despite the untrained nature of the recurrent weights, they are capable of performing universal computations on temporal input data, which makes them interesting for both theoretical research and practical applications. The key to their success lies in the fact that the network computes a broad set of nonlinear, spatiotemporal mappings of the input data, on which linear regression or classification can easily be performed. One could consider the reservoir as a spatiotemporal kernel, in which the mapping to a high-dimensional space is computed explicitly. In this letter, we build on this idea and extend the concept of ESNs to infinite-sized recurrent neural networks, which can be considered recursive kernels that subsequently can be used to create recursive support vector machines. We present the theoretical framework, provide several practical examples of recursive kernels, and apply them to typical temporal tasks

    On the Simulation of Polynomial NARMAX Models

    Get PDF
    In this paper, we show that the common approach for simulation non-linear stochastic models, commonly used in system identification, via setting the noise contributions to zero results in a biased response. We also demonstrate that to achieve unbiased simulation of finite order NARMAX models, in general, we require infinite order simulation models. The main contributions of the paper are two-fold. Firstly, an alternate representation of polynomial NARMAX models, based on Hermite polynomials, is proposed. The proposed representation provides a convenient way to translate a polynomial NARMAX model to a corresponding simulation model by simply setting certain terms to zero. This translation is exact when the simulation model can be written as an NFIR model. Secondly, a parameterized approximation method is proposed to curtail infinite order simulation models to a finite order. The proposed approximation can be viewed as a trade-off between the conventional approach of setting noise contributions to zero and the approach of incorporating the bias introduced by higher-order moments of the noise distribution. Simulation studies are provided to illustrate the utility of the proposed representation and approximation method.Comment: Accepted in IEEE CDC 201
    • …
    corecore