
    Data-Efficient Learning using Modular Meta-Learning

    Meta-learning, or learning to learn, is a well-established technique in artificial intelligence for improving the performance of learning algorithms. It has been used to uncover learning principles that allow trained models to adapt and generalise effectively to new tasks after deployment. Meta-loss learning is one such framework: it trains loss or reward functions that improve the sample efficiency, learning stability, and convergence speed of the models trained under them. One class of models that can benefit from this framework is Neural Dynamic Policies (NDPs), which combine a deep neural network with a dynamical system and can predict trajectories from high-dimensional inputs such as images. The objective of this thesis is to learn loss functions that speed up and stabilise the training of complex policies. Specifically, this work investigates whether the performance of Neural Dynamic Policies can be enhanced by a meta-learning method that learns parametric loss functions in both supervised and reinforcement learning settings. To this end, the task is to learn to draw digits using the S-MNIST dataset, and the results show that NDPs trained with the newly learned loss outperform the baseline in terms of learning speed and sample efficiency.
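    The meta-loss idea can be made concrete with a short sketch. Below is a minimal, hypothetical PyTorch rendering of the inner/outer-loop structure the abstract describes: a small network parameterizes the loss, a task model takes a few differentiable gradient steps under that learned loss, and the loss network is then updated so those steps reduce a conventional task objective. All names, layer sizes, and the toy regression data are illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch of meta-loss learning (illustrative, not the thesis's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedLoss(nn.Module):
    """Parametric loss: maps (prediction, target) pairs to a scalar."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, 32), nn.ReLU(),
                                 nn.Linear(32, 1), nn.Softplus())
    def forward(self, pred, target):
        return self.net(torch.cat([pred, target], dim=-1)).mean()

dim = 8
loss_fn = LearnedLoss(dim)
meta_opt = torch.optim.Adam(loss_fn.parameters(), lr=1e-3)

for meta_step in range(100):
    model = nn.Linear(dim, dim)              # fresh task model per meta-iteration
    x, y = torch.randn(64, dim), torch.randn(64, dim)
    params = list(model.parameters())
    # Inner loop: a few differentiable SGD steps under the learned loss.
    for _ in range(3):
        inner_loss = loss_fn(F.linear(x, params[0], params[1]), y)
        grads = torch.autograd.grad(inner_loss, params, create_graph=True)
        params = [p - 0.01 * g for p, g in zip(params, grads)]
    # Outer loop: score the adapted model with the task loss (here MSE)
    # and backpropagate through the unrolled steps into the loss network.
    meta_loss = F.mse_loss(F.linear(x, params[0], params[1]), y)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```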

    Graph Element Networks: adaptive, structured computation and memory

    We explore the use of graph neural networks (GNNs) to model spatial processes in which there is no a priori graphical structure. Similar to finite element analysis, we assign the nodes of a GNN to spatial locations and use a computational process defined on the graph to model the relationship between an initial function defined over a space and a resulting function in the same space. We use GNNs as a computational substrate and show that the locations of the nodes in space, as well as their connectivity, can be optimized to focus on the most complex parts of the space. Moreover, this representational strategy allows the learned input-output relationship to generalize over the size of the underlying space and lets the same model run at different levels of precision, trading computation for accuracy. We demonstrate this method on a traditional PDE problem, a physical prediction problem from robotics, and learning to predict scene images from novel viewpoints. (Comment: Accepted to ICML 2019)
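    To make the node-placement idea concrete, here is a minimal, hypothetical PyTorch sketch of the scheme described above: node positions are learnable parameters, observations are soft-assigned to nearby nodes, a few rounds of message passing run over the node states, and node states are interpolated back to query locations. The distance-softmax interpolation, fully connected message passing, and layer sizes are simplifying assumptions, not the paper's construction.

```python
# Illustrative sketch of the Graph Element Networks idea (not the paper's code).
import torch
import torch.nn as nn

class GENSketch(nn.Module):
    def __init__(self, n_nodes=16, hidden=32):
        super().__init__()
        # Node positions in [0, 1]^2 are themselves learnable parameters,
        # so training can move nodes toward the most complex regions.
        self.pos = nn.Parameter(torch.rand(n_nodes, 2))
        self.encode = nn.Linear(1, hidden)          # lift sampled input values
        self.message = nn.Linear(2 * hidden, hidden)
        self.decode = nn.Linear(hidden, 1)

    def forward(self, query_xy, obs_xy, obs_val, steps=3):
        # Soft-assign observations to nodes by distance (softmax over obs).
        w = torch.softmax(-torch.cdist(self.pos, obs_xy), dim=-1)
        h = self.encode(w @ obs_val)                # (n_nodes, hidden)
        # A few rounds of message passing on a fully connected node graph.
        for _ in range(steps):
            agg = h.mean(dim=0, keepdim=True).expand_as(h)
            h = h + torch.relu(self.message(torch.cat([h, agg], dim=-1)))
        # Interpolate node states back to the query locations.
        wq = torch.softmax(-torch.cdist(query_xy, self.pos), dim=-1)
        return self.decode(wq @ h)                  # (n_query, 1)

gen = GENSketch()
obs_xy, obs_val = torch.rand(50, 2), torch.rand(50, 1)
pred = gen(torch.rand(100, 2), obs_xy, obs_val)     # function values at queries
```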

    Meta-Learning Dynamics Forecasting Using Task Inference

    Current deep learning models for dynamics forecasting struggle with generalization: they can only forecast within a specific domain and fail when applied to systems with different parameters, external forces, or boundary conditions. We propose a model-based meta-learning method called DyAd which can generalize across heterogeneous domains by partitioning them into different tasks. DyAd has two parts: an encoder, which infers the time-invariant hidden features of the task under weak supervision, and a forecaster, which learns the shared dynamics of the entire domain. The encoder adapts and controls the forecaster during inference using adaptive instance normalization and adaptive padding. Theoretically, we prove that the generalization error of such a procedure is related to the task relatedness in the source domain as well as the domain differences between source and target. Experimentally, we demonstrate that our model outperforms state-of-the-art approaches on both turbulent-flow and real-world ocean-data forecasting tasks.
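    The encoder/forecaster split can be illustrated with a short sketch. The following hypothetical PyTorch code shows the adaptive-instance-normalization mechanism the abstract mentions: an encoder maps input frames to a time-invariant task code z, and z scales and shifts the forecaster's normalized features. Shapes, layer sizes, and the toy 32x32 field are invented for illustration; adaptive padding and the weak supervision are omitted.

```python
# Hedged sketch of an encoder-modulated forecaster (not DyAd's actual code).
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Instance-normalize features, then scale/shift them with task code z."""
    def __init__(self, channels, z_dim):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.affine = nn.Linear(z_dim, 2 * channels)
    def forward(self, x, z):
        scale, shift = self.affine(z).chunk(2, dim=-1)
        return self.norm(x) * scale[..., None, None] + shift[..., None, None]

class Forecaster(nn.Module):
    def __init__(self, channels=16, z_dim=8):
        super().__init__()
        self.inp = nn.Conv2d(1, channels, 3, padding=1)
        self.adain = AdaIN(channels, z_dim)
        self.out = nn.Conv2d(channels, 1, 3, padding=1)
    def forward(self, frame, z):
        h = torch.relu(self.adain(self.inp(frame), z))
        return self.out(h)                  # predicted next frame of the field

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 8))  # infers z
forecaster = Forecaster()
frames = torch.randn(4, 1, 32, 32)          # a batch of 32x32 field snapshots
z = encoder(frames)                         # time-invariant task code
next_frame = forecaster(frames, z)          # encoder controls the forecaster
```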