35 research outputs found

    Context Meta-Reinforcement Learning via Neuromodulation

    Meta-reinforcement learning (meta-RL) algorithms enable agents to adapt quickly to tasks from few samples in dynamic environments. Such a feat is achieved through dynamic representations in an agent's policy network (obtained via reasoning about task context, model parameter updates, or both). However, obtaining rich dynamic representations for fast adaptation beyond simple benchmark problems is challenging due to the burden placed on the policy network to accommodate different policies. This paper addresses the challenge by introducing neuromodulation as a modular component to augment a standard policy network that regulates neuronal activities in order to produce efficient dynamic representations for task adaptation. The proposed extension to the policy network is evaluated across multiple discrete and continuous control environments of increasing complexity. To prove the generality and benefits of the extension in meta-RL, the neuromodulated network was applied to two state-of-the-art meta-RL algorithms (CAVIA and PEARL). The results demonstrate that meta-RL augmented with neuromodulation produces significantly better results and richer dynamic representations in comparison to the baselines.

    Sliced Cramér synaptic consolidation for preserving deeply learned representations

    Deep neural networks suffer from the inability to preserve the learned data representation (i.e., catastrophic forgetting) in domains where the input data distribution is non-stationary and changes during training. Various selective synaptic plasticity approaches have been recently proposed to preserve network parameters, which are crucial for previously learned tasks while learning new tasks. We explore such selective synaptic plasticity approaches through a unifying lens of memory replay and show the close relationship between methods like Elastic Weight Consolidation (EWC) and Memory-Aware-Synapses (MAS). We then propose a fundamentally different class of preservation methods that aim at preserving the distribution of the network’s output at an arbitrary layer for previous tasks while learning a new one. We propose the sliced Cramér distance as a suitable choice for such preservation and evaluate our Sliced Cramér Preservation (SCP) algorithm through extensive empirical investigations on various network architectures in both supervised and unsupervised learning settings. We show that SCP consistently utilizes the learning capacity of the network better than online-EWC and MAS methods on various incremental learning tasks.

    Evolving unipolar memristor spiking neural networks

    © 2015 Taylor & Francis. Neuromorphic computing – brain-like computing in hardware – typically requires myriad complementary metal oxide semiconductor spiking neurons interconnected by a dense mesh of nanoscale plastic synapses. Memristors are frequently cited as strong synapse candidates due to their statefulness and potential for low-power implementations. To date, plentiful research has focused on the bipolar memristor synapse, which is capable of incremental weight alterations and can provide adaptive self-organisation under a Hebbian learning scheme. In this paper, we consider the unipolar memristor synapse – a device capable of non-Hebbian switching between only two states (conductive and resistive) through application of a suitable input voltage – and discuss its suitability for neuromorphic systems. A self-adaptive evolutionary process is used to autonomously find highly fit network configurations. Experimentation on two robotics tasks shows that unipolar memristor networks evolve task-solving controllers faster than both bipolar memristor networks and networks containing constant non-plastic connections whilst performing at least comparably.

    Improving the Segmentation of Scanning Probe Microscope Images using Convolutional Neural Networks

    A wide range of techniques can be considered for segmentation of images of nanostructured surfaces. Manually segmenting these images is time-consuming and results in a user-dependent segmentation bias, while there is currently no consensus on the best automated segmentation methods for particular techniques, image classes, and samples. Any image segmentation approach must minimise the noise in the images to ensure accurate and meaningful statistical analysis can be carried out. Here we develop protocols for the segmentation of images of 2D assemblies of gold nanoparticles formed on silicon surfaces via deposition from an organic solvent. The evaporation of the solvent drives far-from-equilibrium self-organisation of the particles, producing a wide variety of nano- and micro-structured patterns. We show that a segmentation strategy using the U-Net convolutional neural network outperforms traditional automated approaches and has particular potential in the processing of images of nanostructured systems. Comment: 21 pages, 10 figures.

    Indirectly Encoding Neural Plasticity as a Pattern of Local Rules

    Biological brains can adapt and learn from past experience. In neuroevolution, i.e. evolving artificial neural networks (ANNs), one way that agents controlled by ANNs can evolve the ability to adapt is by encoding local learning rules. However, a significant problem with most such approaches is that local learning rules for every connection in the network must be discovered separately. This paper aims to show that learning rules can be effectively indirectly encoded by extending the Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) method. Adaptive HyperNEAT is introduced to allow not only patterns of weights across the connectivity of an ANN to be generated by a function of its geometry, but also patterns of arbitrary learning rules. Several such adaptive models with different levels of generality are explored and compared. The long-term promise of the new approach is to evolve large-scale adaptive ANNs, which is a major goal for neuroevolution. © 2010 Springer-Verlag

    Adaptation of Robot Behaviour through Online Evolution and Neuromodulated Learning

    We propose and evaluate a novel approach to the online synthesis of neural controllers for autonomous robots. We combine online evolution of weights and network topology with neuromodulated learning. We demonstrate our method through a series of simulation-based experiments in which an e-puck-like robot must perform a dynamic concurrent foraging task. In this task, scattered food items periodically change their nutritive value or become poisonous. Our results show that when neuromodulated learning is employed, neural controllers are synthesised faster than by evolution alone. We demonstrate that the online evolutionary process is capable of generating controllers well adapted to the periodic task changes. An analysis of the evolved networks shows that they are characterised by specialised modulatory neurons that exclusively regulate the output neurons.

    Evolving spiking networks with variable resistive memories

    Neuromorphic computing is a brainlike information processing paradigm that requires adaptive learning mechanisms. A spiking neuro-evolutionary system is used for this purpose; plastic resistive memories are implemented as synapses in spiking neural networks. The evolutionary design process exploits parameter self-adaptation and allows the topology and synaptic weights to be evolved for each network in an autonomous manner. Variable resistive memories are the focus of this research; each synapse has its own conductance profile which modifies the plastic behaviour of the device and may be altered during evolution. These variable resistive networks are evaluated on a noisy robotic dynamic-reward scenario against two static resistive memories and a system containing standard connections only. The results indicate that the extra behavioural degrees of freedom available to the networks incorporating variable resistive memories enable them to outperform the comparative synapse types. © 2014 by the Massachusetts Institute of Technology

    Increasing generality in machine learning through procedural content generation

    Procedural Content Generation (PCG) refers to the practice, in videogames and other games, of generating content such as levels, quests, or characters algorithmically. Motivated by the need to make games replayable, as well as to reduce authoring burden, limit storage space requirements, and enable particular aesthetics, a large number of PCG methods have been devised by game developers. Additionally, researchers have explored adapting methods from machine learning, optimization, and constraint solving to PCG problems. Games have been widely used in AI research since the inception of the field, and in recent years have been used to develop and benchmark new machine learning algorithms. Through this practice, it has become more apparent that these algorithms are susceptible to overfitting. Often, an algorithm will not learn a general policy, but instead a policy that will only work for a particular version of a particular task with particular initial parameters. In response, researchers have begun exploring randomization of problem parameters to counteract such overfitting and to allow trained policies to more easily transfer from one environment to another, such as from a simulated robot to a robot in the real world. Here we review the large amount of existing work on PCG, which we believe has an important role to play in increasing the generality of machine learning methods. The main goal here is to present RL/AI with new tools from the PCG toolbox, and a secondary goal is to explain to game developers and researchers a way in which their work is relevant to AI research.