35 research outputs found
Context Meta-Reinforcement Learning via Neuromodulation
Meta-reinforcement learning (meta-RL) algorithms enable agents to adapt
quickly to tasks from few samples in dynamic environments. Such a feat is
achieved through dynamic representations in an agent's policy network (obtained
via reasoning about task context, model parameter updates, or both). However,
obtaining rich dynamic representations for fast adaptation beyond simple
benchmark problems is challenging due to the burden placed on the policy
network to accommodate different policies. This paper addresses the challenge
by introducing neuromodulation as a modular component that augments a standard
policy network, regulating neuronal activities in order to produce efficient
dynamic representations for task adaptation. The proposed extension to the
policy network is evaluated across multiple discrete and continuous control
environments of increasing complexity. To prove the generality and benefits of
the extension in meta-RL, the neuromodulated network was applied to two
state-of-the-art meta-RL algorithms (CAVIA and PEARL). The results demonstrate
that meta-RL augmented with neuromodulation produces significantly better
results and richer dynamic representations than the baselines.
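The core mechanism described above, a modulatory signal derived from task context that up- or down-regulates neuronal activities in the policy network, can be sketched as a simple gated layer. All names, shapes, and the sigmoid gating choice below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def neuromodulated_layer(x, W, W_mod, context):
    """One hidden layer whose activations are gated multiplicatively by a
    neuromodulatory signal computed from a task-context vector.
    Illustrative sketch, not the exact model from the paper."""
    h = np.tanh(x @ W)                                # standard activations
    gate = 1.0 / (1.0 + np.exp(-(context @ W_mod)))   # sigmoid gate in (0, 1)
    return h * gate                                   # per-neuron regulation

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # observation features
context = rng.normal(size=3)  # inferred task context (e.g. CAVIA-style context params)
W = rng.normal(size=(4, 8))
W_mod = rng.normal(size=(3, 8))
h = neuromodulated_layer(x, W, W_mod, context)
print(h.shape)  # (8,)
```

Because the gate depends only on the context vector, changing the task context reshapes the effective policy without retraining the base weights, which is the kind of dynamic representation the abstract refers to.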
Sliced Cramer synaptic consolidation for preserving deeply learned representations
Deep neural networks suffer from the inability to preserve the learned data representation (i.e., catastrophic forgetting) in domains where the input data distribution is non-stationary, and it changes during training. Various selective synaptic
plasticity approaches have recently been proposed to preserve the network parameters that are crucial for previously learned tasks while learning new tasks.
We explore such selective synaptic plasticity approaches through a unifying lens
of memory replay and show the close relationship between methods like Elastic
Weight Consolidation (EWC) and Memory-Aware-Synapses (MAS). We then propose a fundamentally different class of preservation methods that aim at preserving the distribution of the network’s output at an arbitrary layer for previous tasks
while learning a new one. We propose the sliced Cramér distance as a suitable
choice for such preservation and evaluate our Sliced Cramér Preservation (SCP)
algorithm through extensive empirical investigations on various network architectures in both supervised and unsupervised learning settings. We show that SCP
consistently utilizes the learning capacity of the network better than online-EWC
and MAS methods on various incremental learning tasks.
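The sliced distance at the heart of SCP can be approximated empirically: project the layer's outputs onto random directions and average the one-dimensional Cramér (ℓ2) distance between the projected distributions. The sketch below is a minimal empirical estimator under that reading of the abstract, not the authors' implementation:

```python
import numpy as np

def cramer_1d(a, b):
    """Squared Cramer (ell_2) distance between two 1-D empirical
    distributions: integral of (F_a - F_b)^2 over the real line."""
    grid = np.sort(np.concatenate([a, b]))
    Fa = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    Fb = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    dt = np.diff(grid)
    # CDFs are piecewise constant between grid points
    return float(np.sum((Fa[:-1] - Fb[:-1]) ** 2 * dt))

def sliced_cramer(X, Y, n_proj=64, seed=0):
    """Average 1-D Cramer distance over random unit projections.
    X, Y: (n_samples, d) outputs of a chosen layer, e.g. under old
    vs. new parameters. Illustrative stand-in for the SCP objective."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=X.shape[1])
        theta /= np.linalg.norm(theta)
        total += cramer_1d(X @ theta, Y @ theta)
    return total / n_proj

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
print(sliced_cramer(X, X))            # identical distributions -> 0.0
print(sliced_cramer(X, X + 1.0) > 0)  # shifted distributions -> positive
```

Penalizing this quantity between a layer's outputs before and after an update pressures the network to preserve the *distribution* of its representations, rather than individual parameters as EWC and MAS do.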
Evolving unipolar memristor spiking neural networks
© 2015 Taylor & Francis. Neuromorphic computing – brain-like computing in hardware – typically requires myriad complementary metal oxide semiconductor spiking neurons interconnected by a dense mesh of nanoscale plastic synapses. Memristors are frequently cited as strong synapse candidates due to their statefulness and potential for low-power implementations. To date, plentiful research has focused on the bipolar memristor synapse, which is capable of incremental weight alterations and can provide adaptive self-organisation under a Hebbian learning scheme. In this paper, we consider the unipolar memristor synapse – a device capable of non-Hebbian switching between only two states (conductive and resistive) through application of a suitable input voltage – and discuss its suitability for neuromorphic systems. A self-adaptive evolutionary process is used to autonomously find highly fit network configurations. Experimentation on two robotics tasks shows that unipolar memristor networks evolve task-solving controllers faster than both bipolar memristor networks and networks containing constant non-plastic connections, whilst performing at least comparably.
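The combination described above, two-state synapses searched by a self-adaptive evolutionary process, can be illustrated with a minimal (1+1) evolution strategy over binary connection states, where the per-synapse switch probability is itself part of the genome. The fitness function here is a stand-in (matching a target wiring pattern); the paper evaluates controllers on robotics tasks:

```python
import random

def evolve_unipolar(n_synapses=32, generations=300, seed=0):
    """Self-adaptive (1+1) evolution over two-state (on/off) synapses.
    The mutation rate is log-normally perturbed and inherited only when
    the child is accepted, mirroring self-adaptive parameter control.
    Illustrative sketch, not the paper's full neuro-evolutionary system."""
    rng = random.Random(seed)
    target = [rng.random() < 0.5 for _ in range(n_synapses)]
    fitness = lambda g: sum(a == b for a, b in zip(g, target))

    genome = [rng.random() < 0.5 for _ in range(n_synapses)]
    rate = 0.1                       # per-synapse switch probability, evolved
    best = fitness(genome)
    for _ in range(generations):
        new_rate = min(0.5, max(1e-3, rate * 2.0 ** rng.uniform(-1, 1)))
        child = [(not s) if rng.random() < new_rate else s for s in genome]
        f = fitness(child)
        if f >= best:                # accept ties to allow neutral drift
            genome, rate, best = child, new_rate, f
    return best, n_synapses

best, n = evolve_unipolar()
print(best, "/", n)
```

Because each synapse only switches between two states, the search space is purely combinatorial, which is what makes a simple evolutionary process with a self-adapting mutation rate a natural fit.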
Improving the Segmentation of Scanning Probe Microscope Images using Convolutional Neural Networks
A wide range of techniques can be considered for segmentation of images of
nanostructured surfaces. Manually segmenting these images is time-consuming and
results in a user-dependent segmentation bias, while there is currently no
consensus on the best automated segmentation methods for particular techniques,
image classes, and samples. Any image segmentation approach must minimise the
noise in the images to ensure accurate and meaningful statistical analysis can
be carried out. Here we develop protocols for the segmentation of images of 2D
assemblies of gold nanoparticles formed on silicon surfaces via deposition from
an organic solvent. The evaporation of the solvent drives far-from-equilibrium
self-organisation of the particles, producing a wide variety of nano- and
micro-structured patterns. We show that a segmentation strategy using the U-Net
convolutional neural network outperforms traditional automated approaches and
has particular potential in the processing of images of nanostructured systems.
Indirectly Encoding Neural Plasticity as a Pattern of Local Rules
Biological brains can adapt and learn from past experience. In neuroevolution, i.e. evolving artificial neural networks (ANNs), one way that agents controlled by ANNs can evolve the ability to adapt is by encoding local learning rules. However, a significant problem with most such approaches is that local learning rules for every connection in the network must be discovered separately. This paper aims to show that learning rules can be effectively indirectly encoded by extending the Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) method. Adaptive HyperNEAT is introduced to allow not only patterns of weights across the connectivity of an ANN to be generated by a function of its geometry, but also patterns of arbitrary learning rules. Several such adaptive models with different levels of generality are explored and compared. The long-term promise of the new approach is to evolve large-scale adaptive ANNs, which is a major goal for neuroevolution. © 2010 Springer-Verlag
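The key idea above, generating learning rules as a function of connection geometry rather than evolving one rule per connection, is often expressed with a generalised Hebbian "ABCD" update whose coefficients come from a CPPN. The fixed function standing in for the CPPN below is purely illustrative (a real HyperNEAT CPPN is an evolved network):

```python
import numpy as np

def cppn(coords):
    """Stand-in CPPN: maps the geometric coordinates of a connection
    (x_pre, y_pre, x_post, y_post) to Hebbian rule coefficients
    (A, B, C, D) and a learning rate. Illustrative only."""
    x1, y1, x2, y2 = coords
    A = np.sin(x1 * x2)        # correlation term coefficient
    B = 0.1 * np.cos(y1)       # presynaptic term
    C = 0.1 * np.cos(y2)       # postsynaptic term
    D = 0.01 * (x1 - x2)       # constant bias term
    return A, B, C, D, 0.05

def local_update(w, pre, post, coords):
    """Generalised Hebbian ABCD rule, with coefficients generated from
    connection geometry instead of being encoded per connection."""
    A, B, C, D, eta = cppn(coords)
    return w + eta * (A * pre * post + B * pre + C * post + D)

w = 0.5
for _ in range(10):
    w = local_update(w, pre=1.0, post=0.8, coords=(0.2, -0.5, 0.7, 0.1))
print(round(w, 4))
```

Because one compact function produces the rule for every connection, the encoding scales to large networks: nearby connections get smoothly related plasticity, which is the regularity the indirect encoding exploits.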
Adaptation of Robot Behaviour through Online Evolution and Neuromodulated Learning
We propose and evaluate a novel approach to the online synthesis of neural controllers for autonomous robots. We combine online evolution of weights and network topology with neuromodulated learning. We demonstrate our method through a series of simulation-based experiments in which an e-puck-like robot must perform a dynamic concurrent foraging task. In this task, scattered food items periodically change their nutritive value or become poisonous. Our results show that when neuromodulated learning is employed, neural controllers are synthesised faster than by evolution alone. We demonstrate that the online evolutionary process is capable of generating controllers well adapted to the periodic task changes. An analysis of the evolved networks shows that they are characterised by specialised modulatory neurons that exclusively regulate the output neurons.
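The modulatory-neuron mechanism described above is commonly modelled as a Hebbian update gated by a modulatory signal: when the signal is near zero the weight is frozen, and when it flips sign learning reverses. The sketch below shows how such gating lets a controller unlearn an association when a food item turns poisonous; the shapes and constants are illustrative, not the paper's controller:

```python
import numpy as np

def modulated_step(w, pre, post, m, eta=0.1):
    """Hebbian update gated by a modulatory signal m. Sign and magnitude
    of m control whether, and in which direction, the synapse changes."""
    return w + eta * np.tanh(m) * pre * post

w = 0.0
for _ in range(20):                      # nutritious phase: build association
    w = modulated_step(w, 1.0, 1.0, m=2.0)
for _ in range(20):                      # poisonous phase: modulation flips sign
    w = modulated_step(w, 1.0, 1.0, m=-2.0)
print(round(w, 6))
```

Reversing the modulatory signal drives the weight back down through the same local rule, with no weight reset or re-evolution, which is why neuromodulated controllers track the periodic task changes faster than evolution alone.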
Evolving spiking networks with variable resistive memories
Neuromorphic computing is a brainlike information processing paradigm that requires adaptive learning mechanisms. A spiking neuro-evolutionary system is used for this purpose; plastic resistive memories are implemented as synapses in spiking neural networks. The evolutionary design process exploits parameter self-adaptation and allows the topology and synaptic weights to be evolved for each network in an autonomous manner. Variable resistive memories are the focus of this research; each synapse has its own conductance profile which modifies the plastic behaviour of the device and may be altered during evolution. These variable resistive networks are evaluated on a noisy robotic dynamic-reward scenario against two static resistive memories and a system containing standard connections only. The results indicate that the extra behavioural degrees of freedom available to the networks incorporating variable resistive memories enable them to outperform the comparative synapse types. © 2014 by the Massachusetts Institute of Technology
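The "conductance profile" idea, each synapse carrying its own evolvable parameter that shapes how its conductance responds to identical stimuli, can be illustrated with a soft-bounded plasticity model. This toy model is an assumption for illustration, not the device physics used in the paper:

```python
class VariableMemristorSynapse:
    """Toy plastic synapse whose response to identical pulses depends on a
    per-device profile parameter p; p is part of the genome and may be
    mutated during evolution. Illustrative model only."""
    def __init__(self, w=0.5, p=1.0):
        self.w = w   # normalised conductance in [0, 1]
        self.p = p   # profile: shapes the nonlinearity of updates

    def pulse(self, direction):
        # Soft-bounded update: step size shrinks near the rails,
        # at a rate controlled by the device's own profile p.
        if direction > 0:
            self.w += 0.1 * (1.0 - self.w) ** self.p
        else:
            self.w -= 0.1 * self.w ** self.p
        self.w = min(1.0, max(0.0, self.w))
        return self.w

linear = VariableMemristorSynapse(p=1.0)
saturating = VariableMemristorSynapse(p=3.0)
for _ in range(10):
    linear.pulse(+1)
    saturating.pulse(+1)
print(round(linear.w, 3), round(saturating.w, 3))
```

Two devices receiving identical pulse trains end up with different conductances purely because of their profiles, which is the extra behavioural degree of freedom the abstract credits for the performance gain.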
Increasing generality in machine learning through procedural content generation
Procedural Content Generation (PCG) refers to the practice, in videogames and
other games, of generating content such as levels, quests, or characters
algorithmically. Motivated by the need to make games replayable, as well as to
reduce authoring burden, limit storage space requirements, and enable
particular aesthetics, a large number of PCG methods have been devised by game
developers. Additionally, researchers have explored adapting methods from
machine learning, optimization, and constraint solving to PCG problems. Games
have been widely used in AI research since the inception of the field, and in
recent years have been used to develop and benchmark new machine learning
algorithms. Through this practice, it has become more apparent that these
algorithms are susceptible to overfitting. Often, an algorithm will not learn a
general policy, but instead a policy that will only work for a particular
version of a particular task with particular initial parameters. In response,
researchers have begun exploring randomization of problem parameters to
counteract such overfitting and to allow trained policies to more easily
transfer from one environment to another, such as from a simulated robot to a
robot in the real world. Here we review the large amount of existing work on
PCG, which we believe has an important role to play in increasing the
generality of machine learning methods. Our main goal is to present the RL/AI
community with new tools from the PCG toolbox; our secondary goal is to explain
to game developers and researchers how their work is relevant to AI research.
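The randomization strategy discussed above, regenerating the task each episode so a policy cannot memorize one fixed layout, reduces to a seeded content generator in its simplest form. The generator below is a minimal illustrative sketch, not any specific paper's method:

```python
import random

def generate_level(seed, width=10, height=6, wall_density=0.2):
    """Minimal PCG level generator: a seeded grid of floors ('.') and
    walls ('#') with start 'S' and goal 'G'. A fresh seed per episode
    randomises the task so a policy must generalise across layouts."""
    rng = random.Random(seed)
    grid = [["#" if rng.random() < wall_density else "."
             for _ in range(width)] for _ in range(height)]
    grid[0][0], grid[-1][-1] = "S", "G"   # keep start and goal clear
    return ["".join(row) for row in grid]

# Domain randomisation: a fresh level per training episode.
for episode in range(3):
    level = generate_level(seed=episode)
    # train_one_episode(policy, level)    # hypothetical training call

print(generate_level(seed=0) == generate_level(seed=0))  # reproducible per seed
print(generate_level(seed=0) == generate_level(seed=1))  # varies across seeds
```

Seeding makes each generated task reproducible for evaluation, while varying the seed during training provides the distribution of environments that counteracts overfitting to a single task instance.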