Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks
A long-term goal of AI is to produce agents that can learn a diversity of
skills throughout their lifetimes and continuously improve those skills via
experience. A longstanding obstacle to that goal is catastrophic
forgetting, in which learning new information erases previously learned
information. Catastrophic forgetting occurs in artificial neural networks
(ANNs), which have fueled most recent advances in AI. A recent paper proposed
that catastrophic forgetting in ANNs can be reduced by promoting modularity,
which can limit forgetting by isolating task information to specific clusters
of nodes and connections (functional modules). While the prior work did show
that modular ANNs suffered less from catastrophic forgetting, it was not able
to produce ANNs that possessed task-specific functional modules, thereby
leaving the main theory regarding modularity and forgetting untested. We
introduce diffusion-based neuromodulation, which simulates the release of
diffusing, neuromodulatory chemicals within an ANN that can modulate (i.e.,
up- or down-regulate) learning in a spatial region. On the simple diagnostic
problem from the prior work, diffusion-based neuromodulation 1) induces
task-specific learning in groups of nodes and connections (task-specific
localized learning), which 2) produces functional modules for each subtask, and
3) yields higher performance by eliminating catastrophic forgetting. Overall,
our results suggest that diffusion-based neuromodulation promotes task-specific
localized learning and functional modularity, which can help solve the
challenging but important problem of catastrophic forgetting.
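The spatial gating described above can be sketched in a few lines. This is a minimal illustrative sketch, assuming a Gaussian concentration falloff and a Hebbian learning rule; the function names, parameters, and falloff shape are hypothetical stand-ins, not the paper's exact equations.

```python
import numpy as np

# Illustrative sketch (not the paper's exact equations): a point source
# releases a modulatory chemical whose concentration falls off with distance,
# and that concentration gates how strongly nearby connections learn.
# The Gaussian falloff, Hebbian rule, and all parameter names are assumptions.

def concentration(node_pos, source_pos, strength=1.0, sigma=0.5):
    """Modulatory chemical concentration at a node's spatial location."""
    dist = np.linalg.norm(np.asarray(node_pos) - np.asarray(source_pos))
    return strength * np.exp(-dist ** 2 / (2 * sigma ** 2))

def modulated_hebbian(w, pre, post, node_pos, source_pos, eta=0.1):
    """Hebbian weight change scaled by the local modulatory concentration."""
    return w + eta * concentration(node_pos, source_pos) * pre * post

# Learning is localized: a connection near the source changes substantially,
# while a distant one is left almost untouched -- the ingredient that lets
# task-specific functional modules form.
w_near = modulated_hebbian(0.0, 1.0, 1.0, node_pos=(0.1, 0.0), source_pos=(0.0, 0.0))
w_far = modulated_hebbian(0.0, 1.0, 1.0, node_pos=(5.0, 0.0), source_pos=(0.0, 0.0))
```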
Novelty search creates robots with general skills for exploration
ABSTRACT Novelty Search, a new type of Evolutionary Algorithm, has shown much promise in the last few years. Instead of selecting for phenotypes that are closer to an objective, Novelty Search assigns rewards based on how different the phenotypes are from those already generated. A common criticism of Novelty Search is that it is effectively random or exhaustive search because it tries solutions in an unordered manner until a correct one is found. Its creators respond that over time Novelty Search accumulates information about the environment in the form of skills relevant to reaching uncharted territory, but to date no evidence for that hypothesis has been presented. In this paper we test that hypothesis by transferring robots evolved under Novelty Search to new environments (here, mazes) to see if the skills they've acquired generalize. Three lines of evidence support the claim that Novelty Search agents do indeed learn general exploration skills. First, robot controllers evolved via Novelty Search in one maze and then transferred to a new maze explore significantly more of the new environment than nonevolved (randomly generated) agents. Second, a Novelty Search process to solve the new mazes works significantly faster when seeded with the transferred controllers versus randomly generated ones. Third, no significant difference exists when comparing two types of transferred agents: those evolved in the original maze under (1) Novelty Search vs. (2) a traditional, objective-based fitness function. The evidence gathered suggests that, like traditional Evolutionary Algorithms with objective-based fitness functions, Novelty Search is not a random or exhaustive search process, but instead is accumulating information about the environment, resulting in phenotypes possessing skills needed to explore their world.
Keywords: Novelty Search; Evolutionary Robotics
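The selection signal at the heart of Novelty Search can be sketched briefly: each candidate is rewarded by how far its behavior lies from behaviors already archived, so duplicating past solutions earns nothing. This is a minimal sketch under illustrative assumptions (Euclidean behavior characterizations, k nearest neighbors); the value of k and the distance metric are choices of this example, not of the paper.

```python
import math

# Minimal sketch of the Novelty Search reward (illustrative assumptions:
# Euclidean behavior space, k nearest neighbors): a phenotype is rewarded
# for how far its behavior lies from behaviors already in the archive,
# rather than for progress toward an objective.

def novelty(behavior, archive, k=3):
    """Mean distance from `behavior` to its k nearest archived behaviors."""
    dists = sorted(math.dist(behavior, b) for b in archive)
    return sum(dists[:k]) / min(k, len(dists))

# A behavior far from everything seen so far scores higher than one that
# duplicates past solutions -- the pressure that drives exploration.
archive = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
far_score = novelty((10.0, 10.0), archive)
near_score = novelty((0.5, 0.5), archive)
```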
An illustration of (A) standard and (B) diffusion-based neuromodulation.
<p>For both, the activation of node 3 depends on the activations of nodes 0 and 1 and the connecting weights <i>w</i><sub>3,0</sub> and <i>w</i><sub>3,1</sub> (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.e003" target="_blank">Eq 3</a>). The changes in weights <i>w</i><sub>3,0</sub> and <i>w</i><sub>3,1</sub> rely on the sum of modulatory signals received by node 3 (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.e004" target="_blank">Eq 4</a>). <b>(A)</b> In standard neuromodulation, the modulatory signal comes from the direct connection from the modulatory node 2 (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.e005" target="_blank">Eq 5</a>). <b>(B)</b> In the diffusion-based neuromodulation implementation in this paper, the modulatory signal comes from the concentration gradients released by the point sources. In this example, the modulatory signal of node 3 is determined by its distance to the summer point source.</p>
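The two modulatory-signal sources in this caption can be contrasted in a short sketch using the figure's node numbering. The functional forms below (a linear signal, an exponential falloff) are illustrative assumptions, not the paper's Eqs 3–5.

```python
import math

# Contrast of the two modulatory-signal sources described in the caption,
# using its node numbering. The exact forms (linear signal, exponential
# decay) are illustrative assumptions, not the paper's Eqs 3-5.

def standard_signal(mod_activation, w_3_2):
    # (A) Standard neuromodulation: node 3's modulatory signal arrives over
    # a direct connection from modulatory node 2, scaled by that weight.
    return mod_activation * w_3_2

def diffusion_signal(node3_pos, source_pos, strength=1.0, decay=1.0):
    # (B) Diffusion-based neuromodulation: the signal is the concentration
    # of a chemical released by a point source, decaying with node 3's
    # distance from that source.
    return strength * math.exp(-decay * math.dist(node3_pos, source_pos))

# In (B), moving node 3 farther from the point source weakens its signal.
close = diffusion_signal((1.0, 0.0), (0.0, 0.0))
far = diffusion_signal((3.0, 0.0), (0.0, 0.0))
```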
Diffusion treatments outperform non-diffusion treatments across various metrics.
<p><b>(A)</b> Across all generations diffusion treatments (<i>PA</i>_<i>D</i> & <i>PCC</i>_<i>D</i>) achieve significantly higher (<i>p</i> < 0.001) fitness than non-diffusion (<i>PA</i> & <i>PCC</i>) treatments. <b>(B)</b> Diffusion treatments maintain consistent fitness over their lifetime after the first two seasons, indicating they remember how to solve a task even after they have not performed that task for an entire season. Non-diffusion treatments do not. <b>(C)</b> Diffusion treatments have significantly higher (<i>p</i> < 0.001) Retained Percentages and Perfect (i.e. know both summer and winter) seasonal associations than non-diffusion treatments. <b>(D)</b> Diffusion treatments possess significantly higher (<i>p</i> < 0.001) testing fitness than non-diffusion treatments. Throughout the paper, all statistics are computed with the Mann-Whitney U test. Markers below line plots indicate a significant difference (<i>p</i> < 0.001) between PA_D and the other treatments at the corresponding data point. For all bar plots, except when stated, a significance bar labeled with “***” is placed between bars that are significant at the level of <i>p</i> < 0.001. Lastly, the summary value and confidence intervals for all plots in this paper are the median and the 75th and 25th percentiles, respectively.</p>
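The statistical conventions in this caption (Mann-Whitney U test, median with 25th/75th percentiles as summary values) can be illustrated with a small stdlib-only sketch. The fitness values below are synthetic stand-ins, not the paper's data, and the hand-rolled U statistic is shown only to make the test's pairwise-comparison definition concrete.

```python
import statistics

# Sketch of the comparison pipeline the caption describes: a Mann-Whitney U
# statistic over two treatments' fitness samples, plus the median and
# 25th/75th percentiles used as summary value and interval. All numbers
# below are synthetic examples, not results from the paper.

def mann_whitney_u(a, b):
    """U = number of pairs (x, y), x in a, y in b, with x > y (ties count 1/2)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)

diffusion = [0.91, 0.88, 0.93, 0.90, 0.89]        # hypothetical fitness values
non_diffusion = [0.61, 0.58, 0.65, 0.60, 0.59]

u = mann_whitney_u(diffusion, non_diffusion)      # every pair favors diffusion
q1, median, q3 = statistics.quantiles(diffusion, n=4)
```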
Combination of functional subnetworks to produce functional modules.
<p><b>(A)</b> ARK identifies the functional subnetworks for summer and winter, red and blue connections respectively, in an ANN. <b>(B)</b> Non-functional connections are removed. Functional subnetworks are combined to produce the final functional modules for summer (red connections), winter (blue connections), and common (green connections). <b>(C)</b> As a final visualization technique, connections from bias nodes are made thin.</p>
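The combination step in this caption can be sketched with plain set operations, treating each functional subnetwork as a set of connections. The edge tuples are a hypothetical representation for illustration, not ARK's actual data structures.

```python
# Sketch of combining functional subnetworks into functional modules,
# treating each subnetwork as a set of (source, target) connections.
# The edge labels here are hypothetical examples.

summer_subnet = {(0, 3), (1, 3), (3, 5)}   # connections functional in summer
winter_subnet = {(0, 4), (1, 4), (3, 5)}   # connections functional in winter

common_module = summer_subnet & winter_subnet   # green: shared by both tasks
summer_module = summer_subnet - common_module   # red: summer-specific
winter_module = winter_subnet - common_module   # blue: winter-specific
```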
Diffusion treatments change only connections that will become the summer and winter functional modules in those respective seasons, while non-diffusion treatments change either all connections (<i>PA</i>) or only common connections every season (<i>PCC</i>).
<p>Note, weight change also occurs in the other connections within range of the point sources, but we plot only connections that eventually become functional and encode task information. Bars are not statistically compared to one another.</p>
High-performing networks are differentiated from low-performing networks through the presence of distinct functional modules in Core Functional Networks (CFNs), network features not seen when just examining the original, non-simplified ANNs.
<p><b>(A)</b> Original ANNs for the networks with the best and worst test fitness. Inset text is the training fitness (trainF), testing fitness (testF), and structural modularity of the original ANN (origM) averaged over all 80 environments in the post-evolution analysis. Superficially there is nothing that distinguishes networks that have the best testing fitness. <b>(B)</b> One example CFN for each of the corresponding ANNs from A. Inset text is the structural modularity of the original ANN (origM), training fitness (trainF), testing fitness (testF), CFN fitness (cfnF), and CFN modularity (cfnM) for the environment that produced the CFN. High-performing networks possess sparse CFNs with either two distinct functional modules (red and blue) that form separate paths, or a common functional module (green) that branches off into two distinct functional modules. Low-performing networks possess CFNs that are much more entangled, or do not connect to the decision input bits (input nodes marked with “D”) or season outputs. Structural modularity is quantified with the Q-Score metric [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.ref040" target="_blank">40</a>]. 20 additional CFNs for the best and worst individuals are provided in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.s005" target="_blank">S4</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.s006" target="_blank">S5</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.s007" target="_blank">S6</a> and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.s008" target="_blank">S7</a> Figs. 
For diffusion ANNs the locations of the point sources are indicated by small, purple, filled circles (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0187736#pone.0187736.s002" target="_blank">S1 Fig</a>) and the modulatory nodes for non-diffusion ANNs are indicated by circles with thick white borders. Nodes whose activation variance is below 1.0 × 10<sup>−9</sup> are deemed to be bias nodes and are visualized with thin, outgoing connections.</p>
The first four steps of the ARK procedure for finding the summer functional network.
<p>The first four steps of the ARK procedure for finding the summer functional network.</p>