4 research outputs found
Combating catastrophic forgetting with developmental compression
Generally intelligent agents exhibit successful behavior across problems in
several settings. Endemic in approaches to realize such intelligence in
machines is catastrophic forgetting: sequential learning corrupts knowledge
obtained earlier in the sequence, or tasks antagonistically compete for system
resources. Methods for obviating catastrophic forgetting have sought to
identify and preserve features of the system necessary to solve one problem
when learning to solve another, or to enforce modularity such that minimally
overlapping sub-functions contain task specific knowledge. While successful,
both approaches scale poorly because they require larger architectures as the
number of training instances grows, causing different parts of the system to
specialize for separate subsets of the data. Here we present a method for
addressing catastrophic forgetting called developmental compression. It
exploits the mild impacts of developmental mutations to lessen adverse changes
to previously-evolved capabilities and `compresses' specialized neural networks
into a generalized one. In the absence of domain knowledge, developmental
compression produces systems that avoid overt specialization, alleviating the
need to engineer a bespoke system for every task permutation and suggesting
better scalability than existing approaches. We validate this method on a robot
control problem and hope to extend this approach to other machine learning
domains in the future
Developing Toward Generality: Combating Catastrophic Forgetting with Developmental Compression
General intelligence is the exhibition of intelligent behavior across multiple problems in a variety of settings, however intelligence is defined and measured.
Endemic in approaches to realize such intelligence in machines is catastrophic forgetting, in which sequential learning corrupts knowledge obtained earlier in the sequence or in which tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have either sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful in some domains, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data.
Presented here is a method called developmental compression that addresses catastrophic forgetting in the neural networks of embodied agents. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously evolved capabilities and `compresses\u27 specialized neural networks into a single generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation, and does so in a way that suggests better scalability than existing approaches. This method is validated on a robot control problem and may be extended to other machine learning domains in the future
Evolving Spatially Aggregated Features for Regional Modeling and its Application to Satellite Imagery
Satellite imagery and remote sensing provide explanatory variables at relatively high resolutions for modeling geospatial phenomena, yet regional summaries are often desirable for analysis and actionable insight. In this paper, we propose a novel method of inducing spatial aggregations as a component of the statistical learning process, yielding regional model features whose construction is driven by model prediction performance rather than prior assumptions. Our results demonstrate that Genetic Programming is particularly well suited to this type of feature construction because it can automatically synthesize appropriate aggregations, as well as better incorporate them into predictive models compared to other regression methods we tested. In our experiments we consider a specific problem instance and real-world dataset relevant to predicting snow properties in high-mountain Asia