4 research outputs found

    Combating catastrophic forgetting with developmental compression

    Full text link
    Generally intelligent agents exhibit successful behavior across problems in several settings. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting: sequential learning corrupts knowledge obtained earlier in the sequence, or tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or to enforce modularity such that minimally overlapping sub-functions contain task specific knowledge. While successful, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Here we present a method for addressing catastrophic forgetting called developmental compression. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously-evolved capabilities and `compresses' specialized neural networks into a generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation and suggesting better scalability than existing approaches. We validate this method on a robot control problem and hope to extend this approach to other machine learning domains in the future

    Developing Toward Generality: Combating Catastrophic Forgetting with Developmental Compression

    Get PDF
    General intelligence is the exhibition of intelligent behavior across multiple problems in a variety of settings, however intelligence is defined and measured. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting, in which sequential learning corrupts knowledge obtained earlier in the sequence or in which tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have either sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful in some domains, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Presented here is a method called developmental compression that addresses catastrophic forgetting in the neural networks of embodied agents. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously evolved capabilities and `compresses\u27 specialized neural networks into a single generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation, and does so in a way that suggests better scalability than existing approaches. This method is validated on a robot control problem and may be extended to other machine learning domains in the future

    Evolving Spatially Aggregated Features for Regional Modeling and its Application to Satellite Imagery

    Get PDF
    Satellite imagery and remote sensing provide explanatory variables at relatively high resolutions for modeling geospatial phenomena, yet regional summaries are often desirable for analysis and actionable insight. In this paper, we propose a novel method of inducing spatial aggregations as a component of the statistical learning process, yielding regional model features whose construction is driven by model prediction performance rather than prior assumptions. Our results demonstrate that Genetic Programming is particularly well suited to this type of feature construction because it can automatically synthesize appropriate aggregations, as well as better incorporate them into predictive models compared to other regression methods we tested. In our experiments we consider a specific problem instance and real-world dataset relevant to predicting snow properties in high-mountain Asia
    corecore