202 research outputs found

    Combating catastrophic forgetting with developmental compression

    Full text link
    Generally intelligent agents exhibit successful behavior across problems in several settings. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting: sequential learning corrupts knowledge obtained earlier in the sequence, or tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or to enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Here we present a method for addressing catastrophic forgetting called developmental compression. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously-evolved capabilities and 'compresses' specialized neural networks into a generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation and suggesting better scalability than existing approaches. We validate this method on a robot control problem and hope to extend this approach to other machine learning domains in the future.

    Developing Toward Generality: Combating Catastrophic Forgetting with Developmental Compression

    Get PDF
    General intelligence is the exhibition of intelligent behavior across multiple problems in a variety of settings, however intelligence is defined and measured. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting, in which sequential learning corrupts knowledge obtained earlier in the sequence or in which tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have either sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful in some domains, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Presented here is a method called developmental compression that addresses catastrophic forgetting in the neural networks of embodied agents. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously evolved capabilities and 'compresses' specialized neural networks into a single generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation, and does so in a way that suggests better scalability than existing approaches. This method is validated on a robot control problem and may be extended to other machine learning domains in the future.
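    To make the idea concrete, the following is a minimal, illustrative Python sketch of the compression mechanism described above; it is not the authors' implementation, and the linear development schedule, the compression rate, and the function names are assumptions made for illustration. Each specialized controller keeps its own weight vector, a shared generalized vector is nudged toward all of them, and developmental interpolation lets a controller begin its lifetime expressing the shared weights before gradually expressing its specialized ones, so that differences between the two have only a mild early-life impact.

        import numpy as np

        def developmental_weights(shared, specialized, t, lifetime):
            # Linearly develop from the shared genome toward the specialized
            # weights over the agent's lifetime (t runs from 0 to lifetime).
            alpha = t / float(lifetime)
            return (1.0 - alpha) * shared + alpha * specialized

        def compress(shared, specialized_sets, rate=0.05):
            # Nudge the shared genome toward the mean of the specialized
            # controllers, and each specialized controller toward the shared
            # genome, gradually collapsing them into one generalized network.
            mean_specialized = np.mean(specialized_sets, axis=0)
            new_shared = shared + rate * (mean_specialized - shared)
            new_specialized = [w + rate * (new_shared - w) for w in specialized_sets]
            return new_shared, new_specialized

        # Toy usage: two task-specific controllers slowly merge into one.
        rng = np.random.default_rng(0)
        shared = rng.normal(size=8)
        specialized = [shared + rng.normal(scale=0.5, size=8) for _ in range(2)]
        for _ in range(200):
            shared, specialized = compress(shared, specialized)
        early_life = developmental_weights(shared, specialized[0], t=1, lifetime=100)

    In this toy run, repeated compression collapses the two task-specific weight vectors into near-copies of the shared network, which is the sense in which specialized networks are compressed into a single generalized one.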

    Exploring the effects of robotic design on learning and neural control

    Full text link
    The ongoing deep learning revolution has allowed computers to outclass humans in various games and perceive features imperceptible to humans during classification tasks. Current machine learning techniques have clearly distinguished themselves in specialized tasks. However, we have yet to see robots capable of performing multiple tasks at an expert level. Most work in this field is focused on the development of more sophisticated learning algorithms for a robot's controller given a largely static and presupposed robotic design. By focusing on the development of robotic bodies, rather than neural controllers, I have discovered that robots can be designed such that they overcome many of the current pitfalls encountered by neural controllers in multitask settings. Through this discovery, I also present novel metrics to explicitly measure the learning ability of a robotic design and its resistance to common problems such as catastrophic interference. Traditionally, physical robot design requires human engineers to plan every aspect of the system, which is expensive and often relies on human intuition. In contrast, within the field of evolutionary robotics, evolutionary algorithms are used to automatically create optimized designs; however, such designs are often still limited in their ability to perform in a multitask setting. The metrics created and presented here give a novel path to automated design that allows evolved robots to synergize with their controller to improve the computational efficiency of their learning while overcoming catastrophic interference. Overall, this dissertation intimates the ability to automatically design robots that are more general purpose than current robots and that can perform various tasks while requiring less computation. (arXiv admin note: text overlap with arXiv:2008.0639.)
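    As an illustration of how resistance to catastrophic interference might be quantified, a common recipe is to evaluate a controller on task A before and after it is subsequently trained on task B and report the relative drop. The Python sketch below follows that generic recipe; it is a hedged example only, since the dissertation's own metrics may be defined differently, and train_fn, eval_fn, and the normalization are assumptions.

        def interference(train_fn, eval_fn, controller, task_a, task_b):
            # Illustrative forgetting measure: performance on task A before
            # and after the same controller is trained on task B.
            # train_fn(controller, task) trains in place; eval_fn returns a
            # non-negative score (higher is better). Both are user-supplied.
            train_fn(controller, task_a)
            score_before = eval_fn(controller, task_a)
            train_fn(controller, task_b)
            score_after = eval_fn(controller, task_a)
            # 0.0 means no forgetting; 1.0 means task-A skill was lost entirely.
            return (score_before - score_after) / max(score_before, 1e-8)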

    Adaptation in Deep Learning Models: Algorithms and Applications

    Get PDF
    Artificial intelligence has succeeded in matching or even surpassing human abilities in tasks such as recognizing images, playing games, and understanding language. At present, powerful machine learning models learn from data in a stationary environment, while humans are capable of learning under dynamic, changing, and sequential conditions. In pursuing the idea of open-ended learning for machine intelligence, we contribute algorithms and analyses for building generally capable models via adaptation. In this thesis, the model adaptation problem is defined as the difficulty intelligent machines face in learning to modify their behavior for new purposes or new uses. The ultimate goal is to develop machine intelligence that can adapt itself not only by following our instructions but also by understanding its environment. Our work lies in the area of deep neural networks and transfer learning, and we divide the development of adaptive models into four major problems: (1) few-shot learning, (2) fast model adaptation, (3) continual learning, and (4) architecture search. In few-shot learning, a model is expected to change its behavior when facing a new context or an unseen task with limited data; a related problem is adapting quickly from only a few examples. In continual learning, the model must adapt sequentially as new tasks arrive. In architecture search, we look for a high-performing configuration of connections among the nodes of a model. To approach few-shot learning, we adopt a transfer learning strategy that applies a pretrained Convolutional Neural Network (CNN) to novel tasks with limited annotations. Inspired by the success of subspace methods for visual recognition, we develop a subspace-based classifier that improves generalization to novel concepts. We also investigate few-shot learning in multi-label classification and propose a multi-label propagation technique that constructs a graph from the representations of support samples. For fast model adaptation, we use the idea of preconditioners from optimization. Specifically, the problem is framed as meta-learning, where the agent must learn a family of tasks and adapt quickly to a new one. Our algorithm uses a non-linear function to generate a preconditioner that modulates the gradient when updating the model, and our experiments show faster convergence than other types of preconditioners on the same problems. In continual learning, the model must sequentially learn and adapt its network parameters to new tasks without forgetting previously learned ones. To this end, we investigate a knowledge distillation approach in which the old model guides the current model to balance the current task against prior tasks; our approach models the transition between two tasks using a geodesic flow, and the objective maximizes the similarity of the responses projected along that flow. In neural architecture search, the optimal architecture depends on the task objectives, and the search is not trivial when the data annotations are noisy. We investigate the impact of label noise on finding the best-performing architecture while reducing the performance deterioration caused by overfitting to noisy labels. We use a mutual information bottleneck to design a noise-injection module that alleviates the impact of learning under label noise. In summary, this thesis addresses major problems in model adaptation, including few-shot learning, meta-learning, continual learning, and neural architecture search. The solutions contribute to the arsenal of model adaptation algorithms, and the analyses shed light on essential aspects of adaptation strategies.
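    To make the subspace idea for few-shot learning concrete, the Python sketch below builds a low-dimensional subspace for each class from the CNN embeddings of its support samples and assigns a query to the class whose subspace reconstructs it with the smallest residual. This is an illustrative sketch of a generic subspace classifier, not the thesis implementation; the feature dictionary layout, the choice of three subspace dimensions, and the function names are assumptions.

        import numpy as np

        def class_subspaces(support_features, n_dims=3):
            # support_features maps a class label to an (n_shot, d) array of
            # CNN embeddings; returns a (mean, basis) pair per class, where
            # the basis spans a low-dimensional subspace of the centred features.
            subspaces = {}
            for label, feats in support_features.items():
                mean = feats.mean(axis=0)
                # Right-singular vectors of the centred features span the subspace.
                _, _, vt = np.linalg.svd(feats - mean, full_matrices=False)
                subspaces[label] = (mean, vt[:n_dims])
            return subspaces

        def classify(query, subspaces):
            # Assign the query embedding to the class whose subspace
            # reconstructs it with the smallest residual error.
            best_label, best_err = None, np.inf
            for label, (mean, basis) in subspaces.items():
                centred = query - mean
                residual = centred - basis.T @ (basis @ centred)
                err = np.linalg.norm(residual)
                if err < best_err:
                    best_label, best_err = label, err
            return best_label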

    The Journal of Undergraduate Research: Volume 09

    Get PDF
    This is the complete issue of the South Dakota State University Journal of Undergraduate Research, Volume 13

    Lifelong learning : rhetoric and meaning

    Get PDF

    Speaking of Ralph

    Get PDF
    A. Robert Lee interviews Ingrid Wendt

    Wounds and writing : building trauma-informed approaches to writing pedagogy.

    Get PDF
    This dissertation builds a trauma-informed approach to writing pedagogy informed by writing studies scholarship about trauma and inclusive pedagogy, clinical social work literature on trauma-informed care, and interviews with nine current University of Louisville writing faculty about their experiences academically supporting distressed students. I identify three central touchstones—“students are coddled,” “teachers aren’t therapists,” and “institutions don’t support trauma-informed teaching”—in scholarly and public debates regarding what to do about student trauma/distress in higher education. After exploring the valid concerns and misconceptions underpinning these touchstones, I illustrate how clinical research offers a way forward to help writing instructors develop more complex understandings of and responses to trauma’s impact on their classrooms. I conclude by describing six criteria that define Trauma-Informed Writing Pedagogy (TIWP), an approach to writing instruction that faculty and administrators can adapt to their own teaching styles and contexts. Appendix 2 describes TIWP in detail, offering suggestions, resources, and other materials. This instructional approach has important implications for fostering inclusive pedagogies and responding to mental health crises across college campuses.