
    Frameworks for generalisation of flexible behaviour: insights from mouse behaviour and neural computation

    It would be a daunting task to learn everything from scratch each time we are faced with a new problem. Instead, humans and other animals are helped by the fact that few situations in life are truly novel. Thus, we can draw upon previous experiences of similar situations and come up with our ‘best guess’ of how to deal with a new task at hand. In this thesis, I attempt to answer the question of how this ability to generalise knowledge between different situations might be achieved in the brain. To apply previously learnt knowledge to new but related situations, the common structure must be abstracted away from the sensorimotor specifics of experience. In my first set of experiments, I showed that mice, like humans, can learn to transfer knowledge between different sensory scenarios, getting better at solving each new sensory example of a problem. A series of behavioural analyses suggests that animals can abstract strategies away from the specifics on various time scales: they learned to generalise not only short-term abstract sequences of events, but also a long-term strategy that integrated the recent history of their experiences. In the second set of experiments, I focused on how these abstractions might be achieved in the brain, and how abstract representations might be tied to the sensorimotor specifics of each new situation. I found that neurons in medial prefrontal cortex (mPFC) maintained similar representations across multiple problems, despite their different sensorimotor correlates, whereas hippocampal (dCA1) representations were more strongly influenced by the specifics of each task. These data suggest that mPFC and hippocampus play complementary roles in the generalisation of knowledge, with the former abstracting the common structure among related tasks and the latter mapping this structure onto the specifics of the current situation. Together, these experiments aim to develop our understanding of a hallmark of biological intelligence: how prior experience is leveraged to solve novel, real-world problems.
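
    The cross-area comparison described here amounts to measuring how similar population representations of the same abstract task state are across sensorily distinct problems. Below is a minimal sketch of one such analysis, using cosine similarity on trial-averaged activity; the array shapes, variable names, and synthetic data are illustrative assumptions, not the thesis's actual pipeline.

    ```python
    import numpy as np

    def cross_task_similarity(reps_a: np.ndarray, reps_b: np.ndarray) -> float:
        """Mean cosine similarity between matched task-state representations
        recorded in two sensorily distinct problems. Rows are aligned so that
        row i is the same abstract task state in both arrays."""
        a = reps_a / np.linalg.norm(reps_a, axis=1, keepdims=True)
        b = reps_b / np.linalg.norm(reps_b, axis=1, keepdims=True)
        return float(np.mean(np.sum(a * b, axis=1)))

    # Synthetic data: 4 abstract task states x 50 neurons per region.
    rng = np.random.default_rng(0)
    shared = rng.normal(size=(4, 50))                  # task-general structure
    mpfc_t1 = shared + 0.1 * rng.normal(size=(4, 50))  # mPFC-like: mostly shared
    mpfc_t2 = shared + 0.1 * rng.normal(size=(4, 50))
    dca1_t1 = rng.normal(size=(4, 50))                 # dCA1-like: task-specific
    dca1_t2 = rng.normal(size=(4, 50))

    print("mPFC cross-task similarity:", cross_task_similarity(mpfc_t1, mpfc_t2))
    print("dCA1 cross-task similarity:", cross_task_similarity(dca1_t1, dca1_t2))
    ```

    Under this toy construction, the mPFC-like arrays share structure across tasks and score high, while the dCA1-like arrays do not, mirroring the dissociation the thesis reports.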

    Beasts in Collaboration: A Study of Biomimicry and Evolutionary Principles applied to Management Innovation

    This research describes the frontier of bio-inspired management innovation and how it may lead to a paradigm shift in how we structure and lead organizations. As an exploratory foray into a subculture of bio-inspired experts, it asks how we might apply evolutionary principles to creating more resilient and adaptive organizations. The experts hail from both science-based and organizational-management backgrounds, showcasing a distinct divergence in how biomimicry is applied in their work. A review of contributions from these pioneering practitioners identifies the impetus behind their applications and the resulting benefits. This is contrasted with the barriers that currently limit further development of biomimicry for organizational change. Ultimately, these practitioners share a common understanding rooted in the intention to learn from nature. The research therefore analyzes the study of nature for informed and intentional change, and provides examples of edge corporations leading the way. As we race to reverse the consequences of our actions on the planet’s finite resources, a new paradigm that consciously changes how we model our organizations could have a direct impact on our resilience as a species.

    Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation

    Lifelong sequence generation (LSG), a problem in continual learning, aims to continually train a model on a sequence of generation tasks, learning constantly emerging generation patterns while avoiding the forgetting of previous knowledge. Existing LSG methods mainly focus on maintaining old knowledge while paying little attention to knowledge transfer across tasks. In contrast, humans learn new tasks better by leveraging previously acquired knowledge from similar tasks. Inspired by this human learning paradigm, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and to select the most similar previous tasks to facilitate adaptation to new tasks. In addition, as the learning process can easily be biased towards the current task, which might cause more severe forgetting of previously learned knowledge, we propose dynamic gradient scaling to balance the learning of the current task and replayed tasks. With extensive experiments, we demonstrate that DMEA consistently outperforms existing methods in different LSG settings.
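
    The abstract does not spell out how dynamic gradient scaling works; the sketch below captures the general idea under the assumption that it balances the gradient magnitudes of the current task and replayed tasks. Function and variable names are hypothetical, not from the paper.

    ```python
    import torch

    def scaled_combined_loss(loss_current: torch.Tensor,
                             loss_replay: torch.Tensor,
                             params: list) -> torch.Tensor:
        """Rescale the replay loss so its gradient norm matches the current
        task's, so neither objective dominates the parameter update."""
        g_cur = torch.autograd.grad(loss_current, params,
                                    retain_graph=True, allow_unused=True)
        g_rep = torch.autograd.grad(loss_replay, params,
                                    retain_graph=True, allow_unused=True)
        norm_cur = torch.sqrt(sum(g.pow(2).sum() for g in g_cur if g is not None))
        norm_rep = torch.sqrt(sum(g.pow(2).sum() for g in g_rep if g is not None))
        scale = (norm_cur / (norm_rep + 1e-12)).detach()
        return loss_current + scale * loss_replay

    # Usage inside a training step (both losses computed from the same model):
    #   loss = scaled_combined_loss(loss_cur, loss_rep, list(model.parameters()))
    #   loss.backward(); optimizer.step()
    ```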

    Contrastive Example-Based Control

    While many real-world problems might benefit from reinforcement learning, these problems rarely fit into the MDP mold: interacting with the environment is often expensive, and specifying reward functions is challenging. Motivated by these challenges, prior work has developed data-driven approaches that learn entirely from samples from the transition dynamics and examples of high-return states. These methods typically learn a reward function from the high-return states, use that reward function to label the transitions, and then apply an offline RL algorithm to these transitions. While such methods can achieve good results on many tasks, they can be complex, often requiring regularization and temporal-difference updates. In this paper, we propose a method for offline, example-based control that learns an implicit model of multi-step transitions rather than a reward function. We show that this implicit model can represent the Q-values for the example-based control problem. Across a range of state-based and image-based offline control tasks, our method outperforms baselines that use learned reward functions; additional experiments demonstrate improved robustness and scaling with dataset size.
    Comment: This is an updated version of a manuscript that originally appeared at L4DC 2023. The project website is here https://sites.google.com/view/laeo-r
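
    The paper's exact architecture is not given in this abstract; the sketch below shows one plausible shape of an implicit, contrastive critic that scores whether a state-action pair leads to a high-return example state, trained with an InfoNCE-style objective. All class and function names are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ContrastiveCritic(nn.Module):
        """Scores whether a (state, action) pair leads to a high-return example
        state, via an inner product of learned embeddings (an implicit model,
        not a learned reward function)."""
        def __init__(self, state_dim: int, action_dim: int, embed_dim: int = 64):
            super().__init__()
            self.sa_enc = nn.Sequential(
                nn.Linear(state_dim + action_dim, 128), nn.ReLU(),
                nn.Linear(128, embed_dim))
            self.ex_enc = nn.Sequential(
                nn.Linear(state_dim, 128), nn.ReLU(),
                nn.Linear(128, embed_dim))

        def forward(self, state, action, example_state):
            sa = self.sa_enc(torch.cat([state, action], dim=-1))
            ex = self.ex_enc(example_state)
            return sa @ ex.t()  # (batch, batch) logits; diagonal = positives

    def info_nce_loss(logits: torch.Tensor) -> torch.Tensor:
        """Each row's positive is its own diagonal entry; off-diagonal entries
        serve as negatives drawn from the rest of the batch."""
        labels = torch.arange(logits.size(0), device=logits.device)
        return F.cross_entropy(logits, labels)
    ```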

    Packet flow capacity autonomous operation based on reinforcement learning

    As the dynamicity of traffic increases, the need for autonomous network operation becomes more evident. One solution that might bring cost savings to network operators is dynamic capacity management of large packet flows, especially in the context of packet-over-optical networks. Machine Learning, particularly Reinforcement Learning (RL), seems to be an enabler for autonomicity as a result of its inherent capacity to learn from experience. However, precisely because of that, RL methods might not be able to provide the required performance (e.g., delay, packet loss, and capacity overprovisioning) when managing the capacity of packet flows until they learn the optimal policy. In view of that, we propose a management lifecycle with three phases: (i) a self-tuned threshold-based approach operating just after the packet flow is set up and until enough data on the traffic characteristics are available; (ii) RL operation based on models pre-trained with a generic traffic profile; and (iii) RL operation with models trained on real traffic. Exhaustive simulation results confirm the poor performance of RL algorithms until the optimal policy is learnt and when traffic characteristics change over time, which prevents deploying such methods in operators’ networks. In contrast, the proposed lifecycle outperforms benchmarking approaches, achieving noticeable performance from the beginning of operation while showing robustness against traffic changes. This research received funding from the European Community through the B5G-OPEN project (101016663), from the AEI IBON project (PID2020-114135RB-I00), and from the ICREA Institution.
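
    As a concrete illustration of phase (i), the sketch below shows a self-tuned threshold policy that allocates a margin over the recent traffic peak and widens the margin after any under-allocation. Parameter names and defaults are assumptions for illustration, not values from the paper.

    ```python
    from collections import deque

    class ThresholdCapacityManager:
        """Phase (i) sketch: allocate a safety margin over the recent traffic
        peak while traffic statistics are still being gathered for RL training."""

        def __init__(self, window: int = 60, margin: float = 1.2,
                     margin_step: float = 0.05, min_capacity: float = 1.0):
            self.history = deque(maxlen=window)  # recent traffic observations
            self.margin = margin                 # multiplicative overprovisioning
            self.margin_step = margin_step       # self-tuning increment
            self.min_capacity = min_capacity
            self.allocated = min_capacity

        def update(self, observed: float) -> float:
            # Self-tune: widen the margin whenever traffic exceeded the allocation.
            if observed > self.allocated:
                self.margin += self.margin_step
            self.history.append(observed)
            self.allocated = max(self.min_capacity,
                                 self.margin * max(self.history))
            return self.allocated
    ```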

    REPRESENTING AND LEARNING PREFERENCES OVER COMBINATORIAL DOMAINS

    Agents make decisions based on their preferences. Thus, to predict their decisions one has to learn the agent's preferences. A key step in the learning process is selecting a model to represent those preferences. We studied this problem by borrowing techniques from the algorithm selection problem to analyze preference example sets and select the most appropriate preference representation for learning. We approached this problem in multiple steps. First, we determined which representations to consider. For this problem we developed the notion of preference-representation-language subsumption, which compares representations based on their expressive power. Subsumption creates a hierarchy of preference representations based solely on which preference orders they can express. By applying this analysis to preference representation languages over combinatorial domains, we found that some languages are better for learning preference orders than others. Subsumption, however, does not tell the whole story. For languages that approximate each other (another piece of information useful for learning), the subsumption relation cannot tell us which languages might serve as good approximations of others; determining how well one language approximates another often requires customized techniques. We developed such techniques for two important preference representation languages: conditional lexicographic preference models (CLPMs) and conditional preference networks (CP-nets). Second, we developed learning algorithms for highly expressive preference representations. To this end, we investigated simulated annealing to learn both ranking preference formula (RPF) and preference theory (PT) preference programs. We demonstrated that simulated annealing is an effective approach to learning preferences under many different conditions. This suggested that more general learning strategies might lead to equally good or even better results. We studied this possibility by considering artificial neural networks (ANNs), and found that ANNs can outperform classical models at deciding dominance but have several significant drawbacks as preference reasoning models. Third, we developed a method for determining which representations match which example sets. For this classification task we considered two methods: the first selects a series of features and uses them as input to a linear feed-forward ANN; the second converts the example set into a graph and uses a graph convolutional neural network (GCNN). Between these two methods, we found that the feature-set approach works better. By completing these steps we have built the foundations of a portfolio-based approach for learning preferences, assembled a simple version of such a system as a proof of concept, and tested its usefulness.
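
    A minimal sketch of the simulated-annealing approach to fitting preference models: the `score` callback would count correctly ordered example pairs, and `neighbor` proposes a local edit to the candidate model. The generic loop and the permutation neighbor below are illustrative assumptions, not the dissertation's implementation.

    ```python
    import math
    import random

    def simulated_annealing(model, neighbor, score, steps=10_000,
                            t0=1.0, cooling=0.999):
        """Generic annealing loop: `score(model)` counts correctly ordered
        example pairs, `neighbor(model)` proposes a local edit."""
        best = model
        t = t0
        for _ in range(steps):
            candidate = neighbor(model)
            delta = score(candidate) - score(model)
            # Accept improvements always; worsenings with probability e^(delta/t).
            if delta >= 0 or random.random() < math.exp(delta / t):
                model = candidate
                if score(model) > score(best):
                    best = model
            t *= cooling
        return best

    def swap_neighbor(order: list) -> list:
        """Example neighbor for a lexicographic model: swap two attributes
        in the importance order."""
        i, j = random.sample(range(len(order)), 2)
        new = list(order)
        new[i], new[j] = new[j], new[i]
        return new
    ```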

    Addressing Dataset Bias in Deep Neural Networks

    Deep Learning has achieved tremendous success in recent years in areas such as image classification, text translation, and autonomous agents, to name a few. Deep Neural Networks are able to learn non-linear features in a data-driven fashion from complex, large-scale datasets. However, some fundamental issues remain: the kind of data provided to the neural network directly influences its capability to generalize. This is especially true when training and test data come from different distributions (the so-called domain gap or domain shift problem): in this case, the neural network may learn a data representation that is representative of the training data but not of the test data, thus performing poorly when deployed in actual scenarios. The domain gap problem is addressed by so-called Domain Adaptation, for which a large literature has recently developed. In this thesis, we first present a novel method to perform Unsupervised Domain Adaptation. Starting from the typical scenario, in which we have labeled source distributions and an unlabeled target distribution, we pursue a pseudo-labeling approach to assign labels to the target data and then refine them iteratively using Generative Adversarial Networks. Subsequently, we address the debiasing problem. Simply put, bias occurs when factors in the data are spuriously correlated with the task label, e.g., the background, which might be a strong clue for guessing what class is depicted in an image. When this happens, neural networks may erroneously learn such spurious correlations as predictive factors and may therefore fail when deployed in different scenarios. Learning a debiased model can be done with supervision regarding the type of bias affecting the data, or without any annotation of the spurious correlations. We tackled supervised debiasing, where a ground-truth annotation for the bias is given, under the lens of information theory: we designed a neural network architecture that learns to solve the task while at the same time achieving statistical independence of the data embedding with respect to the bias label. We finally addressed the unsupervised debiasing problem, in which no bias annotation is available, with a two-stage approach: we first coarsely split the training dataset into two subsets, samples that exhibit spurious correlations and those that do not; second, we learn a feature representation that can accommodate both subsets and an augmented version of them.
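
    The thesis formulates supervised debiasing information-theoretically; a common, simpler way to encourage the same property (an embedding statistically independent of the bias label) is an adversarial bias head behind a gradient-reversal layer. The sketch below shows that substitute technique, with illustrative names, and is not the thesis's architecture.

    ```python
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; negates gradients on the backward pass,
        so the encoder learns to remove whatever the bias head can predict."""
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lam * grad_output, None

    class DebiasedClassifier(nn.Module):
        def __init__(self, in_dim: int, n_classes: int, n_biases: int,
                     hidden: int = 128, lam: float = 1.0):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            self.task_head = nn.Linear(hidden, n_classes)
            self.bias_head = nn.Linear(hidden, n_biases)
            self.lam = lam

        def forward(self, x):
            z = self.encoder(x)
            # Training both heads with cross-entropy pushes the embedding z
            # toward being uninformative about (independent of) the bias label.
            return self.task_head(z), self.bias_head(GradReverse.apply(z, self.lam))
    ```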

    Mitigating Catastrophic Forgetting for Few-Shot Spoken Word Classification Through Meta-Learning

    We consider the problem of few-shot spoken word classification in a setting where a model is incrementally introduced to new word classes. This would occur in a user-defined keyword system where new words can be added as the system is used. In such a continual learning scenario, a model might start to misclassify earlier words as newer classes are added, i.e., catastrophic forgetting. To address this, we propose an extension to model-agnostic meta-learning (MAML): each inner learning loop, where a model "learns how to learn" new classes, ends with a single gradient update using stored templates from all the classes that the model has already seen (one template per class). We compare this method to OML (another extension of MAML) in few-shot isolated-word classification experiments on Google Commands and FACC. Our method consistently outperforms OML in experiments where the number of shots and the final number of classes are varied.
    Comment: 5 pages, 3 figures, Accepted to Interspeech 202
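
    A first-order sketch of the proposed inner loop: adapt on the new classes' support set, then finish with a single gradient step on the stored templates (one per seen class). The actual method meta-learns through this loop; names and hyperparameters here are illustrative, not the paper's code.

    ```python
    import torch
    from copy import deepcopy

    def adapt_with_templates(model, loss_fn, support, templates,
                             inner_lr: float = 0.01, steps: int = 5):
        """Inner loop: several updates on the new classes' support set, then a
        single update on stored templates to push back against forgetting.
        First-order approximation for brevity."""
        learner = deepcopy(model)  # fast weights; the meta-update is not shown
        opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

        x_s, y_s = support         # few-shot examples of the new classes
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(learner(x_s), y_s).backward()
            opt.step()

        x_t, y_t = templates       # one stored template per seen class
        opt.zero_grad()
        loss_fn(learner(x_t), y_t).backward()  # the final template step
        opt.step()
        return learner
    ```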