9,058 research outputs found

    Continual Reinforcement Learning in 3D Non-stationary Environments

    High-dimensional, always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought of as sampled i.i.d. from the entire observation space. However, in real-world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Then, we introduce an end-to-end model-free continual reinforcement learning strategy showing competitive results with respect to four different baselines and not requiring any access to additional supervised signals, previously encountered environmental conditions or observations.
    Comment: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5 table

    Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

    In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this article, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the immensely popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts its synapses in a biologically-plausible fashion, while another, complementary neural system rapidly learns to direct and control this cortex-like structure by mimicking the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting as compared to standard neural models and outperforms a wide swath of previously proposed methods even though it is trained across task datasets in a stream-like fashion. The promising performance of our complementary system on benchmarks, e.g., SplitMNIST, Split Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, a new possibility for tackling the grand challenge of lifelong machine learning opens up.
    Comment: Key updates including results on standard benchmarks, e.g., split mnist/fmnist/not-mnist. Task selection/basal ganglia model has been integrate

    The Current State of Normative Agent-Based Systems

    Recent years have seen an increase in the application of ideas from the social sciences to computational systems. Nowhere has this been more pronounced than in the domain of multiagent systems. Because multiagent systems are composed of multiple individual agents interacting with each other, many parallels can be drawn to human and animal societies. One of the main challenges currently faced in multiagent systems research is that of social control. In particular, how can open multiagent systems be configured and organized given their constantly changing structure? One leading solution is to employ social norms. In human societies, social norms are essential to regulation, coordination, and cooperation. The current trend of thinking is that these same principles can be applied to agent societies, of which multiagent systems are one type. In this article, we provide an introduction to and present a holistic viewpoint of the state of normative computing (computational solutions that employ ideas based on social norms). To accomplish this, we (1) introduce social norms and their application to agent-based systems; (2) identify and describe a normative process abstracted from the existing research; and (3) discuss future directions for research in normative multiagent computing. The intent of this paper is to introduce new researchers to the ideas that underlie normative computing and survey the existing state of the art, as well as provide direction for future research.
    Keywords: Norms, Normative Agents, Agents, Agent-Based System, Agent-Based Simulation, Agent-Based Modeling

    DoShiCo Challenge: Domain Shift in Control Prediction

    Training deep neural network policies end-to-end for real-world applications so far requires big demonstration datasets in the real world or big sets consisting of a large variety of realistic and closely related 3D CAD models. These real or virtual data should, moreover, have very similar characteristics to the conditions expected at test time. These stringent requirements, and the time-consuming data collection processes that they entail, are currently the most important impediment that keeps deep reinforcement learning from being deployed in real-world applications. Therefore, in this work we advocate an alternative approach, where instead of avoiding any domain shift by carefully selecting the training data, the goal is to learn a policy that can cope with it. To this end, we propose the DoShiCo challenge: to train a model in very basic synthetic environments, far from realistic, in a way that it can be applied in more realistic environments as well as take the control decisions on real-world data. In particular, we focus on the task of collision avoidance for drones. We created a set of simulated environments that can be used as a benchmark and implemented a baseline method, exploiting depth prediction as an auxiliary task to help overcome the domain shift. Even though the policy is trained in very basic environments, it can learn to fly without collisions in a very different realistic simulated environment. Of course, several benchmarks for reinforcement learning already exist - but they never include a large domain shift. On the other hand, several benchmarks in computer vision focus on the domain shift, but they take the form of static datasets instead of simulated environments. In this work we claim that it is crucial to take the two challenges together in one benchmark.
    Comment: Published at SIMPAR 2018. Please visit the paper webpage for more information, a movie and code for reproducing results: https://kkelchte.github.io/doshic