Continual Reinforcement Learning in 3D Non-stationary Environments
High-dimensional always-changing environments constitute a hard challenge for
current reinforcement learning techniques. Artificial agents, nowadays, are
often trained off-line in very static and controlled conditions in simulation
such that training observations can be thought of as sampled i.i.d. from the
entire observation space. However, in real-world settings, the environment is
often non-stationary and subject to unpredictable, frequent changes. In this
paper we propose and openly release CRLMaze, a new benchmark for learning
continually through reinforcement in a complex 3D non-stationary task based on
ViZDoom and subject to several environmental changes. Then, we introduce an
end-to-end model-free continual reinforcement learning strategy showing
competitive results with respect to four different baselines and not requiring
any access to additional supervised signals, previously encountered
environmental conditions or observations.
Comment: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5
table
Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
In lifelong learning systems, especially those based on artificial neural
networks, one of the biggest obstacles is the severe inability to retain old
knowledge as new information is encountered. This phenomenon is known as
catastrophic forgetting. In this article, we propose a new kind of
connectionist architecture, the Sequential Neural Coding Network, that is
robust to forgetting when learning from streams of data points and, unlike
networks of today, does not learn via the immensely popular back-propagation of
errors. Grounded in the neurocognitive theory of predictive processing, our
model adapts its synapses in a biologically-plausible fashion, while another,
complementary neural system rapidly learns to direct and control this
cortex-like structure by mimicking the task-executive control functionality of
the basal ganglia. In our experiments, we demonstrate that our self-organizing
system experiences significantly less forgetting as compared to standard neural
models and outperforms a wide swath of previously proposed methods even though
it is trained across task datasets in a stream-like fashion. The promising
performance of our complementary system on benchmarks, e.g., SplitMNIST, Split
Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating
mechanisms prominent in real neuronal systems, such as competition, sparse
activation patterns, and iterative input processing, a new possibility for
tackling the grand challenge of lifelong machine learning opens up.
Comment: Key updates including results on standard benchmarks, e.g., split
mnist/fmnist/not-mnist. Task selection/basal ganglia model has been
integrate
The Current State of Normative Agent-Based Systems
Recent years have seen an increase in the application of ideas from the social sciences to computational systems. Nowhere has this been more pronounced than in the domain of multiagent systems. Because multiagent systems are composed of multiple individual agents interacting with each other, many parallels can be drawn to human and animal societies. One of the main challenges currently faced in multiagent systems research is that of social control. In particular, how can open multiagent systems be configured and organized given their constantly changing structure? One leading solution is to use social norms. In human societies, social norms are essential to regulation, coordination, and cooperation. The current trend of thinking is that these same principles can be applied to agent societies, of which multiagent systems are one type. In this article, we provide an introduction to and present a holistic viewpoint of the state of normative computing (computational solutions that employ ideas based on social norms). To accomplish this, we (1) introduce social norms and their application to agent-based systems; (2) identify and describe a normative process abstracted from the existing research; and (3) discuss future directions for research in normative multiagent computing. The intent of this paper is to introduce new researchers to the ideas that underlie normative computing and survey the existing state of the art, as well as provide direction for future research.
Keywords: Norms, Normative Agents, Agents, Agent-Based System, Agent-Based Simulation, Agent-Based Modeling
DoShiCo Challenge: Domain Shift in Control Prediction
Training deep neural network policies end-to-end for real-world applications
so far requires big demonstration datasets in the real world or big sets
consisting of a large variety of realistic and closely related 3D CAD models.
These real or virtual data should, moreover, have very similar characteristics
to the conditions expected at test time. These stringent requirements and the
time-consuming data collection processes that they entail are currently the
most important impediment that keeps deep reinforcement learning from being
deployed in real-world applications. Therefore, in this work we advocate an
alternative approach, where instead of avoiding any domain shift by carefully
selecting the training data, the goal is to learn a policy that can cope with
it. To this end, we propose the DoShiCo challenge: to train a model in very
basic synthetic environments, far from realistic, in a way that it can be
applied in more realistic environments as well as take the control decisions on
real-world data. In particular, we focus on the task of collision avoidance for
drones. We created a set of simulated environments that can be used as a
benchmark and implemented a baseline method, exploiting depth prediction as an
auxiliary task to help overcome the domain shift. Even though the policy is
trained in very basic environments, it can learn to fly without collisions in a
very different realistic simulated environment. Of course, several benchmarks
for reinforcement learning already exist, but they never include a large
domain shift. On the other hand, several benchmarks in computer vision focus on
the domain shift, but they take the form of static datasets instead of
simulated environments. In this work we claim that it is crucial to take the
two challenges together in one benchmark.
Comment: Published at SIMPAR 2018. Please visit the paper webpage for more
information, a movie and code for reproducing results:
https://kkelchte.github.io/doshic