Projective simulation for artificial intelligence
We propose a model of a learning agent whose interaction with the environment
is governed by a simulation-based projection, which allows the agent to project
itself into future situations before it takes real action. Projective
simulation is based on a random walk through a network of clips, which are
elementary patches of episodic memory. The network of clips changes
dynamically, both due to new perceptual input and due to certain compositional
principles of the simulation process. During simulation, the clips are screened
for specific features which trigger factual action of the agent. The scheme is
different from other, computational, notions of simulation, and it provides a
new element in an embodied cognitive science approach to intelligent action and
learning. Our model provides a natural route for generalization to
quantum-mechanical operation and connects the fields of reinforcement learning
and quantum computation.
Comment: 22 pages, 18 figures. Close to published version, with footnotes
retained.
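The clip-network mechanics described in the abstract can be sketched in a few lines. The sketch below uses the simplest two-layer variant (direct percept-to-action edges); the class name, the damping and glow parameters, and the toy traffic-light task are illustrative choices, not the paper's full model:

```python
import random

# Toy projective-simulation agent (sketch). Percepts and actions are clips;
# edge h-values drive a hop from a percept clip to an action clip.
class PSAgent:
    def __init__(self, percepts, actions, damping=0.01, glow=1.0):
        self.actions = list(actions)
        self.damping = damping   # forgetting: h-values relax back toward 1
        self.glow = glow         # reward increment on the traversed edge
        self.h = {(p, a): 1.0 for p in percepts for a in actions}

    def act(self, percept):
        # Hop with probability proportional to the edge's h-value.
        weights = [self.h[(percept, a)] for a in self.actions]
        action = random.choices(self.actions, weights=weights)[0]
        self.last = (percept, action)
        return action

    def learn(self, reward):
        # Damp all h-values toward 1, then reinforce the traversed edge.
        for e in self.h:
            self.h[e] -= self.damping * (self.h[e] - 1.0)
        self.h[self.last] += self.glow * reward

random.seed(0)
agent = PSAgent(percepts=["red", "green"], actions=["stop", "go"])
for _ in range(500):               # reward "go" on green and "stop" on red
    p = random.choice(["red", "green"])
    a = agent.act(p)
    agent.learn(1.0 if (p, a) in {("green", "go"), ("red", "stop")} else 0.0)
```

After training, the h-values on the rewarded edges dominate, so the agent's random walk almost always selects the correct action.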
Speeding-up the decision making of a learning agent using an ion trap quantum processor
We report a proof-of-principle experimental demonstration of the quantum
speed-up for learning agents utilizing a small-scale quantum information
processor based on radiofrequency-driven trapped ions. The decision-making
process of a quantum learning agent within the projective simulation paradigm
for machine learning is implemented in a system of two qubits. The latter are
realized using hyperfine states of two frequency-addressed atomic ions exposed
to a static magnetic field gradient. We show that the deliberation time of this
quantum learning agent is quadratically improved with respect to comparable
classical learning agents. The performance of this quantum-enhanced learning
agent highlights the potential of scalable quantum processors taking advantage
of machine learning.
Comment: 21 pages, 7 figures, 2 tables. Author names now spelled correctly;
sections rearranged; changes in the wording of the manuscript.
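The quadratic improvement in deliberation time can be illustrated numerically. This is only the scaling argument, not the trapped-ion implementation: if a rewarded action is reached with probability eps per classical deliberation attempt, classical sampling needs on the order of 1/eps attempts, while amplitude-amplification-based deliberation needs on the order of 1/sqrt(eps):

```python
import math

# Scaling illustration of the quadratic speed-up (function names are ours).
def classical_deliberation_steps(eps):
    return 1.0 / eps               # expected attempts ~ 1/eps

def quantum_deliberation_steps(eps):
    return 1.0 / math.sqrt(eps)    # amplitude amplification ~ 1/sqrt(eps)

for eps in (0.1, 0.01, 0.001):
    c = classical_deliberation_steps(eps)
    q = quantum_deliberation_steps(eps)
    print(f"eps={eps}: classical ~{c:.0f} steps, quantum ~{q:.0f} steps")
```

The gap widens as the rewarded action becomes rarer, which is why the speed-up matters most for hard deliberation problems.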
Scalable Recollections for Continual Lifelong Learning
Given the recent success of Deep Learning applied to a variety of single
tasks, it is natural to consider more human-realistic settings. Perhaps the
most difficult of these settings is that of continual lifelong learning, where
the model must learn online over a continuous stream of non-stationary data. A
successful continual lifelong learning system must have three key capabilities:
it must learn and adapt over time, it must not forget what it has learned, and
it must be efficient in both training time and memory. Recent techniques have
focused their efforts primarily on the first two capabilities while questions
of efficiency remain largely unexplored. In this paper, we consider the problem
of efficient and effective storage of experiences over very large time-frames.
In particular we consider the case where typical experiences are O(n) bits and
memories are limited to O(k) bits for k << n. We present a novel scalable
architecture and training algorithm in this challenging domain and provide an
extensive evaluation of its performance. Our results show that we can achieve
considerable gains on top of state-of-the-art methods such as GEM.
Comment: AAAI 201
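One generic way to get from O(n)-bit experiences to O(k)-bit memories, k << n, is a random-projection sketch: project each experience onto k fixed random directions and keep only the signs. This is a minimal sketch of the storage constraint, not the paper's architecture:

```python
import random

# Compress an n-float experience into a k-bit code (names are illustrative).
def make_encoder(n, k, seed=0):
    rng = random.Random(seed)
    # k random +/-1 projection rows of length n; the rows are shared across
    # all memories, so they do not count against per-memory storage.
    proj = [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(k)]

    def encode(x):
        bits = []
        for row in proj:
            s = sum(r * xi for r, xi in zip(row, x))
            bits.append(1 if s >= 0 else 0)   # 1 bit per projection
        return bits

    return encode

encode = make_encoder(n=256, k=16)
rng = random.Random(1)
x = [rng.random() for _ in range(256)]       # one O(n) experience
code = encode(x)
print(len(code))                             # 16 bits instead of 256 floats
```

Sign sketches of this kind preserve approximate similarity between experiences, which is what a recollection module needs to retrieve relevant past data.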
Modeling the mobility of living organisms in heterogeneous landscapes: Does memory improve foraging success?
Thanks to recent technological advances, it is now possible to track with an
unprecedented precision and for long periods of time the movement patterns of
many living organisms in their habitat. The increasing amount of data available
on single trajectories offers the possibility of understanding how animals move
and of testing basic movement models. Random walks have long represented the
main description for micro-organisms and have also been useful to understand
the foraging behaviour of large animals. Nevertheless, most vertebrates, in
particular humans and other primates, rely on sophisticated cognitive tools
such as spatial maps, episodic memory and travel cost discounting. These
properties call for other modeling approaches of mobility patterns. We propose
a foraging framework where a learning mobile agent uses a combination of
memory-based and random steps. We investigate how advantageous it is to use
memory for exploiting resources in heterogeneous and changing environments. An
adequate balance of determinism and random exploration is found to maximize the
foraging efficiency and to generate trajectories with an intricate
spatio-temporal order. Based on this approach, we propose some tools for
analysing the non-random nature of mobility patterns in general.
Comment: 14 pages, 4 figures, improved discussion.
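The memory/randomness mix can be sketched as a single parameter q: with probability q the forager returns to the best patch it remembers, otherwise it takes a random-walk step. The grid, the resource field, and q below are illustrative, not the paper's parameters:

```python
import random

# Forager mixing memory-based and random steps (sketch).
def forage(steps, q, rng):
    pos = (0, 0)
    memory = {}                    # patch -> food found there
    total = 0.0
    for _ in range(steps):
        if memory and rng.random() < q:
            pos = max(memory, key=memory.get)        # memory-based step
        else:                                         # random-walk step
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            pos = (pos[0] + dx, pos[1] + dy)
        # Toy heterogeneous resource field: rich patches on a sparse lattice.
        food = 1.0 if (pos[0] + pos[1]) % 5 == 0 else 0.1
        total += food
        memory[pos] = food
    return total

rng = random.Random(0)
print(forage(1000, q=0.5, rng=rng))
```

Sweeping q shows the balance the abstract describes: pure randomness (q=0) wastes time off the rich patches, while some determinism exploits what has been learned.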
Exploring Restart Distributions
We consider the generic approach of using an experience memory to help
exploration by adapting a restart distribution. That is, given the capacity to
reset the state with those corresponding to the agent's past observations, we
help exploration by promoting faster state-space coverage via restarting the
agent from a more diverse set of initial states, as well as allowing it to
restart in states associated with significant past experiences. This approach
is compatible with both on-policy and off-policy methods. However, a caveat is
that altering the distribution of initial states could change the optimal
policies when searching within a restricted class of policies. To reduce this
unsought learning bias, we evaluate our approach in deep reinforcement learning
which benefits from the high representational capacity of deep neural networks.
We instantiate three variants of our approach, each inspired by an idea in the
context of experience replay. Using these variants, we show that performance
gains can be achieved, especially in hard exploration problems.
Comment: RLDM 201
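The restart mechanism can be sketched as follows, assuming the environment exposes a hook to reset to a stored state. The class, the priority signal (e.g. |TD error|), and the mixing probability are illustrative, standing in for the three experience-replay-inspired variants:

```python
import random

# Experience memory that adapts a restart distribution (sketch).
class RestartMemory:
    def __init__(self, capacity=1000):
        self.states, self.priorities = [], []
        self.capacity = capacity

    def add(self, state, priority):
        if len(self.states) >= self.capacity:   # drop the oldest entry
            self.states.pop(0)
            self.priorities.pop(0)
        self.states.append(state)
        self.priorities.append(priority)

    def sample_restart(self, default_state, p_restart, rng):
        # With probability p_restart, restart from a past state sampled in
        # proportion to its stored priority; otherwise use the default start.
        if self.states and rng.random() < p_restart:
            return rng.choices(self.states, weights=self.priorities)[0]
        return default_state

mem = RestartMemory()
rng = random.Random(0)
for s in range(20):
    mem.add(("state", s), priority=1.0 + s)   # later states "more significant"
start = mem.sample_restart(("state", 0), p_restart=0.5, rng=rng)
print(start)
```

Keeping the default start state in the mix bounds the bias the abstract warns about: the restart distribution never drifts arbitrarily far from the original one.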
Capturing Regular Human Activity through a Learning Context Memory
A learning context memory consisting of two main parts is
presented. The first part performs lossy data compression,
keeping the amount of stored data at a minimum by combining
similar context attributes; the compression rate for the
presented GPS data is 150:1 on average. The resulting data is
stored in an appropriate data structure highlighting the level
of compression. Elements with a high level of compression
are used in the second part to form the start and end points
of episodes capturing common activity consisting of consecutive
events. The context memory is used to investigate how little
context data can be stored while still retaining enough
information to capture regular human activity.
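The lossy first stage can be sketched by merging consecutive GPS fixes that fall within a small radius into one stored location with a visit count, so repeatedly visited places compress well. The 150:1 figure is the paper's; the radius and data below are illustrative:

```python
# Merge consecutive nearby GPS fixes into (mean position, count) clusters.
def compress(fixes, radius=0.001):
    merged = []                              # list of ([lat, lon], count)
    for lat, lon in fixes:
        if merged:
            (mlat, mlon), c = merged[-1]
            if abs(lat - mlat) <= radius and abs(lon - mlon) <= radius:
                # Fold the fix into the running cluster (incremental mean).
                merged[-1] = ([(mlat * c + lat) / (c + 1),
                               (mlon * c + lon) / (c + 1)], c + 1)
                continue
        merged.append(([lat, lon], 1))
    return merged

# 100 fixes at home, 5 in transit, 50 at work -> 7 stored elements
fixes = ([(52.0, 13.0)] * 100
         + [(52.0 + 0.01 * i, 13.0 + 0.01 * i) for i in range(1, 6)]
         + [(52.5, 13.5)] * 50)
out = compress(fixes)
print(len(fixes), "->", len(out))
```

High-count clusters are exactly the "elements with a high level of compression" that the second part uses as start and end points of episodes.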
Encoding Sequential Information in Vector Space Models of Semantics: Comparing Holographic Reduced Representation and Random Permutation
Encoding information about the order in which words typically appear has been shown to improve the performance of high-dimensional semantic space models. This requires an encoding operation capable of binding together vectors in an order-sensitive way, and efficient enough to scale to large text corpora. Although both circular convolution and random permutations have been enlisted for this purpose in semantic models, these operations have never been systematically compared. In Experiment 1 we compare their storage capacity and probability of correct retrieval; in Experiments 2 and 3 we compare their performance on semantic tasks when integrated into existing models. We conclude that random permutations are a scalable alternative to circular convolution with several desirable properties.
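The random-permutation binding the abstract refers to can be sketched as follows: each word gets a fixed random index vector, and a word's memory vector sums its neighbours' index vectors permuted according to their position offset, so reversed word orders yield different encodings. The dimension, vocabulary, and use of rotation as the permutation are illustrative choices:

```python
import random

DIM = 64
rng = random.Random(0)
# Fixed random +/-1 index vector per vocabulary word.
index = {w: [rng.choice((-1, 1)) for _ in range(DIM)]
         for w in ("king", "wore", "crown")}

def shift(v, k):
    """Rotation by k as a cheap fixed permutation (one per position offset)."""
    return v[-k:] + v[:-k]

def encode_context(words, target):
    """Memory vector for `target`: sum of its neighbours' index vectors,
    each permuted by the neighbour's signed offset from the target."""
    mem = [0] * DIM
    t = words.index(target)
    for i, w in enumerate(words):
        if w == target:
            continue
        permuted = shift(index[w], i - t)
        mem = [m + p for m, p in zip(mem, permuted)]
    return mem

a = encode_context(["king", "wore", "crown"], "wore")
b = encode_context(["crown", "wore", "king"], "wore")
print(a != b)   # the binding is order-sensitive
```

Because permutation is just index shuffling, it scales linearly in the vector dimension, which is the efficiency property the comparison with circular convolution turns on.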
Reward prediction error and declarative memory
Learning based on reward prediction error (RPE) was originally proposed in the context of nondeclarative memory. We postulate that RPE may support declarative memory as well. Indeed, recent years have witnessed a number of independent empirical studies reporting effects of RPE on declarative memory. We provide a brief overview of these studies, identify emerging patterns, and discuss open issues such as the role of signed versus unsigned RPEs in declarative learning.
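To fix notation, the signed RPE the abstract discusses is the standard temporal-difference error, delta = r + gamma * V(s') - V(s), with the unsigned variant being |delta|. A minimal update (the function and state names are ours, and this is textbook TD learning, not the paper's model):

```python
# One temporal-difference update driven by the signed RPE.
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)  # signed RPE
    V[s] = V.get(s, 0.0) + alpha * delta                    # move V along delta
    return delta

V = {}
delta = td_update(V, "cue", r=1.0, s_next="outcome")
print(delta, V["cue"])   # 1.0 0.1
```

The signed-versus-unsigned question is then whether declarative memory tracks delta itself (better/worse than expected) or only its magnitude (surprise).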