Higher coordination with less control - A result of information maximization in the sensorimotor loop
This work presents a novel learning method in the context of embodied
artificial intelligence and self-organization, which has as few assumptions and
restrictions as possible about the world and the underlying model. The learning
rule is derived from the principle of maximizing the predictive information in
the sensorimotor loop. It is evaluated on robot chains of varying length with
individually controlled, non-communicating segments. The comparison of the
results shows that maximizing the predictive information per wheel leads to a
higher coordinated behavior of the physically connected robots compared to a
maximization per robot. Another focus of this paper is the analysis of the
effect of the robot chain length on the overall behavior of the robots. It is
shown that longer chains with less capable controllers outperform shorter
chains with more complex controllers. The reason is found and discussed in the
information-geometric interpretation of the learning process.
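The predictive-information objective this abstract describes can be illustrated with a small estimator. The following is a minimal sketch, assuming a discretized one-step sensor stream; the function name and the empirical plug-in estimate are illustrative stand-ins, not the paper's actual online learning rule over the sensorimotor loop:

```python
import math
from collections import Counter

def predictive_information(stream):
    """Estimate one-step predictive information I(s_t; s_{t+1}) of a
    discrete sensor stream from empirical pair frequencies.
    (Hypothetical helper: the paper maximizes this quantity online
    in the sensorimotor loop, not from a fixed recorded log.)"""
    pairs = list(zip(stream, stream[1:]))
    n = len(pairs)
    p_joint = Counter(pairs)             # counts of (past, future) pairs
    p_past = Counter(s for s, _ in pairs)
    p_future = Counter(s for _, s in pairs)
    mi = 0.0
    for (a, b), c in p_joint.items():
        # plug-in mutual information: p(a,b) * log2(p(a,b) / (p(a) p(b)))
        mi += (c / n) * math.log2(c * n / (p_past[a] * p_future[b]))
    return mi

print(predictive_information([0, 1] * 50))  # close to 1 bit
print(predictive_information([0] * 100))    # 0.0: constant, nothing to predict
```

For an alternating stream the past determines the future, so the estimate approaches one bit per step; a constant stream carries no predictive information at all.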
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
Most of computer science focuses on automatically solving given computational
problems. I focus on automatically inventing or discovering problems in a way
inspired by the playful behavior of animals and humans, to train a more and
more general problem solver from scratch in an unsupervised fashion. Consider
the infinite set of all computable descriptions of tasks with possibly
computable solutions. The novel algorithmic framework POWERPLAY (2011)
continually searches the space of possible pairs of new tasks and modifications
of the current problem solver, until it finds a more powerful problem solver
that provably solves all previously learned tasks plus the new one, while the
unmodified predecessor does not. Wow-effects are achieved by continually making
previously learned skills more efficient such that they require less time and
space. New skills may (partially) re-use previously learned skills. POWERPLAY's
search orders candidate pairs of tasks and solver modifications by their
conditional computational (time & space) complexity, given the stored
experience so far. The new task and its corresponding task-solving skill are
those first found and validated. The computational costs of validating new
tasks need not grow with task repertoire size. POWERPLAY's ongoing search for
novelty keeps breaking the generalization abilities of its present solver. This
is related to Goedel's sequence of increasingly powerful formal theories based
on adding formerly unprovable statements to the axioms without affecting
previously provable theorems. The continually increasing repertoire of problem
solving procedures can be exploited by a parallel search for solutions to
additional externally posed tasks. POWERPLAY may be viewed as a greedy but
practical implementation of basic principles of creativity. A first
experimental analysis can be found in separate papers [53,54].
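The acceptance criterion at the heart of the POWERPLAY loop can be sketched in a few lines. This is a toy illustration, not the published framework: `solves` stands in for a bounded validation/proof procedure, and the complexity-ordered candidate stream is assumed to be given:

```python
import itertools

def powerplay_step(solves, candidate_pairs, repertoire, solver):
    """One POWERPLAY-style acceptance step (toy sketch).
    candidate_pairs: iterable of (task, new_solver), assumed ordered by
    conditional computational complexity, cheapest first.
    Accepts the first pair whose new_solver solves the new task plus
    every task in the repertoire, while the current solver fails it."""
    for task, new_solver in candidate_pairs:
        if solves(solver, task):
            continue  # not novel: the unmodified predecessor already solves it
        if solves(new_solver, task) and all(
            solves(new_solver, t) for t in repertoire
        ):
            repertoire.append(task)  # the repertoire only ever grows
            return task, new_solver
    return None

# Toy instantiation: tasks are integers, a "solver" is a capability
# threshold, and candidates are enumerated in increasing order.
solves = lambda solver, task: task <= solver
repertoire = [1, 2, 3]
pairs = ((t, t) for t in itertools.count(1))
accepted = powerplay_step(solves, pairs, repertoire, solver=3)
print(accepted)  # -> (4, 4): the simplest still-unsolvable task
```

The key invariant the sketch preserves is the one the abstract states: the new solver provably handles all previously learned tasks plus the new one, while the unmodified predecessor does not.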
Spatial representation for navigation in animats
This article considers the problem of spatial representation for animat navigation systems. It is proposed that the global navigation task, or "wayfinding," is best supported by multiple interacting subsystems, each of which builds its own partial representation of relevant world knowledge. Evidence from the study of animal navigation is reviewed to demonstrate that similar principles underlie the wayfinding behavior of animals, including humans. A simulated wayfinding system is described that embodies and illustrates several of the themes identified with animat navigation. This system constructs a network of partial models of the quantitative spatial relations between groups of salient landmarks. Navigation tasks are solved by propagating egocentric view information through this network, using a simple but effective heuristic to arbitrate between multiple solutions.
An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies
What is a good exploration strategy for an agent that interacts with an
environment in the absence of external rewards? Ideally, we would like to get a
policy driving towards a uniform state-action visitation (highly exploring) in
a minimum number of steps (fast mixing), in order to ease efficient learning of
any goal-conditioned policy later on. Unfortunately, it is remarkably arduous
to directly learn an optimal policy of this nature. In this paper, we propose a
novel surrogate objective for learning highly exploring and fast mixing
policies, which focuses on maximizing a lower bound to the entropy of the
steady-state distribution induced by the policy. In particular, we introduce
three novel lower bounds that lead to as many optimization problems, trading
off theoretical guarantees against computational complexity. Then, we
present a model-based reinforcement learning algorithm, IDEAL, to learn
an optimal policy according to the introduced objective. Finally, we provide an
empirical evaluation of this algorithm on a set of hard-exploration tasks.
Comment: In the 34th AAAI Conference on Artificial Intelligence (AAAI 2020).
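The quantity that IDEAL's surrogate objective bounds, the entropy of the policy-induced steady-state distribution, can be computed exactly for a toy chain. This sketch (the function name and power-iteration routine are illustrative, not from the paper) contrasts a uniformly mixing policy with a poorly exploring one:

```python
import math

def steady_state_entropy(P, iters=2000):
    """Entropy (in bits) of the steady-state distribution of a Markov
    chain with row-stochastic transition matrix P, i.e. the chain a
    fixed policy induces on the environment. Toy power-iteration
    sketch; the paper maximizes lower bounds on this quantity instead
    of computing it directly."""
    n = len(P)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    return -sum(p * math.log2(p) for p in d if p > 0.0)

# A policy that mixes uniformly over 3 states (highly exploring)...
uniform = [[1 / 3] * 3 for _ in range(3)]
# ...versus one that lingers in state 0 (poorly exploring):
sticky = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.8, 0.1, 0.1]]
print(steady_state_entropy(uniform))  # log2(3), about 1.585 bits
print(steady_state_entropy(sticky))   # noticeably lower
```

Uniform state visitation attains the maximum entropy log2(|S|), which is exactly why the paper takes steady-state entropy as the target for reward-free exploration.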
Old tricks, new dogs : ethology and interactive creatures
Thesis (Ph.D.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1997. Includes bibliographical references (p. 135-140). By Bruce Mitchell Blumberg.
Neuroevolution in Games: State of the Art and Open Challenges
This paper surveys research on applying neuroevolution (NE) to games. In
neuroevolution, artificial neural networks are trained through evolutionary
algorithms, taking inspiration from the way biological brains evolved. We
analyse the application of NE in games along five different axes, which are the
role NE is chosen to play in a game, the different types of neural networks
used, the way these networks are evolved, how the fitness is determined and
what type of input the network receives. The article also highlights important
open research challenges in the field.
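The most direct form of neuroevolution the survey covers, evolving the weights of a fixed-topology network, can be sketched as a (1+1) evolution strategy on a toy XOR task. All names here are illustrative; NE work in games typically uses much richer encodings (e.g. NEAT's evolving topologies) and game-derived fitness functions:

```python
import math
import random

def forward(w, x):
    """Fixed 2-2-1 tanh network; w is a flat genome of 9 weights:
    w[0:4] hidden weights, w[4:6] hidden biases, w[6:9] output layer."""
    h = [math.tanh(w[2 * i] * x[0] + w[2 * i + 1] * x[1] + w[4 + i])
         for i in range(2)]
    return math.tanh(w[6] * h[0] + w[7] * h[1] + w[8])

# XOR with +/-1 encoding as the fitness environment.
XOR = [((-1, -1), -1), ((-1, 1), 1), ((1, -1), 1), ((1, 1), -1)]

def fitness(w):
    return -sum((forward(w, x) - y) ** 2 for x, y in XOR)

def evolve(generations=3000, sigma=0.3, seed=0):
    """(1+1)-ES: mutate the genome with Gaussian noise and keep the
    child only if it is at least as fit, so fitness never decreases."""
    rng = random.Random(seed)
    best = [rng.gauss(0, 1) for _ in range(9)]
    best_f = fitness(best)
    for _ in range(generations):
        child = [wi + rng.gauss(0, sigma) for wi in best]
        f = fitness(child)
        if f >= best_f:
            best, best_f = child, f
    return best, best_f

weights, score = evolve()
print(score)  # approaches 0 (perfect) as XOR is learned
```

This is the "direct encoding, weight evolution" corner of the survey's design space; the other axes (role in the game, network type, evolution method, fitness, input) each replace one piece of this sketch.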
Perceptual abstraction and attention
This is a report on the preliminary achievements of WP4 of the IM-CleVeR project on abstraction for cumulative learning, in particular directed to: (1) producing algorithms to develop abstraction features under top-down action influence; (2) algorithms for supporting detection of change in motion pictures; (3) developing attention and vergence control on the basis of locally computed rewards; (4) searching abstract representations suitable for the LCAS framework; (5) developing predictors based on information theory to support novelty detection. The report is organized around these 5 tasks that are part of WP4. We provide a synthetic description of the work done for each task by the partners