Search CORE

13,596 research outputs found

Graphical Object-Centric Actor-Critic

Author: Panov Aleksandr I.
Ugadiarov Leonid
Publication venue
Publication date: 26/10/2023
Field of study

There have recently been significant advances in the problem of unsupervised object-centric representation learning and its application to downstream tasks. The latest works support the argument that employing disentangled object representations in image-based object-centric reinforcement learning tasks facilitates policy learning. We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches to utilize these representations effectively. In our approach, we use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment. The proposed method fills a research gap in developing efficient object-centric world models for reinforcement learning settings that can be used for environments with discrete or continuous action spaces. Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm built upon transformer architecture and the state-of-the-art monolithic model-based algorithm

arXiv.org e-Print Archive

Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog

Author: Batra Dhruv
Kottur Satwik
Lee Stefan
Moura José M. F.
Publication venue
Publication date: 01/01/2017
Field of study

A number of recent works have proposed techniques for end-to-end learning of communication protocols among cooperative multi-agent populations, and have simultaneously found the emergence of grounded human-interpretable language in the protocols developed by the agents, all learned without any human supervision! In this paper, using a Task and Tell reference game between two agents as a testbed, we present a sequence of 'negative' results culminating in a 'positive' one -- showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional. In essence, we find that natural language does not emerge 'naturally', despite the semblance of ease of natural-language-emergence that one may gather from recent literature. We discuss how it is possible to coax the invented languages to become more and more human-like and compositional by increasing restrictions on how two agents may communicate.Comment: 9 pages, 7 figures, 2 tables, accepted at EMNLP 2017 as short pape

arXiv.org e-Print Archive

Crossref

Projective simulation for artificial intelligence

Author: Andy Clark
AP Hines
D Ormoneit
DanielL Schacter
David Deutsch
EdwardC Tolman
Ende Tulving
Germund Hesslow
Hendrik Weimer
Igor Antonov
JT Barreiro
Julia Kempe
Jun Tani
Lev Grover
Long-Ji Lin
M Toussaint
MartinV Butz
PreetiS Sareen
R Parr
Richard Feynman
RS Sutton
Sebastian Diehl
TG Dietterich
Publication venue
Publication date: 08/02/2013
Field of study

We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation.Comment: 22 pages, 18 figures. Close to published version, with footnotes retaine

arXiv.org e-Print Archive

Crossref

PubMed Central

CompILE: Compositional Imitation Learning and Execution

Author: Battaglia Peter
Dai Hanjun
Grefenstette Edward
Kipf Thomas
Kohli Pushmeet
Li Yujia
Sanchez-Gonzalez Alvaro
Zambaldi Vinicius
Publication venue
Publication date: 01/01/2019
Field of study

We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data. CompILE uses a novel unsupervised, fully-differentiable sequence segmentation module to learn latent encodings of sequential data that can be re-composed and executed to perform new tasks. Once trained, our model generalizes to sequences of longer length and from environment instances not seen during training. We evaluate CompILE in a challenging 2D multi-task environment and a continuous control task, and show that it can find correct task boundaries and event encodings in an unsupervised manner. Latent codes and associated behavior policies discovered by CompILE can be used by a hierarchical agent, where the high-level policy selects actions in the latent code space, and the low-level, task-specific policies are simply the learned decoders. We found that our CompILE-based agent could learn given only sparse rewards, where agents without task-specific policies struggle.Comment: ICML (2019

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE