Search CORE

184 research outputs found

Scaling all-goals updates in reinforcement learning using convolutional neural networks

Author: Kormushev P
Levdik V
Pardo F
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 25/11/2019
Field of study

Being able to reach any desired location in the environmentcan be a valuable asset for an agent. Learning a policy to nav-igate between all pairs of states individually is often not fea-sible. Anall-goals updatingalgorithm uses each transitionto learn Q-values towards all goals simultaneously and off-policy. However the expensive numerous updates in parallellimited the approach to small tabular cases so far. To tacklethis problem we propose to use convolutional network archi-tectures to generate Q-values and updates for a large numberof goals at once. We demonstrate the accuracy and generaliza-tion qualities of the proposed method on randomly generatedmazes and Sokoban puzzles. In the case of on-screen goalcoordinates the resulting mapping from frames todistance-mapsdirectly informs the agent about which places are reach-able and in how many steps. As an example of applicationwe show that replacing the random actions inε-greedy ex-ploration by several actions towards feasible goals generatesbetter exploratory trajectories on Montezuma’s Revenge andSuper Mario All-Stars games

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications

Building Machines That Learn and Think Like People

Author: Gershman Samuel J.
Lake Brenden M.
Tenenbaum Joshua B.
Ullman Tomer D.
Publication venue
Publication date: 01/04/2016
Field of study

Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.Comment: In press at Behavioral and Brain Sciences. Open call for commentary proposals (until Nov. 22, 2016). https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar

arXiv.org e-Print Archive

DSpace@MIT

Learning to Identify Critical States for Reinforcement Learning from Videos

Author: Faccio Francesco
Ghanem Bernard
Li Bing
Liu Haozhe
Schmidhuber Jürgen
Wang Yuhui
Zhuge Mingchen
Publication venue
Publication date: 15/08/2023
Field of study

Recent work on deep reinforcement learning (DRL) has pointed out that algorithmic information about good policies can be extracted from offline data which lack explicit information about executed actions. For example, videos of humans or robots may convey a lot of implicit information about rewarding action sequences, but a DRL machine that wants to profit from watching such videos must first learn by itself to identify and recognize relevant states/actions/rewards. Without relying on ground-truth annotations, our new method called Deep State Identifier learns to predict returns from episodes encoded as videos. Then it uses a kind of mask-based sensitivity analysis to extract/identify important critical states. Extensive experiments showcase our method's potential for understanding and improving agent behavior. The source code and the generated datasets are available at https://github.com/AI-Initiative-KAUST/VideoRLCS.Comment: This paper was accepted to ICCV2

arXiv.org e-Print Archive

Machine learning algorithms for structured decision making

Author: Houthooft Rein
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2017
Field of study

Ghent University Academic Bibliography