1,624 research outputs found
Beyond the One Step Greedy Approach in Reinforcement Learning
The famous Policy Iteration algorithm alternates between policy improvement
and policy evaluation. Implementations of this algorithm with several variants
of the latter evaluation stage, e.g, -step and trace-based returns, have
been analyzed in previous works. However, the case of multiple-step lookahead
policy improvement, despite the recent increase in empirical evidence of its
strength, has to our knowledge not been carefully analyzed yet. In this work,
we introduce the first such analysis. Namely, we formulate variants of
multiple-step policy improvement, derive new algorithms using these definitions
and prove their convergence. Moreover, we show that recent prominent
Reinforcement Learning algorithms are, in fact, instances of our framework. We
thus shed light on their empirical success and give a recipe for deriving new
algorithms for future study.Comment: ICML 201
Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in
building systems that learn and think like people. Many advances have come from
using deep neural networks trained end-to-end in tasks such as object
recognition, video games, and board games, achieving performance that equals or
even beats humans in some respects. Despite their biological inspiration and
performance achievements, these systems differ from human intelligence in
crucial ways. We review progress in cognitive science suggesting that truly
human-like learning and thinking machines will have to reach beyond current
engineering trends in both what they learn, and how they learn it.
Specifically, we argue that these machines should (a) build causal models of
the world that support explanation and understanding, rather than merely
solving pattern recognition problems; (b) ground learning in intuitive theories
of physics and psychology, to support and enrich the knowledge that is learned;
and (c) harness compositionality and learning-to-learn to rapidly acquire and
generalize knowledge to new tasks and situations. We suggest concrete
challenges and promising routes towards these goals that can combine the
strengths of recent neural network advances with more structured cognitive
models.Comment: In press at Behavioral and Brain Sciences. Open call for commentary
proposals (until Nov. 22, 2016).
https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar
Exploring algorithms to recognize similar board states in Arimaa
The game of Arimaa was invented as a challenge to the field of game-playing artificial intelligence, which had grown somewhat haughty after IBM\u27s supercomputer Deep Blue trounced world champion Kasparov at chess. Although Arimaa is simple enough for a child to learn and can be played with an ordinary chess set, existing game-playing algorithms and techniques have had a difficult time rising up to the challenge of defeating the world\u27s best human Arimaa players, mainly due to the game\u27s impressive branching factor. This thesis introduces and analyzes new algorithms and techniques that attempt to recognize similar board states based on relative piece strength in a concentrated area of the board. Using this data, game-playing programs would be able to recognize patterns in order to discern tactics and moves that could lead to victory or defeat in similar situations based on prior experience
- …