1,624 research outputs found

    Beyond the One Step Greedy Approach in Reinforcement Learning

    Get PDF
    The famous Policy Iteration algorithm alternates between policy improvement and policy evaluation. Implementations of this algorithm with several variants of the latter evaluation stage, e.g, nn-step and trace-based returns, have been analyzed in previous works. However, the case of multiple-step lookahead policy improvement, despite the recent increase in empirical evidence of its strength, has to our knowledge not been carefully analyzed yet. In this work, we introduce the first such analysis. Namely, we formulate variants of multiple-step policy improvement, derive new algorithms using these definitions and prove their convergence. Moreover, we show that recent prominent Reinforcement Learning algorithms are, in fact, instances of our framework. We thus shed light on their empirical success and give a recipe for deriving new algorithms for future study.Comment: ICML 201

    Building Machines That Learn and Think Like People

    Get PDF
    Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.Comment: In press at Behavioral and Brain Sciences. Open call for commentary proposals (until Nov. 22, 2016). https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar

    Exploring algorithms to recognize similar board states in Arimaa

    Get PDF
    The game of Arimaa was invented as a challenge to the field of game-playing artificial intelligence, which had grown somewhat haughty after IBM\u27s supercomputer Deep Blue trounced world champion Kasparov at chess. Although Arimaa is simple enough for a child to learn and can be played with an ordinary chess set, existing game-playing algorithms and techniques have had a difficult time rising up to the challenge of defeating the world\u27s best human Arimaa players, mainly due to the game\u27s impressive branching factor. This thesis introduces and analyzes new algorithms and techniques that attempt to recognize similar board states based on relative piece strength in a concentrated area of the board. Using this data, game-playing programs would be able to recognize patterns in order to discern tactics and moves that could lead to victory or defeat in similar situations based on prior experience
    corecore