Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups

Author: Lanctot Marc
Pepels Tom
Sturtevant Nathan R.
Winands Mark H. M.
Publication date: 01/01/2014

Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic alpha-beta search in games where good heuristic evaluations are difficult to obtain. In recent years, combining ideas from traditional minimax search in MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough. In this paper, we propose a new way to use heuristic evaluations to guide the MCTS search by storing the two sources of information, estimated win rates and heuristic evaluations, separately. Rather than using the heuristic evaluations to replace the playouts, our technique backs them up implicitly during the MCTS simulations. These minimax values are then used to guide future simulations. We show that using implicit minimax backups leads to stronger play performance in Kalah, Breakthrough, and Lines of Action. Comment: 24 pages, 7 figures, 9 tables, expanded version of paper presented at IEEE Conference on Computational Intelligence and Games (CIG) 2014 conference
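
The idea of keeping win rates and heuristic values separately can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` layout, the convention that all values are from the root player's viewpoint, and the mixing weight `alpha` and exploration constant `c` in `selection_score` are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    maximizing: bool          # True if the root player moves at this node
    heuristic: float          # static evaluation in [-1, 1], root player's view
    visits: int = 0
    reward_sum: float = 0.0   # sum of playout results, root player's view
    minimax: float = field(default=0.0, init=False)  # implicit minimax value
    children: List["Node"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A node with no expanded children is valued by its own heuristic.
        self.minimax = self.heuristic


def backup(path: List[Node], playout_result: float) -> None:
    """Update both statistics along the simulated path, leaf to root."""
    for node in reversed(path):
        # Source 1: the usual Monte Carlo win-rate statistics.
        node.visits += 1
        node.reward_sum += playout_result
        # Source 2: minimax over the children's heuristic-based values.
        if node.children:
            values = [child.minimax for child in node.children]
            node.minimax = max(values) if node.maximizing else min(values)


def selection_score(parent: Node, child: Node,
                    alpha: float = 0.4, c: float = 0.7) -> float:
    """UCT-style score mixing win rate and minimax value (assumed form)."""
    if child.visits == 0:
        return math.inf  # try unvisited children first
    sign = 1.0 if parent.maximizing else -1.0
    win_rate = child.reward_sum / child.visits
    mixed = (1.0 - alpha) * win_rate + alpha * child.minimax
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return sign * mixed + explore
```

In this sketch the playout result never overwrites the heuristic-based value, so a child with a lucky playout but a poor minimax value is ranked lower than its win rate alone would suggest.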

arXiv.org e-Print Archive

CiteSeerX

Detecting and Characterizing Small Dense Bipartite-like Subgraphs by the Bipartiteness Ratio Measure

Author: A. Sinclair
J. Leskovec
L. Lovász
M. Charikar
N. Alon
P. Peng
R. Andersen
R. Kannan
R. O’Donnell
S. Khuller
S.-H. Teng
T.C. Kwok
U. Feige
Publication date: 01/01/2013

We study the problem of finding and characterizing subgraphs with small \textit{bipartiteness ratio}. We give a bicriteria approximation algorithm \verb|SwpDB| such that if there exists a subset $S$ of volume at most $k$ and bipartiteness ratio $\theta$, then for any $0<\epsilon<1/2$, it finds a set $S'$ of volume at most $2k^{1+\epsilon}$ and bipartiteness ratio at most $4\sqrt{\theta/\epsilon}$. By combining a truncation operation, we give a local algorithm \verb|LocDB|, which has asymptotically the same approximation guarantee as the algorithm \verb|SwpDB| on both the volume and bipartiteness ratio of the output set, and runs in time $O(\epsilon^2\theta^{-2}k^{1+\epsilon}\ln^3k)$, independent of the size of the graph. Finally, we give a spectral characterization of the small dense bipartite-like subgraphs by using the $k$th \textit{largest} eigenvalue of the Laplacian of the graph. Comment: 17 pages; ISAAC 201
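
The abstract does not define the bipartiteness ratio. The Python sketch below assumes the commonly used definition for a disjoint pair (L, R) with S = L ∪ R, namely (2·e(L) + 2·e(R) + e(S, V∖S)) / vol(S), where e(·) counts edges and vol(S) is the sum of degrees in S. The function name and the adjacency-set graph representation are hypothetical, and the sketch only evaluates the ratio for a given pair; it is not \verb|SwpDB| or \verb|LocDB|.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def bipartiteness_ratio(adj: Graph, left: Set[Hashable],
                        right: Set[Hashable]) -> float:
    """Bipartiteness ratio of (left, right) under the assumed definition."""
    if left & right:
        raise ValueError("left and right must be disjoint")
    s = left | right
    volume = sum(len(adj[v]) for v in s)  # vol(S): sum of degrees in S
    if volume == 0:
        raise ValueError("S must have positive volume")
    # Each internal edge is seen from both endpoints, hence the division by 2.
    inside_left = sum(1 for u in left for v in adj[u] if v in left) // 2
    inside_right = sum(1 for u in right for v in adj[u] if v in right) // 2
    # Edges with exactly one endpoint in S.
    leaving = sum(1 for u in s for v in adj[u] if v not in s)
    return (2 * inside_left + 2 * inside_right + leaving) / volume


# A 4-cycle split into its two colour classes is perfectly bipartite.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
ratio_cycle = bipartiteness_ratio(cycle, {0, 2}, {1, 3})

# A triangle always keeps one edge inside a side.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
ratio_triangle = bipartiteness_ratio(triangle, {0, 1}, {2})
```

Under this definition a ratio of 0 means S induces a bipartite graph with sides L and R and has no edges leaving it, so small values correspond to the "bipartite-like" subgraphs the abstract refers to.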

arXiv.org e-Print Archive

Learn to Interpret Atari Agents

Author: Bai Song
Torr Philip H. S.
Yang Zhao
Zhang Li
Publication date: 24/01/2019

Deep Reinforcement Learning (DeepRL) agents surpass human-level performances in a multitude of tasks. However, the direct mapping from states to actions makes it hard to interpret the rationale behind the decision making of agents. In contrast to previous a-posteriori methods of visualizing DeepRL policies, we propose an end-to-end trainable framework based on Rainbow, a representative Deep Q-Network (DQN) agent. Our method automatically learns important regions in the input domain, which enables characterizations of the decision making and interpretations for non-intuitive behaviors. Hence we name it Region Sensitive Rainbow (RS-Rainbow). RS-Rainbow utilizes a simple yet effective mechanism to incorporate visualization ability into the learning model, not only improving model interpretability, but leading to improved performance. Extensive experiments on the challenging platform of Atari 2600 demonstrate the superiority of RS-Rainbow. In particular, our agent achieves state of the art at just 25% of the training frames. Demonstrations and code are available at https://github.com/yz93/Learn-to-Interpret-Atari-Agents

arXiv.org e-Print Archive