Search CORE

3,027 research outputs found

A multi-paradigm language for reactive synthesis

Author: Filippidis Ioannis
Holzmann Gerard J.
Murray Richard M.
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2016
Field of study

This paper proposes a language for describing reactive synthesis problems that integrates imperative and declarative elements. The semantics is defined in terms of two-player turn-based infinite games with full information. Currently, synthesis tools accept linear temporal logic (LTL) as input, but this description is less structured and does not facilitate the expression of sequential constraints. This motivates the use of a structured programming language to specify synthesis problems. Transition systems and guarded commands serve as imperative constructs, expressed in a syntax based on that of the modeling language Promela. The syntax allows defining which player controls data and control flow, and separating a program into assumptions and guarantees. These notions are necessary for input to game solvers. The integration of imperative and declarative paradigms allows using the paradigm that is most appropriate for expressing each requirement. The declarative part is expressed in the LTL fragment of generalized reactivity(1), which admits efficient synthesis algorithms, extended with past LTL. The implementation translates Promela to input for the Slugs synthesizer and is written in Python. The AMBA AHB bus case study is revisited and synthesized efficiently, identifying the need to reorder binary decision diagrams during strategy construction, in order to prevent the exponential blowup observed in previous work.Comment: In Proceedings SYNT 2015, arXiv:1602.0078

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Caltech Authors

A Complete Solver for Constraint Games

Author: Lallouet Arnaud
Nguyen Thi-Van-Anh
Publication venue
Publication date: 01/01/2014
Field of study

Game Theory studies situations in which multiple agents having conflicting objectives have to reach a collective decision. The question of a compact representation language for agents utility function is of crucial importance since the classical representation of a

n

-players game is given by a

n

-dimensional matrix of exponential size for each player. In this paper we use the framework of Constraint Games in which CSP are used to represent utilities. Constraint Programming --including global constraints-- allows to easily give a compact and elegant model to many useful games. Constraint Games come in two flavors: Constraint Satisfaction Games and Constraint Optimization Games, the first one using satisfaction to define boolean utilities. In addition to multimatrix games, it is also possible to model more complex games where hard constraints forbid certain situations. In this paper we study complete search techniques and show that our solver using the compact representation of Constraint Games is faster than the classical game solver Gambit by one to two orders of magnitude.Comment: 17 page

arXiv.org e-Print Archive

Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

Author: Browne Cameron
Piette Éric
Soemers Dennis J. N. J.
Stephenson Matthew
Publication venue
Publication date: 01/01/2019
Field of study

In recent years, state-of-the-art game-playing agents often involve policies that are trained in self-playing processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates, rather than MCTS visit counts. We empirically evaluate various properties of resulting policies, in a variety of board games.Comment: Accepted at the IEEE Conference on Games (CoG) 201

arXiv.org e-Print Archive