12,574 research outputs found

    Abstraction refinement for games with incomplete information

    Get PDF
    Counterexample-guided abstraction refinement (CEGAR) is used in automated software analysis to find suitable finite-state abstractions of infinite-state systems. In this paper, we extend CEGAR to games with incomplete information, as they commonly occur in controller synthesis and modular verification. The challenge is that, under incomplete information, one must carefully account for the knowledge available to the player: the strategy must not depend on information the player cannot see. We propose an abstraction mechanism for games under incomplete information that incorporates the approximation of the players\' moves into a knowledge-based subset construction on the abstract state space. This abstraction results in a perfect-information game over a finite graph. The concretizability of abstract strategies can be encoded as the satisfiability of strategy-tree formulas. Based on this encoding, we present an interpolation-based approach for selecting new predicates and provide sufficient conditions for the termination of the resulting refinement loop

    Imperfect-Recall Abstractions with Bounds in Games

    Full text link
    Imperfect-recall abstraction has emerged as the leading paradigm for practical large-scale equilibrium computation in incomplete-information games. However, imperfect-recall abstractions are poorly understood, and only weak algorithm-specific guarantees on solution quality are known. In this paper, we show the first general, algorithm-agnostic, solution quality guarantees for Nash equilibria and approximate self-trembling equilibria computed in imperfect-recall abstractions, when implemented in the original (perfect-recall) game. Our results are for a class of games that generalizes the only previously known class of imperfect-recall abstractions where any results had been obtained. Further, our analysis is tighter in two ways, each of which can lead to an exponential reduction in the solution quality error bound. We then show that for extensive-form games that satisfy certain properties, the problem of computing a bound-minimizing abstraction for a single level of the game reduces to a clustering problem, where the increase in our bound is the distance function. This reduction leads to the first imperfect-recall abstraction algorithm with solution quality bounds. We proceed to show a divide in the class of abstraction problems. If payoffs are at the same scale at all information sets considered for abstraction, the input forms a metric space. Conversely, if this condition is not satisfied, we show that the input does not form a metric space. Finally, we use these results to experimentally investigate the quality of our bound for single-level abstraction

    Abstractions and sensor design in partial-information, reactive controller synthesis

    Get PDF
    Automated synthesis of reactive control protocols from temporal logic specifications has recently attracted considerable attention in various applications in, for example, robotic motion planning, network management, and hardware design. An implicit and often unrealistic assumption in this past work is the availability of complete and precise sensing information during the execution of the controllers. In this paper, we use an abstraction procedure for systems with partial observation and propose a formalism to investigate effects of limitations in sensing. The abstraction procedure enables the existing synthesis methods with partial observation to be applicable and efficient for systems with infinite (or finite but large number of) states. This formalism enables us to systematically discover sensing modalities necessary in order to render the underlying synthesis problems feasible. We use counterexamples, which witness unrealizability potentially due to the limitations in sensing and the coarseness in the abstract system, and interpolation-based techniques to refine the model and the sensing modalities, i.e., to identify new sensors to be included, in such synthesis problems. We demonstrate the method on examples from robotic motion planning.Comment: 9 pages, 4 figures, Accepted at American Control Conference 201

    No-Regret Learning in Extensive-Form Games with Imperfect Recall

    Full text link
    Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive games. CFR's regret bounds depend on the requirement of perfect recall: players always remember information that was revealed to them and the order in which it was revealed. In games without perfect recall, however, CFR's guarantees do not apply. In this paper, we present the first regret bound for CFR when applied to a general class of games with imperfect recall. In addition, we show that CFR applied to any abstraction belonging to our general class results in a regret bound not just for the abstract game, but for the full game as well. We verify our theory and show how imperfect recall can be used to trade a small increase in regret for a significant reduction in memory in three domains: die-roll poker, phantom tic-tac-toe, and Bluff.Comment: 21 pages, 4 figures, expanded version of article to appear in Proceedings of the Twenty-Ninth International Conference on Machine Learnin

    Dynamic real-time hierarchical heuristic search for pathfinding.

    Get PDF
    Movement of Units in Real-Time Strategy (RTS) Games is a non-trivial and challenging task mainly due to three factors which are constraints on CPU and memory usage, dynamicity of the game world, and concurrency. In this paper, we are focusing on finding a novel solution for solving the pathfinding problem in RTS Games for the units which are controlled by the computer. The novel solution combines two AI Planning approaches: Hierarchical Task Network (HTN) and Real-Time Heuristic Search (RHS). In the proposed solution, HTNs are used as a dynamic abstraction of the game map while RHS works as planning engine with interleaving of plan making and action executions. The article provides algorithmic details of the model while the empirical details of the model are obtained by using a real-time strategy game engine called ORTS (Open Real-time Strategy). The implementation of the model and its evaluation methods are in progress however the results of the automatic HTN creation are obtained for a small scale game map

    Solving Games with Functional Regret Estimation

    Full text link
    We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and regret of the algorithm. A corollary being that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction as well as the equilibrium are learned during self-play. We demonstrate empirically the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources.Comment: AAAI Conference on Artificial Intelligence 201

    Smoothing Method for Approximate Extensive-Form Perfect Equilibrium

    Full text link
    Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have experienced little practical adoption. This is largely due to a lack of scalable algorithms. Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This has benefits both in approximate equilibrium finding (such approximation is necessary in practice in large games) where some probabilities are low while possibly heading toward zero in the limit, and exact equilibrium computation where the low probabilities are actually zero.Comment: Published at IJCAI 1

    Controllability in partial and uncertain environments

    Get PDF
    © 2014 IEEE.Controller synthesis is a well studied problem that attempts to automatically generate an operational behaviour model of the system-to-be that satisfies a given goal when deployed in a given domain model that behaves according to specified assumptions. A limitation of many controller synthesis techniques is that they require complete descriptions of the problem domain. This is limiting in the context of modern incremental development processes when a fully described problem domain is unavailable, undesirable or uneconomical. Previous work on Modal Transition Systems (MTS) control problems exists, however it is restricted to deterministic MTSs and deterministic Labelled Transition Systems (LTS) implementations. In this paper we study the Modal Transition System Control Problem in its full generality, allowing for nondeterministic MTSs modelling the environments behaviour and nondeterministic LTS implementations. Given an nondeterministic MTS we ask if all, none or some of the nondeterministic LTSs it describes admit an LTS controller that guarantees a given property. We show a technique that solves effectively the MTS realisability problem and it can be, in some cases, reduced to deterministic control problems. In all cases the MTS realisability problem is in same complexity class as the corresponding LTS problem
    • …
    corecore