Zero-sum stopping games with asymmetric information

Authors: Gensbittel Fabien; Grün Christine
Publication date: 01/11/2017

We study a model of two-player, zero-sum stopping games with asymmetric information. We assume that the payoff depends on two continuous-time Markov chains (X, Y), where X is observed only by player 1 and Y only by player 2, so that the players have access to stopping times with respect to different filtrations. We show the existence of a value in mixed stopping times and provide a variational characterization of the value as a function of the initial distribution of the Markov chains. We also prove a verification theorem for optimal stopping rules, which allows us to construct optimal stopping times. Finally, we use our results to solve two generic examples explicitly.

arXiv.org e-Print Archive

Toulouse Capitole Publications

Toulouse 1 Capitole Publications

Traditional Wisdom and Monte Carlo Tree Search Face-to-Face in the Card Game Scopone

Authors: Di Palma Stefano; Lanzi Pier Luca
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

We present the design of a competitive artificial intelligence for Scopone, a popular Italian card game. We compare rule-based players using the most established strategies (one for beginners and two for advanced players) against players using Monte Carlo Tree Search (MCTS) and Information Set Monte Carlo Tree Search (ISMCTS) with different reward functions and simulation strategies. MCTS requires complete information about the game state and thus implements a cheating player, while ISMCTS can deal with incomplete information and thus implements a fair player. Our results show that, as expected, the cheating MCTS outperforms all the other strategies; ISMCTS is stronger than all the rule-based players implementing well-known and most advanced strategies, and it also turns out to be a challenging opponent for human players. Comment: Preprint. Accepted for publication in the IEEE Transaction on Game
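To illustrate why ISMCTS can play fairly under incomplete information, here is a minimal Python sketch of the determinization step that ISMCTS-style methods commonly use: before each simulation, the hidden cards are dealt at random in a way consistent with what the searching player knows. The function name `sample_determinization`, the card encoding, and the player labels are hypothetical and are not taken from the paper.

```python
import random


def sample_determinization(unseen_cards, hand_sizes, rng):
    """Deal the cards the observer cannot see into the hidden hands.

    unseen_cards: cards not visible to the observing player (assumed encoding).
    hand_sizes:   mapping from hidden player to the number of cards they hold.
    Returns (hands, leftover), where leftover holds any unseen cards not dealt.
    """
    cards = list(unseen_cards)
    if sum(hand_sizes.values()) > len(cards):
        raise ValueError("not enough unseen cards for the requested hands")
    rng.shuffle(cards)  # uniform random assignment of hidden cards
    hands, start = {}, 0
    for player, size in hand_sizes.items():
        hands[player] = cards[start:start + size]
        start += size
    return hands, cards[start:]


# Hypothetical situation: seven unseen cards, two opponents holding three each.
rng = random.Random(0)
unseen = ["1C", "2C", "3C", "4C", "5C", "6C", "7C"]
hands, leftover = sample_determinization(unseen, {"east": 3, "west": 3}, rng)
```

A cheating MCTS player would skip this step and search the true hidden state directly, which is why it needs complete information.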

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Synthesis of surveillance strategies via belief abstraction

Authors: Bharadwaj S.; Dimitrova R.; Topcu U.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/03/2018

We provide a novel framework for the synthesis of a controller for a robot with a surveillance objective, that is, the robot is required to maintain knowledge of the location of a moving, possibly adversarial target. We formulate this problem as a one-sided partial-information game in which the winning condition for the agent is specified as a temporal logic formula. The specification formalizes the surveillance requirement given by the user by quantifying and reasoning over the agent's beliefs about a target's location. We also incorporate additional non-surveillance tasks. In order to synthesize a surveillance strategy that meets the specification, we transform the partial-information game into a perfect-information one, using abstraction to mitigate the exponential blow-up typically incurred by such transformations. This transformation enables the use of off-the-shelf tools for reactive synthesis. We evaluate the proposed method on two case studies, demonstrating its applicability to diverse surveillance requirements.
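The belief-based transformation described above can be pictured with a small Python sketch. It assumes a grid-like world in which a belief is the set of cells where the target might be, and the abstraction groups cells into coarser regions. The names `update_belief`, `abstract_belief`, `moves`, and `region_of` are hypothetical; this shows the general subset-construction idea, not the authors' actual construction.

```python
def update_belief(belief, moves, visible, target_seen_at=None):
    """One step of belief tracking over the target's possible locations.

    belief:  set of cells the target may currently occupy.
    moves:   mapping cell -> set of cells the target can move to in one step.
    visible: set of cells the agent can currently observe.
    """
    if target_seen_at is not None:
        # A direct observation collapses the belief to a single cell.
        return frozenset({target_seen_at})
    successors = {nxt for cell in belief for nxt in moves[cell]}
    # The target was not seen, so it cannot be in any visible cell.
    return frozenset(successors - set(visible))


def abstract_belief(belief, region_of):
    """Coarsen a belief to the set of regions it touches (assumed partition)."""
    return frozenset(region_of[cell] for cell in belief)


# Hypothetical corridor with four cells, split into two regions.
moves = {0: {0, 1}, 1: {0, 1, 2}, 2: {1, 2, 3}, 3: {2, 3}}
region_of = {0: "A", 1: "A", 2: "B", 3: "B"}
belief = update_belief({1}, moves, visible={0})
coarse = abstract_belief(belief, region_of)
```

Tracking sets of cells exactly can require exponentially many belief states, whereas the coarse version ranges only over sets of regions, which is the kind of saving the abstraction is meant to provide.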

arXiv.org e-Print Archive

Crossref

White Rose Research Online

Leicester Research Archive

Perseus: Randomized Point-based Value Iteration for POMDPs

Authors: Spaan M. T. J.; Vlassis N.
Publication venue: AI Access Foundation
Publication date: 09/09/2011

Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to deal with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
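The backup stage described in this abstract can be sketched in plain Python for a small discrete POMDP. The sketch assumes the usual alpha-vector representation of the value function; the array layout (`T[a][s][s2]`, `O[a][s2][o]`, `R[a][s]`), the function names, and the two-state example model are assumptions made for illustration and are not taken from the paper.

```python
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(b, a) for a in alphas)


def backup(b, alphas, T, O, R, gamma):
    """Point-based Bellman backup at belief b; returns one alpha vector.

    T[a][s][s2] = P(s2 | s, a), O[a][s2][o] = P(o | s2, a), R[a][s] = reward.
    """
    n = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(len(O[a][0])):
            # Project every alpha vector back through (a, o), keep the best at b.
            projected = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n))
                 for s in range(n)]
                for alpha in alphas
            ]
            g_ao = max(projected, key=lambda g: dot(b, g))
            g_a = [x + gamma * y for x, y in zip(g_a, g_ao)]
        if dot(b, g_a) > best_val:
            best_vec, best_val = g_a, dot(b, g_a)
    return best_vec


def perseus_stage(beliefs, alphas, T, O, R, gamma, rng):
    """One randomized backup stage: no belief in the set loses value."""
    old_values = [value(b, alphas) for b in beliefs]
    new_alphas = []
    todo = list(range(len(beliefs)))  # beliefs whose value is not yet restored
    while todo:
        i = rng.choice(todo)
        alpha = backup(beliefs[i], alphas, T, O, R, gamma)
        if dot(beliefs[i], alpha) < old_values[i]:
            # The backup did not help here: keep the best old vector instead.
            alpha = max(alphas, key=lambda v: dot(beliefs[i], v))
        new_alphas.append(alpha)
        todo = [j for j in todo
                if value(beliefs[j], new_alphas) < old_values[j]]
    return new_alphas


# Hypothetical two-state, two-action, two-observation model.
T = [[[0.9, 0.1], [0.1, 0.9]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 0.5]]
gamma = 0.9
beliefs = [[p, 1.0 - p] for p in (0.0, 0.25, 0.5, 0.75, 1.0)]
alphas0 = [[0.0, 0.0]]  # min reward / (1 - gamma) in every state
alphas1 = perseus_stage(beliefs, alphas0, T, O, R, gamma, random.Random(0))
```

Because one new vector can restore the value of several beliefs at once, the `todo` list often empties after far fewer backups than there are belief points, so the new vector set is never larger than the belief set.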

arXiv.org e-Print Archive

Crossref

Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

Authors: Alsheikh Mohammad Abu; Hoang Dinh Thai; Lin Shaowei; Niyato Dusit; Tan Hwee-Pink
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 04/01/2015

Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices. The devices cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
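As an illustration of how an MDP can model a sensor node's decisions, the following Python sketch runs value iteration on a toy sleep-or-sense problem. Value iteration is a standard MDP solution method; the abstract does not say which methods the survey covers, and the states, actions, probabilities, and rewards below are entirely hypothetical.

```python
def q_value(P, R, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """P[a][s][s2] = transition probability, R[a][s] = expected reward."""
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [max(q_value(P, R, V, s, a, gamma) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(P, R, V, s, a, gamma))
              for s in range(n_states)]
    return V, policy


# Hypothetical node: states 0 = low battery, 1 = high battery;
# actions 0 = sleep (recharges, no reward), 1 = sense (drains, earns reward).
P = [
    [[0.4, 0.6], [0.0, 1.0]],  # sleep
    [[1.0, 0.0], [0.5, 0.5]],  # sense
]
R = [
    [0.0, 0.0],  # sleep
    [0.2, 1.0],  # sense
]
V, policy = value_iteration(P, R, gamma=0.9)
```

The returned `policy` lists one action index per battery state, which is the kind of adaptive rule a resource-limited node could apply online.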

arXiv.org e-Print Archive

University of Canberra Research Repository

An Investigation Report on Auction Mechanism Design

Authors: Niu Jinzhong; Parsons Simon
Publication date: 01/01/2009

Auctions are markets with strict regulations governing the information available to traders in the market and the possible actions they can take. Since well-designed auctions achieve desirable economic outcomes, they have been widely used in solving real-world optimization problems, and in structuring stock or futures exchanges. Auctions also provide a very valuable testing ground for economic theory, and they play an important role in computer-based control systems. Auction mechanism design aims to manipulate the rules of an auction in order to achieve specific goals. Economists traditionally use mathematical methods, mainly game theory, to analyze auctions and design new auction forms. However, due to the high complexity of auctions, the mathematical models are typically simplified to obtain results, and this makes it difficult to apply results derived from such models to market environments in the real world. As a result, researchers are turning to empirical approaches. This report aims to survey the theoretical and empirical approaches to designing auction mechanisms and trading strategies, with more weight on empirical ones, and to build the foundation for further research in the field.

arXiv.org e-Print Archive

CiteSeerX

City University of New York