1,195 research outputs found

    A survey of random processes with reinforcement

    The models surveyed include generalized Pólya urns, reinforced random walks, interacting urn models, and continuous reinforced processes. Emphasis is on methods and results, with sketches provided of some proofs. Applications are discussed in statistics, biology, economics and a number of other areas. Comment: Published at http://dx.doi.org/10.1214/07-PS094 in the Probability Surveys (http://www.i-journals.org/ps/) by the Institute of Mathematical Statistics (http://www.imstat.org)
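
As a rough illustration of the simplest model in this family, the sketch below simulates a classical two-colour Pólya urn: a ball is drawn at random and returned together with one extra ball of the same colour, so early draws reinforce themselves. The function name `polya_urn`, the starting composition, and the number of draws are illustrative assumptions and are not taken from the survey.

```python
import random


def polya_urn(red, black, steps, rng):
    """Simulate a classical two-colour Polya urn and return the final counts.

    At each step a ball is drawn uniformly at random and put back together
    with one additional ball of the same colour (the reinforcement).
    """
    for _ in range(steps):
        if rng.random() < red / (red + black):
            red += 1      # drew red, so red is reinforced
        else:
            black += 1    # drew black, so black is reinforced
    return red, black


# Hypothetical run: start with one ball of each colour, draw 1000 times.
rng = random.Random(0)
red, black = polya_urn(1, 1, 1000, rng)
print("final fraction of red balls:", red / (red + black))
```

Repeating the run with different seeds gives noticeably different limiting fractions, which is the characteristic behaviour of reinforced processes.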

    A pseudo-polynomial algorithm for mean payoff stochastic games with perfect information and few random positions

    We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph G = (V, E), with local rewards r : E → Z, and three types of positions: black V_B, white V_W, and random V_R, forming a partition of V. It is a long-standing open question whether a polynomial time algorithm for BWR-games exists, even when |V_R| = 0. In fact, a pseudo-polynomial algorithm for BWR-games would already imply their polynomial solvability. In this paper, we show that BWR-games with a constant number of random positions can be solved in pseudo-polynomial time. More precisely, in any BWR-game with |V_R| = O(1), a saddle point in uniformly optimal pure stationary strategies can be found in time polynomial in |V_W| + |V_B|, the maximum absolute local reward, and the common denominator of the transition probabilities.
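
For intuition about the quantity being optimised, consider the special case with no random positions and both players committed to pure stationary strategies. The play then follows a single path that eventually enters a cycle, and the mean payoff is the average reward along that cycle. The sketch below computes this value; it is not the paper's algorithm, and the names `mean_payoff`, `choice`, and `reward` as well as the toy graph are assumptions made for illustration.

```python
from fractions import Fraction


def mean_payoff(start, choice, reward):
    """Mean payoff of the play from `start` when position v always moves to choice[v].

    With no random positions the play is a path followed by a cycle that is
    repeated forever, so the long-run average reward equals the cycle mean.
    """
    first_visit = {}   # position -> index at which it first appears on the path
    path = []
    v = start
    while v not in first_visit:
        first_visit[v] = len(path)
        path.append(v)
        v = choice[v]
    cycle = path[first_visit[v]:]          # positions repeated forever
    total = sum(reward[(u, choice[u])] for u in cycle)
    return Fraction(total, len(cycle))


# Hypothetical three-position game: a -> b -> c -> b -> c -> ...
choice = {"a": "b", "b": "c", "c": "b"}
reward = {("a", "b"): 5, ("b", "c"): 2, ("c", "b"): -1}
print(mean_payoff("a", choice, reward))  # 1/2
```

The reward on the initial edge a -> b does not affect the result, because only the cycle contributes to the limit average.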

    Distributed Power Control Techniques Based on Game Theory for Wideband Wireless Networks

    This thesis describes a theoretical framework for the design and the analysis of distributed (decentralized) power control algorithms for high-throughput wireless networks using ultrawideband (UWB) technologies. The tools of game theory are shown to be expedient for deriving scalable, energy-efficient, distributed power control schemes to be applied to a population of battery-operated user terminals in a rich multipath environment. In particular, the power control issue is modeled as a noncooperative game in which each user chooses its transmit power so as to maximize its own utility, which is defined as the ratio of throughput to transmit power. Although distributed (noncooperative) control is known to be suboptimal with respect to the optimal centralized (cooperative) solution, it is shown via large-system analysis that the game-theoretic distributed algorithm based on Nash equilibrium exhibits negligible performance degradation with respect to the centralized socially optimal configuration. The framework described here is general enough to also encompass the analysis of code division multiple access (CDMA) systems and to show that UWB slightly outperforms CDMA in terms of achieved utility at the Nash equilibrium.
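
The utility described above, throughput divided by transmit power, can be sketched for a single user as follows. Several ingredients are assumptions made only for this illustration and are not stated in the abstract: the efficiency function `(1 - exp(-sinr)) ** packet_bits`, a signal-to-interference-plus-noise ratio that is linear in the user's own power, and interference from other users folded into a fixed `noise` term. The function names are hypothetical.

```python
import math


def utility(power, gain=1.0, noise=1.0, rate=1.0, packet_bits=80):
    """Throughput-to-power ratio for one user (all parameters are assumed)."""
    sinr = gain * power / noise
    # Assumed efficiency function: chance that a packet of `packet_bits`
    # bits gets through without error at this SINR.
    efficiency = (1.0 - math.exp(-sinr)) ** packet_bits
    return rate * efficiency / power


def best_response(powers, **channel):
    """Return the power on the grid that maximises the user's own utility."""
    return max(powers, key=lambda p: utility(p, **channel))


grid = [0.01 * k for k in range(1, 2001)]   # candidate powers 0.01 .. 20.00
p_star = best_response(grid)
print(f"utility-maximising power on the grid: {p_star:.2f}")
```

Too little power wastes energy on packets that fail, while too much power buys almost no extra throughput, so the maximiser lies strictly inside the grid. In a multi-user game each terminal would repeat this best response as the interference level changes.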

    Stochastic Game Approach to Air Operations

    A Command and Control (C2) problem for Military Air Operations is addressed. Specifically, we consider C2 problems for air vehicles against ground-based targets and defensive systems. The problem is viewed as a stochastic game. In this paper, we restrict our attention to the C2 level where the problem may consist of a few UCAVs or aircraft (or possibly teams of vehicles); fewer than, say, a half-dozen enemy SAMs; a few enemy assets (viewed as targets from our standpoint); and some enemy decoys (assumed to mimic SAM radar signatures). At this low level, some targets are mapped out and possible SAM sites that are unavoidably part of the situation are known. One may then employ a discrete stochastic game problem formulation to determine which of these SAMs should optimally be engaged (if any), and by what series of air vehicle operations. Since this is a game model, the optimal opponent strategy is also determined. We provide analysis, numerical implementation, and simulation for full state feedback and measurement feedback control within this C2 context.

    Data based identification and prediction of nonlinear and complex dynamical systems

    We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme. Peer reviewed. Postprint.

    Incentive Stackelberg Mean-payoff Games

    We introduce and study incentive equilibria for multi-player mean-payoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equilibria, there is a distinguished player called the leader who can assign strategies to all other players, referred to as her followers. A strategy profile is a leader strategy profile if no player, except for the leader, can improve his payoff by changing his strategy unilaterally, and a leader equilibrium is a leader strategy profile with a maximal return for the leader. In the proposed case of incentive equilibria, the leader can additionally influence the behaviour of her followers by transferring parts of her payoff to her followers. The ability to incentivise her followers provides the leader with more freedom in selecting strategy profiles, and we show that this can indeed improve the payoff for the leader in such games. The key fundamental result of the paper is the existence of incentive equilibria in mean-payoff games. We further show that the decision problem related to constructing incentive equilibria is NP-complete. On a positive note, we show that, when the number of players is fixed, the complexity of the problem falls in the same class as two-player mean-payoff games. We also present an implementation of the proposed algorithms, and discuss experimental results that demonstrate the feasibility of the analysis of medium-sized games. Comment: 15 pages, references, appendix, 5 figures
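
The Nash condition recalled above (no player gains by deviating unilaterally) can be checked directly in a small one-shot game. The sketch below does this for a finite normal-form game rather than a mean-payoff game, which is a deliberate simplification; the helper names and the prisoner's-dilemma payoffs are assumptions chosen for illustration and do not come from the paper.

```python
from itertools import product


def is_nash(profile, actions, payoff):
    """True if no single player can raise their own payoff by deviating alone."""
    for i, own_actions in enumerate(actions):
        current = payoff(profile)[i]
        for a in own_actions:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if payoff(deviation)[i] > current:
                return False
    return True


def pure_nash_equilibria(actions, payoff):
    """Enumerate every pure strategy profile and keep the Nash equilibria."""
    return [p for p in product(*actions) if is_nash(p, actions, payoff)]


# Hypothetical prisoner's dilemma: C = cooperate, D = defect.
PD = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
actions = [("C", "D"), ("C", "D")]
print(pure_nash_equilibria(actions, PD.__getitem__))  # [('D', 'D')]
```

Mutual defection is the only equilibrium here even though mutual cooperation pays both players more, which hints at why a leader who can transfer part of her payoff may steer followers toward better profiles.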

    Bayesian Network Games

    This thesis builds from the realization that Bayesian Nash equilibria are the natural definition of optimal behavior in a network of distributed autonomous agents. Game equilibria are often behavior models of competing rational agents that take actions that are strategic reactions to the predicted actions of other players. In autonomous systems, however, equilibria are used as models of optimal behavior for a different reason: agents are forced to play strategically against inherent uncertainty. While it may be that agents have conflicting intentions, more often than not, their goals are aligned. However, barring unreasonable accuracy of environmental information and unjustifiable levels of coordination, they still can't be sure of what the actions of other agents will be. Agents have to focus their strategic reasoning on what they believe the information available to other agents is, how they think other agents will respond to this hypothetical information, and choose what they deem to be their best response to these uncertain estimates. If agents model the behavior of each other as equally strategic, the optimal response of the network as a whole is a Bayesian Nash equilibrium. We say that the agents are playing a Bayesian network game when they repeatedly act according to a stage Bayesian Nash equilibrium and receive information from their neighbors in the network. The first part of the thesis is concerned with the development and analysis of algorithms that agents can use to compute their equilibrium actions in a game of incomplete information with repeated interactions over a network. In this regard, the burden of computing a Bayesian Nash equilibrium in repeated games is, in general, overwhelming.
This thesis shows that actions are computable in the particular case when the local information that agents receive follows a Gaussian distribution and the game's payoff is represented by a utility function that is quadratic in the actions of all agents and an unknown parameter. This solution comes in the form of the Quadratic Network Game filter that agents can run locally, i.e., without access to all private signals, to compute their equilibrium actions. For the more generic payoff case of Bayesian potential games, i.e., payoffs represented by a potential function that depends on population actions and an unknown state of the world, distributed versions of fictitious play that converge to Nash equilibrium with identical beliefs on the state are derived. This algorithm highlights the fact that in order to determine optimal actions there are two problems that have to be solved: (i) construction of a belief on the state of the world and the actions of other agents; (ii) determination of optimal responses to the acquired beliefs. In the case of symmetric and strictly supermodular games, i.e., games with coordination incentives, the thesis also derives qualitative properties of Bayesian network games played in the time limit. In particular, we ask whether agents that play and observe equilibrium actions are able to coordinate on an action and learn about others' behavior from only observing peers' actions. The analysis described here shows that agents eventually coordinate on a consensus action. The second part of this thesis considers the application of the algorithms developed in the first part to the analysis of energy markets. Consumer demand profiles and fluctuating renewable power generation are two main sources of uncertainty in matching demand and supply in an energy market. We propose a model of the electricity market that captures the uncertainties on both the operator and the user side.
The system operator (SO) implements a temporal linear pricing strategy that depends on real-time demand and renewable generation in the considered period, combining Real-Time Pricing with Time-of-Use Pricing. The announced pricing strategy sets up a noncooperative game of incomplete information among the users with heterogeneous but correlated consumption preferences. An explicit characterization of the optimal user behavior using the Bayesian Nash equilibrium solution concept is derived. This explicit characterization allows the SO to derive pricing policies that influence demand to serve practical objectives such as minimizing peak-to-average ratio or attaining a desired rate of return. Numerical experiments show that the pricing policies yield close-to-optimal welfare values while improving these practical objectives. We then analyze the sensitivity of the proposed pricing schemes to user behavior and information exchange models. Selfish, altruistic and welfare-maximizing user behavior models are considered. Furthermore, information exchange models in which users only have private information, communicate or receive broadcasted information are considered. For each pair of behavior and information exchange models, a rational price-anticipating consumption strategy is characterized. In all of the information exchange models, equilibrium actions can be computed using the Quadratic Network Game filter. Further experiments reveal that the communication model is beneficial for the expected aggregate payoff while it does not affect the expected net revenue of the system operator. Moreover, additional information to the users reduces the variance of total consumption among runs, increasing the accuracy of demand predictions.
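
One of the operator objectives mentioned above, the peak-to-average ratio of demand, is simple to compute once total consumption per period is known. The sketch below shows the calculation under the assumption that demand is given as a list of per-period totals; the function name and the two sample profiles are hypothetical and are not results from the thesis.

```python
def peak_to_average_ratio(demand):
    """Peak demand divided by mean demand over the horizon."""
    if not demand or sum(demand) <= 0:
        raise ValueError("demand must be non-empty with a positive total")
    average = sum(demand) / len(demand)
    return max(demand) / average


# Two hypothetical profiles with the same total energy over four periods.
peaky = [2.0, 4.0, 6.0, 4.0]
flat = [4.0, 4.0, 4.0, 4.0]
print(peak_to_average_ratio(peaky))  # 1.5
print(peak_to_average_ratio(flat))   # 1.0
```

A ratio of 1.0 is the minimum possible, so a pricing policy that shifts consumption away from the peak period moves the ratio toward that value.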