
    Homotopy Methods to Compute Equilibria in Game Theory

    This paper presents a complete survey of the use of homotopy methods in game theory. Homotopies allow for a robust computation of game-theoretic equilibria and their refinements, and are also suitable for computing equilibria that are selected by various selection theories. We present all relevant techniques underlying homotopy algorithms. We give detailed expositions of the Lemke-Howson algorithm and the Van den Elzen-Talman algorithm to compute Nash equilibria in 2-person games, and the Herings-Van den Elzen, Herings-Peeters, and McKelvey-Palfrey algorithms to compute Nash equilibria in general n-person games.
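The McKelvey-Palfrey idea mentioned above can be illustrated at toy scale: trace the logit quantal-response path of a bimatrix game, increasing the rationality parameter λ and warm-starting each fixed-point solve from the previous one; as λ grows the path approaches a Nash equilibrium. The sketch below (function names `softmax` and `logit_qre_path` are my own; this is a minimal illustration, not the survey's algorithm) follows that recipe.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())          # max-shift for numerical stability
    return e / e.sum()

def logit_qre_path(A, B, lams, inner_iters=500, tol=1e-12):
    """Trace a logit quantal-response path: at each lambda, approximate the
    logit QRE by fixed-point iteration, warm-starting from the previous
    lambda's solution. A, B are the row and column players' payoff matrices."""
    m, n = A.shape
    x, y = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    for lam in lams:
        for _ in range(inner_iters):
            x_new = softmax(lam * (A @ y))     # row player's logit response
            y_new = softmax(lam * (B.T @ x))   # column player's logit response
            done = max(np.abs(x_new - x).max(), np.abs(y_new - y).max()) < tol
            x, y = x_new, y_new
            if done:
                break
    return x, y

# Prisoner's dilemma: defection (index 1) strictly dominates, so the path
# should end near the pure equilibrium (defect, defect).
A = np.array([[3.0, 0.0], [5.0, 1.0]])
x, y = logit_qre_path(A, A.T, lams=np.linspace(0.1, 30.0, 60))
```

For large λ the logit responses concentrate on best replies, so `x` and `y` place almost all mass on the dominant action.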

    Finite-Step Algorithms for Single-Controller and Perfect Information Stochastic Games

    After a brief survey of iterative algorithms for general stochastic games, we concentrate on finite-step algorithms for two special classes of stochastic games: single-controller stochastic games and perfect information stochastic games. In single-controller games, the transition probabilities depend on the actions of the same player in all states. In perfect information stochastic games, one of the players has exactly one action in each state. Single-controller zero-sum games are efficiently solved by linear programming. Non-zero-sum single-controller stochastic games are reducible to linear complementarity problems (LCPs). In the discounted case they can be modified to fit into the so-called LCPs of Eaves' class L. In the undiscounted case the LCPs are reducible to Lemke's copositive-plus class. In either case Lemke's algorithm can be used to find a Nash equilibrium. For discounted zero-sum perfect information stochastic games, a policy improvement algorithm is presented. Many other classes of stochastic games with the orderfield property still await efficient finite-step algorithms.
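The single-state building block behind the linear-programming solution mentioned above is solving one zero-sum matrix game by LP: maximize the guaranteed value v over row strategies x in the simplex, subject to x giving at least v against every column. The sketch below (my own formulation with SciPy's `linprog`; the full single-controller stochastic-game LP couples many such states and is not shown) solves that one-shot LP.

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Value and an optimal row strategy of the zero-sum matrix game A,
    via the LP: max v  s.t.  sum_i x_i A[i, j] >= v for every column j,
    x in the probability simplex."""
    m, n = A.shape
    c = np.r_[np.zeros(m), -1.0]          # variables (x_1..x_m, v); minimize -v
    A_ub = np.c_[-A.T, np.ones(n)]        # row j:  v - sum_i x_i A[i, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)   # sum_i x_i = 1
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]      # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[m]

# Matching pennies: value 0, unique optimal strategy (1/2, 1/2).
x, v = solve_zero_sum(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```

The same LP, solved per state with the continuation values substituted in, is what makes the single-controller zero-sum case tractable.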

    Orderfield property of mixtures of stochastic games

    We consider certain mixtures, Γ, of classes of stochastic games and provide sufficient conditions for these mixtures to possess the orderfield property. For 2-player zero-sum and non-zero-sum stochastic games, we prove that if we mix a set of states S1, where the transitions are controlled by one player, with a set of states S2 constituting a sub-game having the orderfield property (where S1∩S2=∅), the resulting mixture Γ with states S=S1∪S2 has the orderfield property if there are no transitions from S2 to S1. This is true for discounted as well as undiscounted games. This condition on the transitions is sufficient when S1 is perfect information, SC (Switching Control), or ARAT (Additive Reward Additive Transition). In the zero-sum case, S1 can be a mixture of SC and ARAT as well. On the other hand, when S1 is SER-SIT (Separable Reward - State Independent Transition), we provide a counterexample to show that this condition is not sufficient for the mixture Γ to possess the orderfield property. In addition to the condition that there are no transitions from S2 to S1, if the sum of all transition probabilities from S1 to S2 is independent of the actions of the players, then Γ has the orderfield property even when S1 is SER-SIT. When S1 and S2 are both SER-SIT, their mixture Γ has the orderfield property even if we allow transitions from S2 to S1. We also extend these results to some multi-player games, namely mixtures with one-player-control polystochastic games. In all the above cases, we can inductively mix many such games and continue to retain the orderfield property.

    Orderfield property and algorithms for stochastic games via dependency graphs

    This article does not have an abstract

    Surveys in game theory and related topics


    Stochastic Optimal Control of Grid-Level Storage

    The primary focus of this dissertation is the design, analysis, and implementation of stochastic optimal control of grid-level storage. It provides stochastic, quantitative models to aid decision-makers with rigorous, analytical tools that capture the high uncertainty of storage control problems. The first part of the dissertation presents a p-periodic Markov Decision Process (MDP) model, which is suitable for mitigating end-of-horizon effects. This is an extension of the basic MDP in which the process follows the same pattern every p time periods. We establish improved near-optimality bounds for a class of greedy policies and derive a corresponding value-iteration algorithm suitable for periodic problems. A parallel implementation of the algorithm is provided for a grid-level storage control problem that involves stochastic electricity prices following a daily cycle. Additional analysis shows that the optimal policy is a threshold policy. The second part of the dissertation is concerned with grid-level battery storage operations, taking the battery aging phenomenon (battery degradation) into consideration. We again model the storage control problem as an MDP, with an extra state variable indicating the aging status of the battery. An algorithm that takes advantage of the problem structure and works directly on the continuous state space is developed to maximize the expected cumulative discounted reward over the life of the battery. The algorithm determines an optimal policy by solving a sequence of quasiconvex problems indexed by a battery-life state. Computational results are presented to compare the proposed approach to a standard dynamic programming method and to evaluate the impact of refinements in the battery model. Error bounds for the proposed algorithm are established to demonstrate its accuracy. A generalization of the price model to a class of Markovian regime-switching processes is also provided.
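The p-periodic value iteration described above can be sketched in a few lines: keep one value function per phase and back each phase up from the next one, modulo the period. The toy below (my own minimal sketch, not the dissertation's parallel algorithm or its price model) applies it to a two-phase buy-low/sell-high storage caricature.

```python
import numpy as np

def periodic_value_iteration(R, P, gamma=0.95, iters=1000, tol=1e-9):
    """Value iteration for a p-periodic MDP. R[t] has shape (S, A);
    P[t] has shape (S, A, S). Phase t backs up from phase (t+1) mod p."""
    p, S = len(R), R[0].shape[0]
    V = [np.zeros(S) for _ in range(p)]
    for _ in range(iters):
        delta = 0.0
        for t in reversed(range(p)):
            # Expected next value: (S, A, S) @ (S,) -> (S, A)
            Q = R[t] + gamma * (P[t] @ V[(t + 1) % p])
            new_V = Q.max(axis=1)
            delta = max(delta, np.abs(new_V - V[t]).max())
            V[t] = new_V
        if delta < tol:
            break
    return V

# Toy storage problem, period 2: buy cheap in phase 0, sell dear in phase 1.
# States: 0 = empty, 1 = full. Actions: 0 = hold, 1 = trade.
R = [np.array([[0.0, -1.0], [0.0, 1.0]]),    # phase 0: price 1
     np.array([[0.0, -3.0], [0.0, 3.0]])]    # phase 1: price 3
flip = np.array([[[1.0, 0.0], [0.0, 1.0]],   # empty: hold stays, trade fills
                 [[0.0, 1.0], [1.0, 0.0]]])  # full: hold stays, trade empties
P = [flip, flip]
V = periodic_value_iteration(R, P)
```

Buying at the low price and selling at the high one yields a positive discounted profit every cycle, so the value of starting empty in the cheap phase is positive.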
The last part of this dissertation is concerned with how the ownership of energy storage affects prices. Instead of the single player of most storage control problems, we consider two players (consumer and supplier) in this market. Energy storage operations are modeled as an infinite-horizon Markov game with random demand, with each player maximizing its expected discounted cumulative welfare. A value iteration framework with an embedded bimatrix game is provided to find equilibrium policies for the players. Computational results show that the gap between optimal policies and the obtained policies is negligible. The assumption that storage levels are common knowledge is made without much loss of generality, because a learning algorithm is proposed that allows a player to ultimately identify the storage level of the other player. The expected value improvement from keeping the storage information private at the beginning of the game is then shown to be insignificant.

    A Human Driver Model for Autonomous Lane Changing in Highways: Predictive Fuzzy Markov Game Driving Strategy

    This study presents an integrated hybrid solution to the mandatory lane changing problem, dealing with accident avoidance by choosing a safe gap in highway driving. To manage this, a comprehensive treatment of a lane change active safety design is proposed from the dynamics, control, and decision making aspects. My effort first goes to driver behaviors, relating human reasoning about threat in driving to the modeling of a decision making strategy. It consists of two main parts: threat assessment of the traffic vehicles' (TVs) states, and decision making. The first part utilizes a complementary threat assessment of the TVs, relative to the subject vehicle (SV), by evaluating the traffic quantities. Then I propose a decision strategy based on Markov decision processes (MDPs) that abstracts the traffic environment with a set of actions, transition probabilities, and corresponding utility rewards. Further, the interactions of the TVs are employed to set up a realistic traffic condition using a game-theoretic approach. The question addressed here is how an autonomous vehicle optimally interacts with the surrounding vehicles for gap selection so that more effective performance of the overall traffic flow can be achieved. Finding a safe gap is performed by maximizing an objective function among several candidates. A future prediction engine is thus embedded in the design, which simulates and seeks a solution such that the objective function is maximized at each time step over a horizon. The combined system therefore forms a predictive fuzzy Markov game (FMG), since it performs a predictive interactive driving strategy to avoid accidents in a given traffic environment. I show the effect of interactions in the decision making process by proposing both cooperative and non-cooperative Markov game strategies for enhanced traffic safety and mobility. This level is called the higher level controller.
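The "maximize an objective over a horizon at each time step" idea above is, at its core, a depth-limited lookahead over an MDP. The sketch below (a hypothetical toy abstraction with made-up state names; not the study's predictive fuzzy Markov-game controller, which also models the other vehicles' responses) shows that bare recursion.

```python
def horizon_value(P, R, s, depth, gamma=0.9):
    """Depth-limited lookahead over a small MDP: best expected utility from
    state s over `depth` steps. P[s][a] maps next states to probabilities;
    R[s][a] is the immediate reward; states with no actions are terminal."""
    if depth == 0 or not R[s]:
        return 0.0
    return max(
        R[s][a] + gamma * sum(p * horizon_value(P, R, sp, depth - 1, gamma)
                              for sp, p in P[s][a].items())
        for a in R[s])

# Hypothetical two-step chain: pick a gap, then merge into it.
R = {"approach": {"gap": 1.0}, "in_gap": {"merge": 2.0}, "done": {}}
P = {"approach": {"gap": {"in_gap": 1.0}},
     "in_gap": {"merge": {"done": 1.0}}}
best = horizon_value(P, R, "approach", depth=2, gamma=0.5)
```

With gamma = 0.5 the two-step value is 1.0 + 0.5 * 2.0 = 2.0; a receding-horizon controller would re-run this search from the new state at every time step.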
I further focus on generating a driver controller to complement the automated car's safe driving. To achieve this, a model predictive controller (MPC) is utilized. The success of the combined decision process and trajectory generation is evaluated with a set of different traffic scenarios in the dSPACE virtual driving environment. Next, I consider designing an active front steering (AFS) and direct yaw moment control (DYC) as the lower level controller that performs a lane change task with enhanced handling performance in the presence of varying front and rear cornering stiffnesses. I propose a new control scheme that integrates active front steering and direct yaw moment control to enhance vehicle handling and stability. I obtain the nonlinear tire forces with the Pacejka model, and convert the nonlinear tire stiffnesses to parameter space to design a linear parameter-varying (LPV) controller for combined AFS and DYC to perform a commanded lane change task. Further, the nonlinear vehicle lateral dynamics is modeled within the Takagi-Sugeno (T-S) framework. A state-feedback fuzzy H∞ controller is designed for both stability and reference tracking. A simulation study confirms that the performance of the proposed methods is quite satisfactory.

    Are Equivariant Equilibrium Approximators Beneficial?

    Recently, remarkable progress has been made in approximating Nash equilibrium (NE), correlated equilibrium (CE), and coarse correlated equilibrium (CCE) through function approximation: a neural network is trained to predict equilibria from game representations. Furthermore, equivariant architectures are widely adopted in designing such equilibrium approximators in normal-form games. In this paper, we theoretically characterize the benefits and limitations of equivariant equilibrium approximators. For the benefits, we show that they enjoy better generalizability than general ones and can achieve better approximations when the payoff distribution is permutation-invariant. For the limitations, we discuss their drawbacks in terms of equilibrium selection and social welfare. Together, our results help to understand the role of equivariance in equilibrium approximators. Comment: To appear in ICML 202
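The equivariance property the paper studies is easy to state concretely: relabeling a player's actions should relabel the predicted strategy in exactly the same way. The toy below (a deliberately trivial hand-built predictor of my own, far simpler than the neural approximators in the paper) checks that property numerically.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def toy_predictor(A):
    """A toy permutation-equivariant strategy predictor for the row player:
    score each action by its mean payoff, then softmax the scores. Both
    steps commute with permuting the rows, hence equivariance."""
    return softmax(A.mean(axis=1))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))          # row player's payoff matrix
perm = rng.permutation(4)            # relabel the row player's actions
out_permuted = toy_predictor(A[perm])   # predict on the relabeled game
permuted_out = toy_predictor(A)[perm]   # relabel the original prediction
```

Equivariance means these two vectors coincide; the paper's point is that architectures constrained this way generalize better but can struggle with equilibrium selection.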