404 research outputs found
Efficiently Characterizing Games Consistent with Perturbed Equilibrium Observations
In this thesis, we study the problem of characterizing the set of games that are consistent with observed equilibrium play, a fundamental problem in econometrics. Our contribution is to develop and analyze a new methodology based on convex optimization to address this problem, for many classes of games and observation models of interest. Our approach provides a sharp, computationally efficient characterization of the extent to which a particular set of observations constrains the space of games that could have generated them. This allows us to solve a number of variants of this problem as well as to quantify the power of games from particular classes (e.g., zero-sum, potential, linearly parameterized) to explain player behavior.
We illustrate our approach with numerical simulations.
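The convex-optimization idea can be illustrated in miniature (all numbers below are invented for the sketch, not from the thesis): a fully mixed equilibrium makes each player indifferent among supported actions, which is a linear constraint on the unknown payoffs, so the consistent games form a polytope that a linear program can probe.

```python
import numpy as np
from scipy.optimize import linprog

# Invented observation: a fully mixed equilibrium of some unknown 2x2 game.
x = np.array([0.5, 0.5])    # row player's observed mixed strategy
y = np.array([0.25, 0.75])  # column player's observed mixed strategy

# Full support forces indifference: with the row payoff matrix A flattened
# as [a11, a12, a21, a22], (A y)_1 = (A y)_2 is one linear equation,
#   y1*a11 + y2*a12 - y1*a21 - y2*a22 = 0,
# so the consistent payoffs form a polytope once the scale is pinned down.
A_eq = np.array([[y[0], y[1], -y[0], -y[1]]])
b_eq = np.array([0.0])

# Probe the polytope with an LP: any bounded feasible point is a game
# consistent with the observation.
res = linprog(c=[1.0, 0.0, 0.0, 0.0], A_eq=A_eq, b_eq=b_eq,
              bounds=[(-1, 1)] * 4, method="highs")
A = res.x.reshape(2, 2)
```

The same template extends to inequality constraints for actions outside the support, which is what makes the characterization a convex feasibility problem.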
Inverse Game Theory for Stackelberg Games: the Blessing of Bounded Rationality
Optimizing strategic decisions (a.k.a. computing equilibrium) is key to the
success of many non-cooperative multi-agent applications. However, in many
real-world situations, we may face the exact opposite of this game-theoretic
problem -- instead of prescribing equilibrium of a given game, we may directly
observe the agents' equilibrium behaviors but want to infer the underlying
parameters of an unknown game. This research question, also known as inverse
game theory, has been studied in multiple recent works in the context of
Stackelberg games. Unfortunately, existing works report largely negative
results, establishing both statistical and computational hardness under the
assumption of a perfectly rational follower. Our work relaxes this
perfect-rationality assumption to the classic quantal response model, a more
realistic model of boundedly rational behavior. Interestingly, we show that
the smoothness introduced by this bounded-rationality model leads to provably
more efficient learning of the follower's utility parameters in general
Stackelberg games. Systematic empirical experiments on synthesized games
confirm our theoretical results and further suggest robustness beyond the
strict quantal response model.
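A small sketch of why smoothness helps (the utilities and rationality parameter below are invented, and this is the standard logit quantal response model, not the paper's algorithm): under logit responses, observed choice probabilities pin down utility differences in closed form, whereas a perfectly rational follower reveals only the argmax.

```python
import numpy as np

def quantal_response(u, lam):
    """Logit quantal response: play action a with probability ∝ exp(lam * u[a])."""
    z = np.exp(lam * (u - u.max()))  # shift by the max for numerical stability
    return z / z.sum()

# Hypothetical follower utilities (unknown to the learner) and rationality lam.
u_true = np.array([1.0, 0.3, -0.5])
lam = 2.0

p = quantal_response(u_true, lam)

# Smoothness pays off: utility *differences* are identified in closed form
# from the choice probabilities, u[a] - u[b] = log(p[a] / p[b]) / lam.
u_hat = np.log(p) / lam
u_hat -= u_hat[0] - u_true[0]  # utilities are identified only up to a shift
```

With finitely many observations one would replace `p` by empirical frequencies, and the same inversion yields a consistent estimator.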
Discovering How Agents Learn Using Few Data
Decentralized learning algorithms are an essential tool for designing
multi-agent systems, as they enable agents to autonomously learn from their
experience and past interactions. In this work, we propose a theoretical and
algorithmic framework for real-time identification of the learning dynamics
that govern agent behavior using a short burst of a single system trajectory.
Our method identifies agent dynamics through polynomial regression, where we
compensate for limited data by incorporating side-information constraints that
capture fundamental assumptions or expectations about agent behavior. These
constraints are enforced computationally using sum-of-squares optimization,
leading to a hierarchy of increasingly accurate approximations of the true
agent dynamics. Extensive experiments demonstrate that our approach, using
only 5 samples from a short run of a single trajectory, accurately recovers
the true dynamics across various benchmarks, including equilibrium selection
and prediction of chaotic systems up to 10 Lyapunov times. These findings
suggest that our approach has significant potential to support effective
policy and decision-making in strategic multi-agent systems.
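The paper enforces side information through sum-of-squares optimization; as a minimal stand-in for that machinery, the sketch below fits a degree-2 polynomial to 5 samples of a hypothetical logistic map and imposes one piece of side information, a known fixed point, as a linear equality solved through the KKT system.

```python
import numpy as np

# Hypothetical scalar dynamics (a logistic map) identified from 5 samples.
f_true = lambda x: 3.5 * x * (1 - x)
rng = np.random.default_rng(0)
xs = rng.uniform(0.1, 0.9, size=5)
ys = f_true(xs)

# Degree-2 polynomial features [1, x, x^2].
Phi = np.stack([np.ones_like(xs), xs, xs**2], axis=1)

# Side information: x = 0 is known to be a fixed point, f(0) = 0, i.e. the
# constant coefficient must vanish. Solve the equality-constrained least
# squares min ||Phi c - ys||^2 s.t. A c = b via its KKT system.
A = np.array([[1.0, 0.0, 0.0]])
b = np.array([0.0])
n = Phi.shape[1]
KKT = np.block([[Phi.T @ Phi, A.T], [A, np.zeros((1, 1))]])
rhs = np.concatenate([Phi.T @ ys, b])
c = np.linalg.solve(KKT, rhs)[:n]   # recovered coefficients [0, 3.5, -3.5]
```

Replacing the single equality by sum-of-squares certificates (e.g. monotonicity or invariance of a region) yields the hierarchy of approximations the abstract describes.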
Continuous Limits for Constrained Ensemble Kalman Filter
The Ensemble Kalman Filter method can be used as an iterative particle
numerical scheme for state-dynamics estimation and control-to-observable
identification problems. In applications it may be required that the solution
satisfy equality constraints on the control space. In this work we treat this
problem from a constrained-optimization point of view, deriving the
corresponding optimality conditions. Continuous limits, in time and in the
number of particles, allow us to study properties of the method. We illustrate
the performance of the method on test inverse problems from the literature.
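For orientation, the discrete iteration whose continuous limits are studied is the standard ensemble Kalman inversion update; the sketch below runs it on an invented linear control-to-observable map (the matrix, noise level, and ensemble size are illustrative, and no constraints are imposed).

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented linear control-to-observable map G(u) = H u.
H = np.array([[1.0, 0.5], [0.0, 1.0]])
u_true = np.array([2.0, -1.0])
gamma = 1e-3                  # observation noise variance
y = H @ u_true                # noiseless data keeps the sketch transparent

# Ensemble Kalman inversion: apply a Kalman-style update to an ensemble of
# candidate controls, with covariances estimated from the ensemble itself.
J = 50
U = rng.normal(0.0, 2.0, size=(J, 2))    # initial ensemble, one row per member
for _ in range(30):
    Gu = U @ H.T                          # forward map of each member
    du = U - U.mean(axis=0)
    dg = Gu - Gu.mean(axis=0)
    C_ug = du.T @ dg / J                  # ensemble cross-covariance
    C_gg = dg.T @ dg / J                  # ensemble forecast covariance
    K = C_ug @ np.linalg.inv(C_gg + gamma * np.eye(2))
    U = U + (y - Gu) @ K.T                # update every ensemble member
```

Sending the step size to zero gives the time-continuous limit, and the particle limit replaces the empirical covariances by their mean-field counterparts.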
Mapping the landscape of metabolic goals of a cell
Genome-scale flux balance models of metabolism provide testable predictions of all metabolic rates in an organism, by assuming that the cell is optimizing a metabolic goal known as the objective function. We introduce an efficient inverse flux balance analysis (invFBA) approach, based on linear programming duality, to characterize the space of possible objective functions compatible with measured fluxes. After testing our algorithm on simulated E. coli data and time-dependent S. oneidensis fluxes inferred from gene expression, we apply our inverse approach to flux measurements in long-term evolved E. coli strains, revealing objective functions that provide insight into metabolic adaptation trajectories.
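The inverse question can be posed concretely on a toy network (three reactions and one metabolite, all invented; the paper's actual method uses LP duality rather than the brute-force consistency check shown here): an objective vector is consistent with a measured flux exactly when the forward FBA problem under that objective reproduces the measurement.

```python
import numpy as np
from scipy.optimize import linprog

# Toy stoichiometry: reaction v1 produces the metabolite, v2 and v3 consume it.
S = np.array([[1.0, -1.0, -1.0]])
bounds = [(0, 10), (0, 10), (0, 10)]

def fba(c):
    """Forward FBA: maximize c·v subject to S v = 0 and flux bounds."""
    res = linprog(-np.asarray(c), A_eq=S, b_eq=np.zeros(1),
                  bounds=bounds, method="highs")
    return res.x

v_measured = fba([0, 1, 0])   # pretend this flux vector was measured

# invFBA idea in brute force: keep the candidate objectives whose forward
# optimum matches the measurement.
consistent = [c for c in ([0, 1, 0], [0, 0, 1], [1, 1, 0])
              if np.allclose(fba(c), v_measured)]
```

Here both `[0, 1, 0]` and `[1, 1, 0]` explain the data, illustrating that measured fluxes typically constrain the objective to a set rather than a point; LP duality characterizes that set without enumeration.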
Recovering the Sunk Costs of R&D: the Moulds Industry Case
Sunk costs for R&D are an important determinant of the level of innovation in the economy. In this paper I recover them using a Markov equilibrium framework. The contribution is twofold. First, a model of industry dynamics which accounts for selection into R&D, capital accumulation and entry/exit is proposed. The industry state is summarized by an aggregate state with the advantage that it avoids the "curse of dimensionality". Second, the estimated sunk costs of R&D for the Portuguese moulds industry are shown to be important (3.4 million Euros). They become particularly relevant since the industry is mostly populated by small firms. Institutional changes in the early 1990s generated an increase in demand from European car makers and created the incentives for firms to pay the costs of investment. Trade-induced innovation reinforced the selection effect by which international trade leads to productivity growth. Finally, using the estimated parameters, simulations evaluate the effects of changes in market size, sunk costs and entry costs.
Keywords: Aggregate state, industry dynamics, Markov equilibrium, moulds industry, R&D, structural estimation, sunk costs.
Scalable Online Learning of Approximate Stackelberg Solutions in Energy Trading Games with Demand Response Aggregators
In this work, a Stackelberg game theoretic framework is proposed for trading
energy bidirectionally between the demand-response (DR) aggregator and the
prosumers. This formulation allows for flexible energy arbitrage and additional
monetary rewards while ensuring that the prosumers' desired daily energy demand
is met. Then, a scalable (with the number of prosumers) approach is proposed to
find approximate equilibria based on online sampling and learning of the
prosumers' cumulative best response. Moreover, bounds are provided on the
quality of the approximate equilibrium solution. Last, real-world data from the
California day-ahead energy market and the University of California at Davis
building energy demands are utilized to demonstrate the efficacy of the
proposed framework and the online scalable solution. (Submitted to the IEEE
for possible publication.)
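The leader-follower structure can be sketched with an invented prosumer model (the quadratic utility, price grid, and closed-form best response below are illustrative assumptions, not the paper's formulation): the aggregator samples prices, observes the prosumers' cumulative best response, and keeps the revenue-maximizing price as an approximate Stackelberg strategy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical prosumers: prosumer i buying x at price p gets utility
# u_i * x - x**2 / 2 - p * x, so its best response is x_i(p) = max(u_i - p, 0).
u = rng.uniform(1.0, 3.0, size=50)

def cumulative_best_response(p):
    """Aggregate demand of all prosumers at leader price p."""
    return np.maximum(u - p, 0.0).sum()

# Leader side: sample candidate prices online and keep the one maximizing
# revenue against the observed cumulative best response.
prices = np.linspace(0.1, 3.0, 200)
revenues = np.array([p * cumulative_best_response(p) for p in prices])
p_best = prices[np.argmax(revenues)]
```

Because the leader only queries the *cumulative* response, the per-iteration work is independent of how that aggregate is produced, which is the source of the scalability in the number of prosumers.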
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
Gradient-based methods have been widely used for system design and
optimization in diverse application domains. Recently, there has been a renewed
interest in studying theoretical properties of these methods in the context of
control and reinforcement learning. This article surveys some of the recent
developments on policy optimization, a gradient-based iterative approach for
feedback control synthesis, popularized by successes of reinforcement learning.
We take an interdisciplinary perspective in our exposition that connects
control theory, reinforcement learning, and large-scale optimization. We review
a number of recently-developed theoretical results on the optimization
landscape, global convergence, and sample complexity of gradient-based methods
for various continuous control problems such as the linear quadratic regulator
(LQR), H-infinity control, risk-sensitive control, linear quadratic
Gaussian (LQG) control, and output feedback synthesis. In conjunction with
these optimization results, we also discuss how direct policy optimization
handles stability and robustness concerns in learning-based control, two main
desiderata in control engineering. We conclude the survey by pointing out
several challenges and opportunities at the intersection of learning and
control. (To appear in Annual Review of Control, Robotics, and Autonomous
Systems.)
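The simplest instance of the surveyed approach is gradient descent directly on the LQR cost as a function of the feedback gain; the scalar system, weights, and step size below are invented for illustration, with the gain from the discrete-time Riccati equation as the reference solution.

```python
import numpy as np

# Invented scalar discrete-time LQR instance: x' = a*x + b*u,
# stage cost q*x**2 + r*u**2, linear state feedback u = -k*x.
a, b, q, r = 0.9, 0.5, 1.0, 0.1

def cost(k, x0=1.0, T=200):
    """Finite-horizon rollout cost of the policy u = -k*x (surrogate for J(k))."""
    x, J = x0, 0.0
    for _ in range(T):
        u = -k * x
        J += q * x**2 + r * u**2
        x = a * x + b * u
    return J

# Policy optimization: gradient descent on J(k), with the gradient
# estimated by central finite differences.
k, lr, eps = 0.0, 0.05, 1e-5
for _ in range(1000):
    g = (cost(k + eps) - cost(k - eps)) / (2 * eps)
    k -= lr * g

# Reference gain from iterating the scalar discrete-time Riccati equation.
p = q
for _ in range(1000):
    p = q + a**2 * p - (a * b * p) ** 2 / (r + b**2 * p)
k_star = a * b * p / (r + b**2 * p)
```

Despite J(k) being nonconvex in general, gradient descent recovers the Riccati gain here, a small instance of the global-convergence phenomenon the survey reviews.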
Dynamics in near-potential games
We consider discrete-time learning dynamics in finite strategic form games, and show that games that are close to a potential game inherit many of the dynamical properties of potential games. We first study the evolution of the sequence of pure strategy profiles under better/best response dynamics. We show that this sequence converges to a (pure) approximate equilibrium set whose size is a function of the "distance" to a given nearby potential game. We then focus on logit response dynamics, and provide a characterization of the limiting outcome in terms of the distance of the game to a given potential game and the corresponding potential function. Finally, we turn our attention to fictitious play, and establish that in near-potential games the sequence of empirical frequencies of player actions converges to a neighborhood of (mixed) equilibria, where the size of the neighborhood grows with the distance to the set of potential games.
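The baseline fact that the paper perturbs can be seen in a few lines (the game below is an invented two-player identical-interest game, the simplest exact potential game, where the potential is the common payoff itself): best-response dynamics strictly increase the potential at every step, so they must terminate at a pure Nash equilibrium.

```python
import numpy as np

rng = np.random.default_rng(3)

# Identical-interest game: both players receive Phi[a, b], so Phi is an
# exact potential function.
Phi = rng.normal(size=(4, 4))

def best_response_dynamics(Phi, a, b, max_steps=100):
    """Alternate best responses; each changing step strictly increases Phi."""
    for _ in range(max_steps):
        a2 = int(np.argmax(Phi[:, b]))   # row player's best response to b
        b2 = int(np.argmax(Phi[a2, :]))  # column player's best response to a2
        if (a2, b2) == (a, b):
            return a, b                  # pure Nash equilibrium reached
        a, b = a2, b2
    return a, b

a, b = best_response_dynamics(Phi, 0, 0)
```

In a game merely *close* to this one, the same argument only forces the dynamics into a neighborhood of equilibria whose size scales with the distance to the potential game, which is the paper's theme.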