21 research outputs found

    Multi-Agent Learning in Contextual Games under Unknown Constraints

    We consider the problem of learning to play a repeated contextual game with unknown reward and unknown constraint functions. Such games arise in applications where each agent's action needs to belong to a feasible set, but the feasible set is a priori unknown. For example, in constrained multi-agent reinforcement learning, the constraints on the agents' policies are a function of the unknown dynamics and hence are themselves unknown. Under kernel-based regularity assumptions on the unknown functions, we develop a no-regret, no-violation approach which exploits similarities among different reward and constraint outcomes. The no-violation property ensures that the time-averaged sum of constraint violations converges to zero as the game is repeated. We show that our algorithm, referred to as c.z.AdaNormalGP, obtains kernel-dependent regret bounds and that the cumulative constraint violations have sublinear kernel-dependent upper bounds. In addition, we introduce the notion of constrained contextual coarse correlated equilibria (c.z.CCE) and show that $\epsilon$-c.z.CCEs can be approached whenever players follow a no-regret, no-violation strategy. Finally, we experimentally demonstrate the effectiveness of c.z.AdaNormalGP on an instance of multi-agent reinforcement learning.

    Stacky Abelianization of an Algebraic Group

    Let G be a connected algebraic group and let [G,G] be its commutator subgroup. We prove a conjecture of Drinfeld about the existence of a connected etale group cover H of [G,G], characterized by the following properties: every central extension of G, by a finite etale group scheme, splits over H, and the commutator map of G lifts to H. We prove, moreover, that the quotient stack of G by the natural action of H is the universal Deligne-Mumford Picard stack to which G maps. Comment: 22 pages.

    Optimal feedback control for dynamic systems with state constraints: An exact penalty approach

    In this paper, we consider a class of nonlinear dynamic systems with terminal state and continuous inequality constraints. Our aim is to design an optimal feedback controller that minimizes total system cost and ensures satisfaction of all constraints. We first formulate this problem as a semi-infinite optimization problem. We then show that by using a new exact penalty approach, this semi-infinite optimization problem can be converted into a sequence of nonlinear programming problems, each of which can be solved using standard gradient-based optimization methods. We conclude the paper by discussing applications of our work to glider control.

    Optimal Control of Nonlinear Switched Systems: Computational Methods and Applications

    A switched system is a dynamic system that operates by switching between different subsystems or modes. Such systems exhibit both continuous and discrete characteristics, a dual nature that makes designing effective control policies a challenging task. The purpose of this paper is to review some of the latest computational techniques for generating optimal control laws for switched systems with nonlinear dynamics and continuous inequality constraints. We discuss computational strategies for optimizing both the times at which a switched system switches from one mode to another (the so-called switching times) and the sequence in which a switched system operates its various possible modes (the so-called switching sequence). These strategies involve novel combinations of the control parameterization method, the time-scaling transformation, and bilevel programming and binary relaxation techniques. We conclude the paper by discussing a number of switched system optimal control models arising in practical applications.

    On the image of the parabolic Hitchin map

    We determine the image of the (strongly) parabolic Hitchin map for all parabolics in classical groups and G₂. Surprisingly, we find that the image is isomorphic to an affine space in all cases, except for certain ‘bad parabolics’ in type D, where the image can be singular. David Baraglia and Masoud Kamgarpour

    Payoff-Based Approach to Learning Nash Equilibria in Convex Games

    We consider multi-agent decision making, where each agent optimizes its cost function subject to constraints. Agents' actions belong to a compact convex Euclidean space and the agents' cost functions are coupled. We propose a distributed payoff-based algorithm to learn Nash equilibria in the game between agents. Each agent uses only information about its current cost value to compute its next action. We prove convergence of the proposed algorithm to a Nash equilibrium in the game, leveraging established results on stochastic processes. The performance of the algorithm is analyzed with a numerical case study.

    On aggregative and mean field games with applications to electricity markets

    We study the existence and uniqueness of Nash equilibria for a certain class of aggregative games with a finite and possibly large number of players. Sufficient conditions for these are obtained using the theory of variational inequalities together with the specific structure of the objective functions. We further present an algorithm that converges to the Nash equilibrium in a decentralized fashion with provable guarantees. The theoretical results are applied to the problem of managing the charging of a large fleet of plug-in electric vehicles, and the results are compared with existing work.

    On the range of feasible power trajectories for a population of thermostatically controlled loads

    We study the potential of a population of thermostatically controlled loads to track desired power signals with provable guarantees. Based on connecting the temperature state of an individual device with its internal energy, we derive necessary conditions that a given power signal needs to satisfy in order for the aggregation of devices to track it using non-disruptive probabilistic switching control. Our derivation takes into account hybrid individual dynamics, an accurate continuous-time Markov chain model for the population dynamics, and bounds on switching rates of individual devices. We illustrate the approach with case studies.

    On the Equivalence of Youla, System-level and Input-output Parameterizations

    A convex parameterization of internally stabilizing controllers is fundamental for many controller synthesis procedures. The celebrated Youla parameterization relies on a doubly coprime factorization of the system, while the recent system-level and input-output characterizations require no doubly coprime factorization but a set of equality constraints for achievable closed-loop responses. In this paper, we present explicit affine mappings among Youla, system-level and input-output parameterizations. Two direct implications of the affine mappings are: 1) any convex problem in Youla, system-level, or input-output parameters can be equivalently and convexly formulated in any other one of these frameworks, including the convex system-level synthesis (SLS); 2) the condition of quadratic invariance (QI) is sufficient and necessary for the classical distributed control problem to admit an equivalent convex reformulation in terms of Youla, system-level, or input-output parameters.