
    Utility Design for Distributed Resource Allocation -- Part I: Characterizing and Optimizing the Exact Price of Anarchy

    Game theory has emerged as a fruitful paradigm for the design of networked multiagent systems. A fundamental component of this approach is the design of agents' utility functions so that their self-interested maximization results in a desirable collective behavior. In this work we focus on a well-studied class of distributed resource allocation problems where each agent is requested to select a subset of resources with the goal of optimizing a given system-level objective. Our core contribution is the development of a novel framework to tightly characterize the worst case performance of any resulting Nash equilibrium (price of anarchy) as a function of the chosen agents' utility functions. Leveraging this result, we identify how to design such utilities so as to optimize the price of anarchy through a tractable linear program. This provides us with a priori performance certificates applicable to any existing learning algorithm capable of driving the system to an equilibrium. Part II of this work specializes these results to submodular and supermodular objectives, discusses the complexity of computing Nash equilibria, and provides multiple illustrations of the theoretical findings. Comment: 15 pages, 5 figures
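    The price-of-anarchy concept can be illustrated on a toy instance (this is a made-up example with an equal-share utility design, not the paper's framework or its linear program): two agents each pick one of two resources, the system objective is the total value of covered resources, and we compare the worst pure Nash equilibrium against the optimum by brute force.

```python
from itertools import product

# Toy resource-allocation game (invented for illustration):
# two agents each select one resource; a resource's value is split
# equally among the agents using it (an equal-share utility design).
resources = [0, 1]
value = [10.0, 4.0]  # system value of each resource if covered at least once

def welfare(profile):
    # System objective: total value of resources covered by >= 1 agent.
    return sum(value[r] for r in set(profile))

def utility(i, profile):
    # Equal-share utility: value of my resource divided by its users.
    r = profile[i]
    return value[r] / profile.count(r)

def is_nash(profile):
    # Pure Nash: no agent gains by unilaterally switching resources.
    for i in range(len(profile)):
        for dev in resources:
            alt = profile[:i] + (dev,) + profile[i + 1:]
            if utility(i, alt) > utility(i, profile) + 1e-9:
                return False
    return True

profiles = list(product(resources, repeat=2))
opt = max(welfare(p) for p in profiles)
nash = [p for p in profiles if is_nash(p)]
poa = min(welfare(p) for p in nash) / opt  # price of anarchy of this instance
print(nash, poa)  # both agents crowd the high-value resource: PoA = 10/14
```

    Here the only equilibrium has both agents on the high-value resource (welfare 10) while the optimum covers both resources (welfare 14), so the price of anarchy is 5/7; utility designs of the kind studied in the paper aim to push this ratio toward 1.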

    Optimizing collective fieldtaxis of swarming agents through reinforcement learning

    Swarming of animal groups enthralls scientists in fields ranging from biology to physics to engineering. Complex swarming patterns often arise from simple interactions between individuals to the benefit of the collective whole. The existence and success of swarming, however, nontrivially depend on microscopic parameters governing the interactions. Here we show that a machine-learning technique can be employed to tune these underlying parameters and optimize the resulting performance. As a concrete example, we take an active matter model inspired by schools of golden shiners, which collectively conduct phototaxis. The problem of optimizing the phototaxis capability is then mapped to that of maximizing benefits in a continuum-armed bandit game. The latter problem admits a simple reinforcement-learning algorithm, which can tune the continuous parameters of the model. This result suggests the utility of machine-learning methodology in swarm-robotics applications. Comment: 6 pages, 3 figures
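    A continuum-armed bandit can be attacked, in its simplest form, by discretizing the continuous parameter and running epsilon-greedy over the grid. The sketch below is an illustrative stand-in, not the paper's algorithm; the reward function, grid, and noise level are invented for the example (a hypothetical swarm-performance score peaked at parameter value 0.6).

```python
import random

random.seed(0)
arms = [i / 10 for i in range(11)]  # discretized parameter in [0, 1]

def reward(x):
    # Hypothetical noisy performance measure, best at x = 0.6.
    return -(x - 0.6) ** 2 + random.gauss(0, 0.02)

counts = [0] * len(arms)
means = [0.0] * len(arms)
eps = 0.2  # exploration probability

for t in range(5000):
    if random.random() < eps:
        a = random.randrange(len(arms))                        # explore
    else:
        a = max(range(len(arms)), key=lambda i: means[i])      # exploit
    r = reward(arms[a])
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]  # incremental mean update

best = arms[max(range(len(arms)), key=lambda i: means[i])]
print("tuned parameter:", best)
```

    After enough pulls the empirical means concentrate and the estimated best arm lands at (or next to) the true optimum; finer grids or adaptive discretization refine the continuous estimate.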

    Optimizing evacuation flow in a two-channel exclusion process

    We use a basic setup of two coupled exclusion processes to model a stylised situation in evacuation dynamics, in which evacuees have to choose between two escape routes. The coupling between the two processes occurs through one common point at which particles are injected; the process can be controlled by directing incoming individuals into either of the two escape routes. Based on a mean-field approach we determine the phase behaviour of the model, and analytically compute optimal control strategies, maximising the total current through the system. Results are confirmed by numerical simulations. We also show that dynamic intervention, exploiting fluctuations about the mean-field stationary state, can lead to a further increase in total current. Comment: 16 pages, 6 figures
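    The flavour of such a mean-field optimization can be sketched with the textbook TASEP current (low-density, high-density, and maximal-current phases); the paper's coupled model is more involved, and the rates below are invented for illustration. We split an incoming rate between two channels with different exit rates and scan for the split maximising total current.

```python
def tasep_current(alpha, beta):
    # Textbook mean-field stationary current of an open TASEP with
    # entry rate alpha and exit rate beta: the current is limited by
    # the slower boundary, capped at the maximal-current value 1/4.
    r = min(alpha, beta, 0.5)
    return r * (1 - r)

# Hypothetical setup: total injection rate a_tot is routed f : (1 - f)
# into two channels with exit rates beta1 and beta2.
a_tot, beta1, beta2 = 0.6, 0.5, 0.3

best_f, best_J = 0.0, 0.0
for i in range(1001):            # scan the routing fraction f on a grid
    f = i / 1000
    J = (tasep_current(f * a_tot, beta1)
         + tasep_current((1 - f) * a_tot, beta2))
    if J > best_J:
        best_f, best_J = f, J

print("optimal split:", best_f, "total current:", best_J)
```

    With these rates the even split f = 0.5 is optimal (total current 0.42): both channels stay in the low-density phase, where current is a concave function of the injection rate, so balancing the two inflows maximises the sum.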

    An Adversarial Interpretation of Information-Theoretic Bounded Rationality

    Recently, there has been a growing interest in modeling planning with information constraints. Accordingly, an agent maximizes a regularized expected utility known as the free energy, where the regularizer is given by the information divergence from a prior to a posterior policy. While this approach can be justified in various ways, including from statistical mechanics and information theory, it is still unclear how it relates to decision-making against adversarial environments. This connection has previously been suggested in work relating the free energy to risk-sensitive control and to extensive form games. Here, we show that a single-agent free energy optimization is equivalent to a game between the agent and an imaginary adversary. The adversary can, by paying an exponential penalty, generate costs that diminish the decision maker's payoffs. It turns out that the optimal strategy of the adversary consists in choosing costs so as to render the decision maker indifferent among its choices, which is a defining property of a Nash equilibrium, thus tightening the connection between free energy optimization and game theory. Comment: 7 pages, 4 figures. Proceedings of AAAI-1
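    The free-energy objective mentioned here has a standard closed form (a textbook result, not the paper's adversarial construction): maximizing E_p[U] - (1/beta) KL(p || q) over policies p is solved by p*(x) proportional to q(x) exp(beta U(x)), with optimal value (1/beta) log sum_x q(x) exp(beta U(x)). The utilities and prior below are made up for a numerical check.

```python
import math
import random

random.seed(1)
beta = 2.0
U = [1.0, 0.0, -0.5]   # utilities of three choices (hypothetical)
q = [0.5, 0.3, 0.2]    # prior policy (hypothetical)

def free_energy(p):
    # Regularized expected utility: E_p[U] - (1/beta) * KL(p || q).
    expected_u = sum(pi * Ui for pi, Ui in zip(p, U))
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return expected_u - kl / beta

# Closed-form optimum: Boltzmann-weighted tilt of the prior.
Z = sum(qi * math.exp(beta * Ui) for qi, Ui in zip(q, U))
p_star = [qi * math.exp(beta * Ui) / Z for qi, Ui in zip(q, U)]

# The optimal value equals (1/beta) * log Z, and no random policy beats it.
assert abs(free_energy(p_star) - math.log(Z) / beta) < 1e-12
for _ in range(1000):
    w = [random.random() for _ in U]
    p = [wi / sum(w) for wi in w]
    assert free_energy(p) <= free_energy(p_star) + 1e-12

print("optimal free energy:", math.log(Z) / beta)
```

    As beta grows the regularizer's influence fades and p* concentrates on the highest-utility choice; as beta shrinks, p* stays close to the prior q. The paper's contribution is to reinterpret this same optimization as the equilibrium of a two-player game.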

    Automatic Curriculum Learning For Deep RL: A Short Survey

    Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. The ambition of this work is dual: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas. Comment: Accepted at IJCAI202
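    One recurring ACL heuristic (a generic sketch, not any specific method from the survey) is to sample tasks in proportion to absolute learning progress, i.e. how fast the agent's success rate on each task is changing, so that mastered and currently-unlearnable tasks are both deprioritized. The task names and success-rate histories below are invented for illustration.

```python
import random

random.seed(0)

# Hypothetical per-task success-rate histories (oldest -> newest).
history = {
    "easy":   [0.9, 0.9, 0.9],   # mastered: no progress left
    "medium": [0.3, 0.5, 0.7],   # improving fast: high progress
    "hard":   [0.0, 0.0, 0.05],  # barely learnable yet: small progress
}

def learning_progress(rates):
    # Crude absolute slope of the success rate over the window.
    return abs(rates[-1] - rates[0])

eps = 0.05  # exploration floor so no task is ever fully ignored
lp = {t: learning_progress(r) + eps for t, r in history.items()}
total = sum(lp.values())
probs = {t: v / total for t, v in lp.items()}

task = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs, "-> sampled:", task)
```

    The agent thus spends most of its time on the "medium" task, where practice currently pays off the most, which is the intuition behind challenging agents with tasks adapted to their capacities.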