3,389 research outputs found

    A Study of AI Population Dynamics with Million-agent Reinforcement Learning

    We conduct an empirical study of the ordered collective dynamics exhibited by a population of intelligent agents, driven by million-agent reinforcement learning. Our intention is to place intelligent agents in a simulated natural context and verify whether principles developed in the real world can also be used to understand an artificially created intelligent population. To achieve this, we simulate a large-scale predator-prey world whose laws are designed using only findings, or their logical equivalents, that have been discovered in nature. We endow the agents with intelligence based on deep reinforcement learning (DRL). To scale the population up to millions of agents, we propose a large-scale DRL training platform with a redesigned experience buffer. Our results show that the population dynamics of the AI agents, driven only by each agent's individual self-interest, reveal an ordered pattern similar to the Lotka-Volterra model studied in population biology. We further discover emergent collective adaptations by studying how the agents' grouping behaviors change with the environmental resources. Both findings can be explained by the theory of self-organization in nature.
    Comment: Full version of the paper presented at AAMAS 2018 (International Conference on Autonomous Agents and Multiagent Systems).
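
    For context, the Lotka-Volterra predator-prey model that the observed population cycles are compared against is the standard textbook system (the symbols and rate constants below are the conventional ones, not taken from the paper):

        % Classical Lotka-Volterra equations: x = prey density,
        % y = predator density; alpha, beta, delta, gamma are
        % positive rate constants.
        \begin{aligned}
          \frac{dx}{dt} &= \alpha x - \beta x y,\\
          \frac{dy}{dt} &= \delta x y - \gamma y.
        \end{aligned}

    Its closed orbits around the coexistence fixed point $(x^*, y^*) = (\gamma/\delta, \alpha/\beta)$ are the kind of ordered, oscillating predator and prey counts that the study reports emerging from purely self-interested learning.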

    Asimovian Adaptive Agents

    The goal of this research is to develop agents that are adaptive, predictable, and timely. At first blush, these three requirements seem contradictory. For example, adaptation risks introducing undesirable side effects, thereby making agents' behavior less predictable. Furthermore, although formal verification can assist in ensuring behavioral predictability, it is known to be time-consuming. Our solution to the challenge of satisfying all three requirements is the following. Agents have finite-state automaton plans, which are adapted online via evolutionary learning (perturbation) operators. To ensure that critical behavioral constraints are always satisfied, agents' plans are first formally verified. They are then reverified after every adaptation. If reverification concludes that constraints are violated, the plans are repaired. The main objective of this paper is to improve the efficiency of reverification after learning, so that agents have a sufficiently rapid response time. We present two solutions: positive results showing that certain learning operators are a priori guaranteed to preserve useful classes of behavioral assurance constraints (so no reverification is needed for these operators), and efficient incremental reverification algorithms for those learning operators that have negative a priori results.
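
    A minimal sketch of the verify-adapt-reverify loop described above, in Python. The plan encoding, constraint set, and repair strategy here are illustrative assumptions, not the paper's algorithm; full verification is exhaustive only because the toy automaton is tiny.

        import random

        # Illustrative sketch (assumed encoding, not the paper's): a plan is a
        # finite-state automaton given as state -> action, and a behavioral
        # assurance constraint forbids certain (state, action) pairs.
        FORBIDDEN = {("near_human", "fire_laser")}   # hypothetical constraint
        ACTIONS = ["patrol", "recharge", "fire_laser"]

        def verify(plan):
            """Full verification: no state may map to a forbidden action."""
            return all((s, a) not in FORBIDDEN for s, a in plan.items())

        def mutate(plan):
            """Evolutionary perturbation operator: rewrite one state's action."""
            child = dict(plan)
            state = random.choice(list(child))
            child[state] = random.choice(ACTIONS)
            return child

        def repair(plan):
            """Swap each forbidden action for a safe alternative."""
            return {s: a if (s, a) not in FORBIDDEN
                    else next(b for b in ACTIONS if (s, b) not in FORBIDDEN)
                    for s, a in plan.items()}

        plan = {"idle": "patrol", "near_human": "patrol", "low_power": "recharge"}
        assert verify(plan)                 # verify once before deployment

        for _ in range(1000):               # online adaptation loop
            candidate = mutate(plan)
            if not verify(candidate):       # reverify after every adaptation
                candidate = repair(candidate)
            plan = candidate

        assert verify(plan)

    In the paper's terms, proving that a perturbation operator can never introduce a forbidden pair would let the loop skip the reverify call for that operator; where no such a priori proof exists, incremental reverification would recheck only the part of the plan that changed.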

    Particle velocity controls phase transitions in contagion dynamics

    Interactions often require proximity between particles. The movement of particles thus changes which neighbors lie within their proximity, producing a sequence of interactions. In pathogenic contagion, infections occur through proximal interactions, but at the same time movement facilitates the co-location of different strains. We analyze how the particle velocity affects the phase transitions of the contagion process for both a single infection and two cooperative infections. First, we identify an optimal velocity (close to half of the interaction range normalized by the recovery time) associated with the largest epidemic threshold, such that decreasing the velocity below this optimal value leads to larger outbreaks. Second, in the cooperative case, the system displays a continuous transition for low velocities, which becomes discontinuous for velocities of the order of three times the optimal velocity. Finally, we describe these characteristic regimes and explain the mechanisms driving the dynamics.
    Comment: 9 pages, 5 figures, 12 supplementary figures.
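
    A minimal simulation sketch of the single-infection setup described: particles move at a fixed speed V in a periodic box and infect susceptible neighbors within an interaction range R, recovering after TAU steps. All parameter values and the random-walk update rule are assumptions for illustration; the paper's model details may differ.

        import numpy as np

        rng = np.random.default_rng(0)
        N, L, V, R, BETA, TAU, STEPS = 500, 50.0, 0.5, 1.0, 0.2, 20, 500

        pos = rng.uniform(0, L, size=(N, 2))
        state = np.zeros(N, dtype=int)               # 0=S, 1=I, 2=R
        clock = np.zeros(N, dtype=int)
        state[rng.choice(N, 5, replace=False)] = 1   # seed infections

        for _ in range(STEPS):
            # move: constant speed V, fresh random direction each step
            theta = rng.uniform(0, 2 * np.pi, size=N)
            pos = (pos + V * np.column_stack((np.cos(theta), np.sin(theta)))) % L

            # infect: susceptibles within range R of any infected particle
            infected = np.flatnonzero(state == 1)
            susceptible = np.flatnonzero(state == 0)
            if infected.size and susceptible.size:
                d = pos[susceptible, None, :] - pos[None, infected, :]
                d -= L * np.round(d / L)             # periodic boundaries
                near = (np.linalg.norm(d, axis=2) < R).any(axis=1)
                hit = near & (rng.random(susceptible.size) < BETA)
                state[susceptible[hit]] = 1

            # recover after TAU steps spent infected
            clock[state == 1] += 1
            state[(state == 1) & (clock >= TAU)] = 2

        print("final outbreak size:", np.count_nonzero(state == 2))

    Scanning V while measuring the final outbreak size (or the epidemic threshold in BETA) is how one would probe the reported non-monotonic dependence; per the abstract, the threshold peaks near half the interaction range normalized by the recovery time, i.e. roughly V* ≈ R / (2 TAU) in these assumed units.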