9 research outputs found

    Doing the right thing for the right reason: Evaluating artificial moral cognition by probing cost insensitivity

    Is it possible to evaluate the moral cognition of complex artificial agents? In this work, we take a look at one aspect of morality: "doing the right thing for the right reasons." We propose a behavior-based analysis of artificial moral cognition which could also be applied to humans to facilitate like-for-like comparison. Morally-motivated behavior should persist despite mounting cost; by measuring an agent's sensitivity to this cost, we gain deeper insight into underlying motivations. We apply this evaluation to a particular set of deep reinforcement learning agents, trained by memory-based meta-reinforcement learning. Our results indicate that agents trained with a reward function that includes other-regarding preferences perform helping behavior in a way that is less sensitive to increasing cost than agents trained with more self-interested preferences. Comment: 11 pages, 3 figures

    Melting Pot 2.0

    Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population"), to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly-studied extreme cases of perfectly-competitive (zero-sum) motivations and perfectly-cooperative (shared-reward) motivations, but does not stop with them. As in real life, a clear majority of scenarios in Melting Pot have mixed incentives. They are neither purely competitive nor purely cooperative and thus demand successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles, and explain how to integrate them into the evaluation protocol. This report also contains: (1) details of all substrates and scenarios; (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0. Comment: 59 pages, 54 figures. arXiv admin note: text overlap with arXiv:2107.0685

    Dynamics of the system in the vicinity of (top), (middle) and (bottom).

    The horizontal axis corresponds to the value of . The vertical axis corresponds to the value of . Isoclines represent the proportion of runs converging to corruption (red) and righteousness (blue). All runs that do not converge to either corruption or righteousness end up in defection (white).

    Conditions for stability of the four corners of the simplex.

    If the condition is satisfied, then the direction pointed by the arrow behaves as a local attractor. is always stable, denoted by the filled circle, while is always unstable, denoted by the open circle. While many equilibria at the edges of the simplex may be stable in the reduced games, we reserve filled circles to indicate globally stable equilibria (i.e. equilibria that are stable in the full game with the four strategies).

    Stability of the three main equilibria on the system as a function of parameters and .

    The white area corresponds to the cases in which defection is the only globally stable equilibrium. Notice that there is an area where righteousness and corruption intersect; in this region, all three main equilibria are stable. Depicted are representative cases for each of the four areas. While the position of the main equilibria might change, and other (unstable) internal equilibria might exist on some edges for specific parameter combinations, the qualitative dynamics are captured by these depicted cases. For simplicity, internal equilibria in the faces of the simplex are not drawn. All internal equilibria in the faces are unstable (see Appendix).

    Supplemental Material - A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings

    Supplemental Material for A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings by Eugene Vinitsky, Raphael Köster, John P Agapiou, Edgar A Duéñez-Guzmán, Alexander S Vezhnevets and Joel Z Leibo in Collective Intelligence.

    Data from: Fitness trade-offs explain low levels of persister cells in the opportunistic pathogen Pseudomonas aeruginosa

    Microbial populations often contain a fraction of slow-growing persister cells that withstand antibiotics and other stress factors. Current theoretical models predict that persistence levels should reflect a stable state in which the survival advantage of persisters under adverse conditions is balanced with the direct growth cost incurred under favourable growth conditions, caused by the nonreplication of persister cells. Based on this direct growth cost alone, however, it remains challenging to explain the observed low levels of persistence (<<1%) seen in the populations of many species. Here, we present data from the opportunistic human pathogen Pseudomonas aeruginosa that can explain this discrepancy by revealing various previously unknown costs of persistence. In particular, we show that in the absence of antibiotic stress, increased persistence is traded off against a lengthened lag phase as well as a reduced survival ability during stationary phase. We argue that these pleiotropic costs contribute to the very low proportions of persister cells observed among natural P. aeruginosa isolates (3 × 10⁻⁸–3 × 10⁻⁴) and that they can explain why strains with higher proportions of persister cells lose out very quickly in competition assays under favourable growth conditions, despite a negligible difference in maximal growth rate. We discuss how incorporating these trade-offs could lead to models that can better explain the evolution of persistence in nature and facilitate the rational design of alternative therapeutic strategies for treating infectious diseases.