
    Cooperation and Reputation Dynamics with Reinforcement Learning

    Creating incentives for cooperation is a challenge in natural and artificial systems. One potential answer is reputation, whereby agents trade the immediate cost of cooperation for the future benefits of having a good reputation. Game-theoretical models have shown that specific social norms can make cooperation stable, but how agents can independently learn to establish effective reputation mechanisms on their own is less understood. We use a simple model of reinforcement learning to show that reputation mechanisms generate two coordination problems: agents need to learn how to coordinate on the meaning of existing reputations and collectively agree on a social norm to assign reputations to others based on their behavior. These coordination problems exhibit multiple equilibria, some of which effectively establish cooperation. When we train agents with a standard Q-learning algorithm in an environment with reputation mechanisms, convergence to undesirable equilibria is widespread. We propose two mechanisms to alleviate this: (i) seeding a proportion of the system with fixed agents that steer others towards good equilibria; and (ii) intrinsic rewards based on the idea of introspection, i.e., augmenting agents' rewards by an amount proportionate to the performance of their own strategy against themselves. A combination of these simple mechanisms is successful in stabilizing cooperation, even in a fully decentralized version of the problem where agents learn to use and assign reputations simultaneously. We show how our results relate to the literature in Evolutionary Game Theory, and discuss implications for artificial, human and hybrid systems, where reputations can be used as a way to establish trust and cooperation.
    Comment: Published in AAMAS'21, 9 pages
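
    The following sketch (not the authors' code) illustrates the introspection idea described in the abstract: tabular Q-learners in a donation game with binary reputations, where each agent's reward is augmented by the payoff its own greedy strategy would earn against a copy of itself. The payoff values, hyper-parameters, and the image-scoring norm used to assign reputations are illustrative assumptions, not taken from the paper.

```python
import random

BENEFIT, COST = 2.0, 1.0              # donation-game payoffs (assumed values)
ALPHA, EPSILON, LAM = 0.1, 0.1, 0.5   # learning rate, exploration rate, introspection weight (assumed)
ACTIONS = ("C", "D")                  # cooperate / defect
REPS = ("good", "bad")                # binary reputations

def payoff(my_act, partner_act):
    """Donation game: cooperating costs COST; a cooperating partner yields BENEFIT."""
    return (-COST if my_act == "C" else 0.0) + (BENEFIT if partner_act == "C" else 0.0)

class Agent:
    def __init__(self):
        # State = the partner's reputation; one Q-value per (reputation, action) pair.
        self.q = {(r, a): 0.0 for r in REPS for a in ACTIONS}
        self.rep = "good"

    def greedy(self, partner_rep):
        return max(ACTIONS, key=lambda a: self.q[(partner_rep, a)])

    def act(self, partner_rep):
        return random.choice(ACTIONS) if random.random() < EPSILON else self.greedy(partner_rep)

    def update(self, partner_rep, act, reward):
        # One-step Q-update (the paper uses standard Q-learning).
        self.q[(partner_rep, act)] += ALPHA * (reward - self.q[(partner_rep, act)])

def introspection(agent):
    """Intrinsic reward: the payoff of the agent's own greedy strategy played against itself."""
    a = agent.greedy(agent.rep)
    return payoff(a, a)

def step(donor, recipient):
    act = donor.act(recipient.rep)
    extrinsic = -COST if act == "C" else 0.0     # the donor only pays the cost; the benefit goes to the recipient
    donor.update(recipient.rep, act, extrinsic + LAM * introspection(donor))
    donor.rep = "good" if act == "C" else "bad"  # assumed image-scoring norm for assigning reputations

# Usage: a small population learning from random pairwise interactions.
population = [Agent() for _ in range(20)]
for _ in range(5000):
    donor, recipient = random.sample(population, 2)
    step(donor, recipient)
print(sum(a.greedy("good") == "C" for a in population), "of", len(population),
      "agents cooperate with 'good' partners")
```

    Under these assumptions, the introspection bonus rewards strategies that do well against themselves (mutual cooperation pays BENEFIT - COST > 0), counteracting the one-shot extrinsic incentive to defect.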

    Walk the Talk! Exploring (Mis)Alignment of Words and Deeds by Robotic Teammates in a Public Goods Game

    This paper explores how robotic teammates can enhance and promote cooperation in collaborative settings. It presents a user study in which participants engaged with two fully autonomous robotic partners to play a game together, named "For The Record", a variation of a public goods game. The game is played for a total of five rounds, and in each of them players face a social dilemma: to cooperate, i.e., contribute towards the team's goal while compromising individual benefits, or to defect, i.e., favour individual benefits over the team's goal. Each participant collaborates with two robotic partners that adopt opposite strategies: one is an unconditional cooperator (the pro-social robot) and the other is an unconditional defector (the selfish robot). In a between-subjects design, we manipulated which of the two robots criticizes behaviours, i.e., condemns participants when they opt to defect; depending on the robot's own strategy, this criticism represents either an alignment or a misalignment of its words and deeds. Two main findings should be highlighted: (1) a misalignment of words and deeds may increase the discomfort perceived in a robotic partner; (2) the perception a human has of a robotic partner that criticizes them is not damaged as long as the robot displays an alignment of words and deeds.
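
    To make the underlying dilemma concrete, the sketch below shows a generic public goods payoff; the exact rules and amounts of "For The Record" are not given in the abstract, so the endowment, multiplier, and group size are illustrative assumptions.

```python
ENDOWMENT, MULTIPLIER = 10.0, 1.5   # assumed values; 1 < MULTIPLIER < group size keeps the dilemma

def payoffs(contributions):
    """Each player keeps what they did not contribute, plus an equal share of the multiplied public pot."""
    share = MULTIPLIER * sum(contributions) / len(contributions)
    return [ENDOWMENT - c + share for c in contributions]

# Three players (one human, two robots): the pro-social robot always contributes fully,
# the selfish robot contributes nothing, and the human chooses c.
for c in (0.0, ENDOWMENT):
    print(f"human contributes {c}: payoffs =", payoffs([c, ENDOWMENT, 0.0]))
```

    Under these assumed numbers, the human earns more by defecting (15.0 vs 10.0) even though the team's total payoff is highest when everyone contributes, which is exactly the tension the robots' words and deeds address.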