
    Cooperation and Reputation Dynamics with Reinforcement Learning

    Creating incentives for cooperation is a challenge in natural and artificial systems. One potential answer is reputation, whereby agents trade the immediate cost of cooperation for the future benefits of having a good reputation. Game-theoretical models have shown that specific social norms can make cooperation stable, but how agents can independently learn to establish effective reputation mechanisms on their own is less understood. We use a simple model of reinforcement learning to show that reputation mechanisms generate two coordination problems: agents need to learn how to coordinate on the meaning of existing reputations and to collectively agree on a social norm for assigning reputations to others based on their behavior. These coordination problems exhibit multiple equilibria, some of which effectively establish cooperation. When we train agents with a standard Q-learning algorithm in an environment with reputation mechanisms, convergence to undesirable equilibria is widespread. We propose two mechanisms to alleviate this: (i) seeding a proportion of the system with fixed agents that steer others towards good equilibria; and (ii) intrinsic rewards based on the idea of introspection, i.e., augmenting agents' rewards by an amount proportionate to the performance of their own strategy against themselves. A combination of these simple mechanisms is successful in stabilizing cooperation, even in a fully decentralized version of the problem where agents learn to use and assign reputations simultaneously. We show how our results relate to the literature in Evolutionary Game Theory, and discuss implications for artificial, human, and hybrid systems, where reputations can be used as a way to establish trust and cooperation.

    Comment: Published in AAMAS'21, 9 pages.
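
    The introspection mechanism lends itself to a short sketch. Below is a minimal Python illustration, assuming a donation game with reputation-keyed states; the payoff values, the bonus weight LAMBDA, and the tabular Q-learner are assumptions for illustration, not the paper's implementation.

        import random
        from collections import defaultdict

        # Illustrative constants (assumed, not from the paper).
        B, C = 2.0, 1.0              # benefit received / cost paid when someone cooperates
        LAMBDA = 0.5                 # weight of the introspective bonus
        ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

        def payoff(my_act, their_act):
            """One agent's donation-game payoff."""
            return (B if their_act == "C" else 0.0) - (C if my_act == "C" else 0.0)

        class IntrospectiveQLearner:
            def __init__(self):
                self.q = defaultdict(float)      # (partner_reputation, action) -> value

            def act(self, rep):
                if random.random() < EPS:        # epsilon-greedy exploration
                    return random.choice("CD")
                return max("CD", key=lambda a: self.q[(rep, a)])

            def update(self, rep, act, env_reward):
                # Introspection: augment the reward with a term proportional to how
                # the chosen action would perform against a copy of the agent itself.
                r = env_reward + LAMBDA * payoff(act, act)
                best_next = max(self.q[(rep, a)] for a in "CD")
                self.q[(rep, act)] += ALPHA * (r + GAMMA * best_next - self.q[(rep, act)])

    The bonus rewards behaviour that would fare well in self-play, which is what nudges learners away from the all-defect equilibria the abstract describes.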

    An efficient and versatile approach to trust and reputation using hierarchical Bayesian modelling

    In many dynamic open systems, autonomous agents must interact with one another to achieve their goals. Such agents may be self-interested and, when trusted to perform an action, may betray that trust by not performing the action as required. Due to the scale and dynamism of these systems, agents will often need to interact with other agents with which they have little or no past experience. Each agent must therefore be capable of assessing and identifying reliable interaction partners, even if it has no personal experience with them. To this end, we present HABIT, a Hierarchical And Bayesian Inferred Trust model for assessing how much an agent should trust its peers based on direct and third-party information. This model is robust in environments in which third-party information is malicious, noisy, or otherwise inaccurate. Although existing approaches claim to achieve this, most rely on heuristics with little theoretical foundation. In contrast, HABIT is based exclusively on principled statistical techniques: it can cope with multiple discrete or continuous aspects of trustee behaviour; it does not restrict agents to using a single shared representation of behaviour; it can improve assessment by exploiting any observed correlation between the behaviour of similar trustees or information sources; and it provides a pragmatic solution to the whitewasher problem (in which unreliable agents assume a new identity to escape a bad reputation). In this paper, we describe the theoretical aspects of HABIT and present experimental results that demonstrate its ability to predict agent behaviour both in a simulated environment and in one based on data from a real-world webserver domain. In particular, these experiments show that HABIT can predict trustee performance based on multiple representations of behaviour, and is up to twice as accurate as BLADE, an existing state-of-the-art trust model that is statistically principled and has previously been shown to outperform a number of other probabilistic trust models.
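
    The hierarchical idea, that experience with previously observed trustees shapes the prior for new ones, can be sketched with a simple Beta-Bernoulli model. This is a deliberately simplified stand-in for HABIT, not its actual formulation; the moment-matched prior and the pseudo-count strength are assumptions.

        def fit_beta_prior(rates, strength=10.0):
            """Moment-match a Beta prior to per-trustee success rates.
            `strength` = a + b is an assumed pseudo-count, not from the paper."""
            m = sum(rates) / len(rates)
            return strength * m, strength * (1.0 - m)

        def posterior_trust(successes, failures, a, b):
            """Posterior mean success probability under a Beta(a, b) prior."""
            return (a + successes) / (a + b + successes + failures)

        # Direct experience with three known trustees: (successes, failures).
        history = {"alice": (9, 1), "bob": (4, 6), "carol": (7, 3)}
        a, b = fit_beta_prior([s / (s + f) for s, f in history.values()])

        print(posterior_trust(0, 0, a, b))   # newcomer: falls back on the population prior (~0.67)
        print(posterior_trust(0, 2, a, b))   # two failures pull the estimate down (~0.56)

    HABIT goes well beyond this sketch: it handles continuous and multi-dimensional behaviour, per-agent representations, and unreliable third-party reporters, none of which appear here.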

    Congrats: a Configurable Granular Trust Scheme for Effective Seller Selection in an E-marketplace

    Problem. The e-marketplace of today, with millions of buyers and sellers who never meet face to face, is susceptible to dishonest and fraudulent participants who prey on unsuspecting trading partners and cheat in transactions, increasing their own profit to the detriment of their victims. There is also a multiplicity of goods and services with varying prices and quality, offered by a mix of honest and dishonest vendors. To participate in trade without incurring substantial loss, participants rely on intelligent agents that use a trust evaluation scheme for partner selection. Making good deals thus depends on the ability of these agents to evaluate trading partners and pick only trustworthy ones. However, existing trust evaluation schemes do not adequately protect buyers in the e-marketplace; hence, this study focused on designing a new trust evaluation scheme that buyer agents can use to select sellers effectively.

    Method. To increase the overall performance of intelligent agents and to limit losses for buyers in an e-marketplace, I propose CONGRATS, a configurable granular trust estimation scheme for effective seller selection. The proposed model uses historical feedback ratings from multiple sources to estimate trust along multiple dimensions. I simulated a mini e-marketplace to generate the data needed for a performance evaluation of the proposed model alongside two existing trust estimation schemes, FIRE and MDT.

    Results. At the peak of CONGRATS's performance, T1 sellers (those with the highest trust level) accounted for about 45% of total sales, against less than 10% recorded by the least trustworthy (T5) sellers. Compared to FIRE and MDT, CONGRATS showed performance gains of 15% and 30%, respectively, and an average earning of 0.89 (out of 1.0) per transaction, in contrast to 0.70 and 0.62 per transaction. Cumulative utility gain among buyer groups stood at 612.35, compared with 518.96 and 421.28 for the FIRE and MDT models respectively.

    Conclusions. Modeling trust along multiple dimensions and gathering trust information from many different sources can significantly enhance the trust estimation scheme used by intelligent agents in an e-marketplace. This means more transactions will occur between buyers and sellers that are more trustworthy, which should reduce losses to a negligible level and consequently boost buyer confidence.
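
    The multi-source, multi-dimensional aggregation described above can be sketched as a configurable weighted average. The dimension names, source types, and weights below are illustrative assumptions, not values from the dissertation.

        from statistics import mean

        # Configurable weights (assumed); tuning these is one natural reading of "configurable granular".
        DIM_WEIGHTS = {"quality": 0.4, "delivery": 0.3, "price_fairness": 0.3}
        SOURCE_WEIGHTS = {"direct": 0.6, "witness": 0.4}   # own experience vs third-party feedback

        def seller_trust(feedback):
            """feedback: {source: {dimension: [ratings in 0..1]}} -> overall trust score."""
            score = 0.0
            for src, dims in feedback.items():
                per_dim = sum(DIM_WEIGHTS[d] * mean(rs) for d, rs in dims.items())
                score += SOURCE_WEIGHTS[src] * per_dim
            return score

        feedback = {
            "direct":  {"quality": [0.9, 0.8], "delivery": [0.7], "price_fairness": [0.8]},
            "witness": {"quality": [0.6],      "delivery": [0.9], "price_fairness": [0.7]},
        }
        print(round(seller_trust(feedback), 3))

    A buyer agent would score each candidate seller this way and transact with the highest-ranked one.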