154 research outputs found

    Foresighted policy gradient reinforcement learning: solving large-scale social dilemmas with rational altruistic punishment

    Many important and difficult problems can be modeled as “social dilemmas”, like Hardin's Tragedy of the Commons or the classic iterated Prisoner's Dilemma. It is well known that in these problems, it can be rational for self-interested agents to promote and sustain cooperation by altruistically dispensing costly punishment to other agents, thus maximizing their own long-term reward. However, self-interested agents using most current multi-agent reinforcement learning algorithms will not sustain cooperation in social dilemmas: the algorithms do not sufficiently capture the consequences on the agent's reward of the interactions that it has with other agents. More recent, foresighted algorithms specifically account for such expected consequences, and have been shown to work well for the small-scale Prisoner's Dilemma. However, this approach quickly becomes intractable for larger social dilemmas. Here, we advance on this work and develop a “teach/learn” stateless foresighted policy gradient reinforcement learning algorithm that applies to social dilemmas with negative, unilateral side-payments, in the form of costly punishment. In this setting, the algorithm allows agents to learn the most rewarding actions to take with respect to both the dilemma (Cooperate/Defect) and the “teaching” of other agents' behavior through the dispensing of punishment. Unlike other algorithms, we show that this approach scales well to large settings like the Tragedy of the Commons. We show for a variety of settings that large groups of self-interested agents using this algorithm will robustly find and sustain cooperation in social dilemmas where adaptive agents can punish the behavior of other similarly adaptive agents.

    Social niche construction: evolutionary explanations for cooperative group formation

    Cooperative behaviours can be defined as those that benefit others at an apparent cost to self. How these kinds of behaviours can evolve has been a topic of great interest in evolutionary biology, for at first sight we would not expect one organism to evolve to help another. Explanations for cooperation rely on the presence of a population structure that clusters cooperators together, such that they enjoy the benefits of each other's actions. But the question that has been left largely unaddressed is, how does this structure itself evolve? If we want to really explain why organisms cooperate, then we need to explain not just their adaptation to their social environment, but why they live in that environment. It is well known that individual genetic traits can affect population structure; an example is extracellular matrix production by bacteria in a biofilm. Yet, the concurrent evolution of such traits with social behaviour is very rarely considered. We show here that social behaviour can exert indirect selection pressure on population structure-modifying traits, causing individuals to adaptively modify their population structure to support greater cooperation. Moreover, we argue that any component of selection on structure-modifying traits that is due to social behaviour must be in the direction of increased cooperation; that component of selection cannot be in favour of the conditions for greater selfishness. We then examine the conditions under which this component of selection on population structure exists.
Thus, we argue that not only can population structure drive the evolution of cooperation, as in classical models, but that the benefits of greater cooperation can in turn drive the evolution of population structure, a positive feedback process that we call social niche construction. We argue that this process is necessary in providing an adaptive explanation for some of the major transitions in evolution (such as from single- to multi-celled organisms, and from solitary insects to eusocial colonies). Any satisfactory account of these transitions must explain how the individuals came to live in a population structure that supported high degrees of cooperation, as well as showing that cooperation is individually advantageous given that structure.

    Measuring cooperation and other risks : Experimental evidence on individual differences

    This master's thesis examines how the degree of risk influences the cooperative behavior of individuals, in order to understand how cooperation can be better organized. Specifically, it compares individual risk preference with the risk level of the environment in which individuals make their decisions. The effect of a social frame on the classical representation of economic games is investigated experimentally. In general, understanding and identifying critical success factors makes it possible to indicate the cooperation behavior of individuals, but organizations also benefit from the indicated components that require cooperation. Cooperation is associated with a high level of risk and pressure. Therefore, it is necessary to carefully study the environmental conditions to get the best result. The aim of this study is to characterize the optimal conditions for the evolution of cooperation and its critical success factors to ensure the success of cooperation and to guarantee operational excellence of the entire process. This master's thesis is based on an experimental study that collects facts and evidence from different perspectives. This experimental study helps to understand the motives behind cooperation in the Stag-hunt games by comparing different economic games and two risk preference elicitation methods with the Stag-hunt games of this study. The Trust game, Ultimatum game, Dictator game, as well as the Holt and Laury price list and the Bomb risk elicitation task, are compared to the Stag-hunt games. Payoffs are manipulated in a two-player one-shot Stag-hunt game. The payoffs explain the degree of cooperation by combining three motives: baseline, more efficient, and less risk. In addition, the social framing effect is investigated as a treatment in the experiment. This is implemented as a joint venture scenario. This study helps organizations to better understand how to develop strategies that protect against failure of cooperation.
Decision-makers can use the results of this research to carry out cooperations from planning, through implementation, to a successful conclusion. On the one hand, payoff dominance and risk dominance are not significant. However, in the "less risk" game there is a positive influence on the likelihood of cooperation. On the other hand, the business-setting treatment is strongly significant, which means that cooperation occurs more often in the joint venture scenario than in the classical representation of the economic games. It positively influences cooperation behavior. This appears to be why previous attempts to explain decisions in Stag-hunt games only with risk attitudes have not been successful. In this study, trust does not significantly influence cooperation. However, it could be demonstrated that it is a basic requirement for cooperation.

    Out-of-Equilibrium Economics and Agent-Based Modeling

    Standard neoclassical economics asks what agents' actions, strategies, or expectations are in equilibrium with (consistent with) the outcome or pattern these behaviors aggregatively create. Agent-based computational economics enables us to ask a wider question: how agents' actions, strategies, or expectations might react to, or might endogenously change with, the patterns they create. In other words, it enables us to examine how the economy behaves out of equilibrium, when it is not at a steady state. This out-of-equilibrium approach is not a minor adjunct to standard economic theory; it is economics done in a more general way. When examined out of equilibrium, economic patterns sometimes simplify into a simple, homogeneous equilibrium of standard economics; but just as often they show perpetually novel and complex behavior. The static equilibrium approach suffers two characteristic indeterminacies: it cannot easily resolve among multiple equilibria; nor can it easily model individuals' choices of expectations. Both problems are ones of formation (of an equilibrium and of an "ecology" of expectations, respectively), and when analyzed in formation, that is, out of equilibrium, these anomalies disappear.

    When Nice Guys Finish First: The Evolution of Cooperation, The Study of Law, and the Ordering of Legal Regimes

    This Note adds to the scholarship in the area of Evolutionary Analysis and the Law (EA). EA is a paradigm that comments on the implications of evolution for the law. EA recognizes that many complex human behaviors that the law seeks to regulate have evolutionary origins that remain relevant today. This Note details how an understanding of the evolutionary basis of cooperation can bring about favorable revisions and reforms in the law. Following a review of the scientific foundation of EA, this Note sets forth the proposition that humans have an evolutionarily developed tendency to cooperate, an idea that contrasts with the widely held belief that the evolutionary man is purely self-interested. This Note does, however, observe that the tendency to cooperate is not expressed at all times. This Note then explores the implications of EA for other areas of legal scholarship, such as behavioral law and economics, default rules in partnership law, and efficient mechanisms of trade. This Note concludes by addressing the concerns of EA critics and mapping a path for the future of EA.

    Signaling Discount Rates: Law, Norms, and Economic Methodology


    Good decisions : reconciling human rationality, evolution, and ethics

    Cover title. Includes bibliographical references (p. 29-35). By Steven F. Freeman.

    TOWARDS A HOLISTIC RISK MODEL FOR SAFEGUARDING THE PHARMACEUTICAL SUPPLY CHAIN: CAPTURING THE HUMAN-INDUCED RISK TO DRUG QUALITY

    Counterfeit, adulterated, and misbranded medicines in the pharmaceutical supply chain (PSC) are a critical problem. Regulators charged with safeguarding the supply chain are facing shrinking resources for inspections while concurrently facing increasing demands posed by new drug products being manufactured at more sites in the US and abroad. To mitigate risk, the University of Kentucky (UK) Central Pharmacy Drug Quality Study (DQS) tests injectable drugs dispensed within the UK hospital. Using FT-NIR spectrometry coupled with machine learning techniques, the team identifies and flags potentially contaminated drugs for further testing and possible removal from the pharmacy. Teams like the DQS are always working with limited equipment, time, and staffing resources. Scanning every vial immediately before use is infeasible, and drugs must be prioritized for analysis. A risk scoring system coupled with batch sampling techniques is currently used in the DQS. However, a risk scoring system only allows the team to know about the risks to the PSC today. It doesn’t let us predict what the risks will be in the future. To begin bridging this gap in predictive modeling capabilities, the authors assert that models must incorporate the human element. A sister project to the DQS, the Drug Quality Game (DQG), enables humans and all of their unpredictability to be inserted into a virtual PSC. The DQG approach was adopted as a means of capturing human creativity, imagination, and problem-solving skills. Current methods of prioritizing drug scans rely heavily on drug cost, sole-source status, warning letters, and equipment and material specifications. However, humans, not machines, commit fraud. Given that even one defective drug product could have catastrophic consequences, this project will improve risk-based modeling by equipping future models to identify and incorporate human-induced risks, expanding the overall landscape of risk-based modeling.
This exploratory study tested the following hypotheses: (1) a useful game system able to simulate real-life humans and their actions in a pharmaceutical manufacturing process can be designed and deployed, (2) there are variables in the game that are predictive of human-induced risks to the PSC, and (3) the game can identify ways in which bad actors can “game the system” (GTS) to produce counterfeit, adulterated, and misbranded drugs. A commercial-off-the-shelf (COTS) game, Big Pharma, was used as the basis of a game system able to simulate the human subjects and their actions in a pharmaceutical manufacturing process. Big Pharma was selected as it provides a low-cost, time-efficient virtual environment that captures the major elements of a pharmaceutical business: research, marketing, and manufacturing/processing. Running Big Pharma with a Python shell enables researchers to implement specific GxP-related tasks (Good x Practice, where x = Manufacturing, Clinical, Research, etc.) not provided in the COTS Big Pharma game. Results from players' interaction with the Python shell/Big Pharma environment suggest that the game can identify both variables predictive of human-induced risks to the PSC and ways in which bad actors may GTS. For example, company profitability emerged as one variable predictive of successful GTS. Players' unethical in-game techniques matched well with observations seen within the DQS.

    Spatial competition of learning agents in agricultural procurement markets

    Spatially dispersed farmers supply raw milk as the primary input to a small number of large dairy-processing firms. The spatial competition of processing firms has short- to long-term repercussions on farm and processor structure, as it determines the regional demand for raw milk and the resulting raw milk price. A number of recent analytical and empirical contributions in the literature analyse the spatial price competition of processing firms in milk markets. Agent-based models (ABMs) by now serve as computational laboratories in many social science and interdisciplinary fields and have recently also been introduced as bottom-up approaches to help understand market outcomes emerging from autonomously deciding and interacting agents. Despite ABMs' strengths, the inclusion of interactive learning by intelligent agents is not sufficiently matured. Although the literatures on multi-agent systems (MASs) and multi-agent economic simulation are related fields of research, they have progressed along separate paths. This thesis takes us through some basic steps involved in developing a theoretical basis for designing multi-agent learning in spatial economic ABMs. Each of the three main chapters of the thesis investigates a core issue for designing interactive learning systems with the overarching aim of better understanding the emergence of pricing behaviour in real, spatial agricultural markets. An important problem in the competitive spatial economics literature is the lack of a rigorous theoretical explanation for observed collusive behavior in oligopsonistic markets. The first main chapter theoretically derives how the incorporation of foresight in agents' pricing policy in spatial markets might move the system towards cooperative Nash equilibria. It is shown that a basic level of foresight invites competing firms to cease limitless price wars.
Introducing the concept of an outside option into the agents' decisions within a dynamic pricing game reveals how decreasing returns for increasing strategic thinking correlate with the relevance of transportation costs. In the second main chapter, we introduce a new learning algorithm for rational agents using H-PHC (hierarchical policy hill climbing) in spatial markets. While MAS algorithms are typically just applicable to small problems, we show experimentally how a community of multiple rational agents is able to overcome the coordination problem in a variety of spatial (and non-spatial) market games of rich decision spaces with modest computational effort. The theoretical explanation of emerging price equilibria in spatial markets is much disputed in the literature. The majority of papers attribute the pricing behavior of processing firms (mill price and freight absorption) merely to the spatial structure of markets. Based on a computational approach with interactive learning agents in two-dimensional space, the third main chapter suggests that associating the extent of freight absorption just with the factor space can be ambiguous. In addition, the pricing behavior of agricultural processors, namely the ability to coordinate and achieve mutually beneficial outcomes, also depends on their ability to learn from each other.