65 research outputs found

    Little Information, Efficiency, and Learning - An Experimental Study

    Earlier experiments have shown that under little information subjects are hardly able to coordinate even though there are no conflicting interests and subjects are organised in fixed pairs. This is so, even though a simple adjustment process would lead the subjects into the efficient, fair and individually payoff maximising outcome. We draw on this finding and design an experiment in which subjects repeatedly play 4 simple games within 4 sets of 40 rounds under little information. This way we are able to investigate (i) the coordination abilities of the subjects depending on the underlying game, (ii) the resulting efficiency loss, and (iii) the adjustment of the learning rule. Keywords: mutual fate control, matching pennies, fate-control behaviour-control, learning, coordination, little information

    Solving Two-Person Zero-Sum Stochastic Games With Incomplete Information Using Learning Automata With Artificial Barriers

    Learning automata (LA) with artificially absorbing barriers was a completely new horizon of research in the 1980s (Oommen, 1986). These new machines yielded properties that were previously unknown. More recently, absorbing barriers have been introduced in continuous estimator algorithms so that the proofs could follow a martingale property, as opposed to monotonicity (Zhang et al., 2014; Zhang et al., 2015). However, the applications of LA with artificial barriers are almost nonexistent. In that regard, this article is pioneering in that it provides effective and accurate solutions to an extremely complex application domain, namely that of solving two-person zero-sum stochastic games that are provided with incomplete information. LA have been previously used (Sastry et al., 1994) to design algorithms capable of converging to the game's Nash equilibrium under limited information. Those algorithms have focused on the case where the saddle point of the game exists in a pure strategy. However, the majority of the LA algorithms used for games are absorbing in the probability simplex space, and thus, they converge to an exclusive choice of a single action. These LA are thus unable to converge to other mixed Nash equilibria when the game possesses no saddle point for a pure strategy. The pioneering contribution of this article is that we propose an LA solution that is able to converge to an optimal mixed Nash equilibrium even though there may be no saddle point when a pure strategy is invoked. The scheme, being of the linear reward-inaction ($L_{R-I}$) paradigm, is in and of itself, absorbing. However, by incorporating artificial barriers, we prevent it from being "stuck" or getting absorbed in pure strategies. Unlike the linear reward-$\epsilon$-penalty ($L_{R-\epsilon P}$) scheme proposed by Lakshmivarahan and Narendra almost four decades ago, our new scheme achieves the same goal with much less parameter tuning and in a more elegant manner. This article includes the nontrivial proofs of the theoretical results characterizing our scheme and also contains experimental verification that confirms our theoretical findings.
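    The absorbing behaviour described in this abstract can be illustrated with the textbook linear reward-inaction update. The sketch below is not the article's scheme: it shows only the standard $L_{R-I}$ rule, and the function name `lri_update` and the learning-rate parameter `rate` are illustrative assumptions. The artificial-barrier mechanism is not reproduced, since the abstract does not specify it.

```python
def lri_update(probs, action, rewarded, rate=0.1):
    """One step of the standard linear reward-inaction (L_{R-I}) update.

    probs    -- current action probabilities (non-negative, summing to 1)
    action   -- index of the action that was just played
    rewarded -- True if the environment rewarded the action
    rate     -- learning rate in (0, 1); hypothetical default
    """
    if not rewarded:
        # "Inaction": a penalty leaves the probabilities unchanged.
        return list(probs)
    # Reward: move probability mass towards the chosen action.
    return [
        p + rate * (1.0 - p) if i == action else (1.0 - rate) * p
        for i, p in enumerate(probs)
    ]


# A pure strategy is a fixed point of the update, which is why the
# plain scheme is absorbing and cannot settle on a mixed equilibrium.
pure = [1.0, 0.0]
assert lri_update(pure, 0, True) == pure
```

    Once one probability reaches 1, neither a reward nor a penalty can move the vector again, which is the sense in which the abstract calls the scheme absorbing.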

    Learning in agent based models

    This paper examines the process by which agents learn to act in economic environments. Learning is particularly complicated in such situations since the environment is, at least in part, made up of other agents who are also learning. At best, one can hope to obtain analytical results for a rudimentary model. To make progress in understanding the dynamics of learning and coordination in general cases, one can simulate agent based models to see whether the results obtained in skeletal models translate into the more general case. Using this approach can help us to understand which are the crucial assumptions in determining whether learning converges and, if so, to which sort of state. Three examples are presented: one in which agents learn to form trading relationships, one in which agents misspecify the model of their environment, and a last one in which agents may learn to take actions which are systematically favourable (or unfavourable) for them. In each case, simulating models in which agents operate with simple rules in a complex environment allows us to examine the role of the type of learning process used by the agents, the extent to which they coordinate on a final outcome, and the nature of that outcome. Keywords: learning; agent based models; simulations; equilibria; asymmetric outcomes

    Adaptive decision processes

    "September 27, 1962." "This report is based on a thesis submitted to the Department of Electrical Engineering, M.I.T., January 9, 1961 ..." Bibliography: p. 81-82. Army Signal Corps Contract DA 36-039-sc-78108; Dept. of the Army Task 3-99-20-001 and Project 3-99-00-000. Army Signal Corps Contract DA-SIG-36-039-61-G14. Author: Jack Lee Rosenfeld

    Reactions of human subjects in simple sequential situations

    The general purpose of this study was to explore the reactions of Ss to situations requiring a series of similar decisions. This was done within the framework of a mathematical analysis of situations; the framework owed much to game theory formulations. Particular purposes of the study were to observe the behaviour of individual Ss in a probability learning experiment, and in simple 2x2 games against nature. The observations made were considered in the light of some current theoretical notions about human behaviour in such situations. In particular, the stimulus sampling theory of Estes and his colleagues, the view of man as a processor of information according to Bayes' theorem, and the more general computer simulation views of behaviour were all examined. In general, neither stimulus sampling nor Bayesian accounts fit with the observations. All of the Ss studied were University students. They react in a fairly lawful way. The reaction depends on the structure of the situation. Given some information, many Ss approach an appropriate reaction and some achieve it. Even with no information, some Ss approach an appropriate reaction. This seems to occur by the elimination of likely hypotheses about the situation and, finally, by the use of an elaborated set of rules paying attention to consecutive rewards or non-rewards. Observations were also made of the Ss' declared purposes and of their ability to recognise a sequence of binary events as a random one. Suggestions for further research were made.

    The matching law and melioration learning: From individual decision-making to social interactions

    This dissertation deals with the application of the matching law as a behavioural assumption in the explanation of social phenomena. The matching law is a model from behaviourist learning theory; it states that the relative frequency with which an action is chosen matches the relative frequency with which that action is rewarded. The dissertation discusses several problems concerning the sociological application of the matching law. Building on these findings, the matching law is integrated into economic decision theory and compared theoretically with existing behavioural predictions. The matching law is then applied to several social situations. For this purpose a learning model called melioration learning is used, which leads to the matching law under certain conditions. With the help of this learning model and agent-based simulations, hypotheses about social behaviour are derived. First, simple situations with only two interacting actors are considered. Here the matching law replicates some solution concepts of game theory, even though fewer assumptions are made about the actors' cognitive abilities and the available information. In addition, interactions between arbitrarily many actors are examined. First, the emergence of social conventions can be explained by the matching law. Second, it is shown that the actors learn to cooperate in a volunteer's dilemma or in a multi-person prisoner's dilemma.
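    For reference, the matching law that this abstract states in words is commonly written as follows for two alternatives. The symbols are assumptions introduced here, not notation from the dissertation: $B_i$ is the number of times action $i$ is chosen and $R_i$ the number of rewards obtained from it.

```latex
\[
  \frac{B_1}{B_1 + B_2} \;=\; \frac{R_1}{R_1 + R_2}
\]
```

    As an illustrative case, if action 1 yields 30 of 40 total rewards, the law predicts that it is chosen in 75% of trials.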

    Progress in the producer-scrounger game : information use and spatial models

    Gregarious animals searching for resources can either devote their effort to searching (the producer strategy) or wait for producers to find resources and then join them (the scrounger strategy). The profitability of each option can be analysed with the producer-scrounger game. This game has been explored extensively, both theoretically and empirically, but several aspects remain unexplored. I developed five models to explore social foraging in relation to information use and spatial constraints. The first model concerns the evolution of learning rules, which are mathematical expressions describing the value an animal assigns to the producer and scrounger options as a function of the payoffs obtained. I showed that the relative pay-off sum rule is evolutionarily stable and therefore the best available. The parameters of the expected rule remain intriguing and now need to be explored empirically. The second model instead explores the effect of social information use (scrounging) by a predator, by examining its effect on the evolution of the aggregation level of its prey. The model shows that prey evolve to different levels of aggregation in response to their predators' use of social information, and that this relationship affects both the predator's search efficiency and the survival of the prey. The third model tests the hypothesis, generated from empirical research on greylag geese, that variation in boldness is associated with a dimorphism of bold producers and shy scroungers in the producer-scrounger game. The model refutes the existence of such a dimorphism, but nevertheless demonstrates a strong environmental effect of the social foraging parameters on the boldness level of a population. This result has important implications for the role of information use and spatial effects in regulating the relationships between producers and scroungers. Using a cellular automaton approach, I developed a producer-scrounger model to determine whether a simple rule (rule of thumb) based on elementary social learning in a spatially explicit context could predict the attainment of a producer-scrounger equilibrium. The results show that adding this simple rule generates both significant behavioural flexibility and complex dynamics that are unusual for simple systems of this kind. The model links social information use to spatial structure in a deterministic model. Finally, with the fifth model I explored the effects of landscape geometry (the way space is represented, usually a regular grid) on the producer-scrounger game. It appears that spatial representations are a key determinant of how well a social foraging game can actually account for animal foraging. Author keywords: social foraging, spatial effects, information use, learning, animal personalities