Predicting Human Cooperation
The Prisoner's Dilemma has been a subject of extensive research due to its
importance in understanding the ever-present tension between individual
self-interest and social benefit. A strictly dominant strategy in a Prisoner's
Dilemma (defection), when played by both players, is mutually harmful.
Repetition of the Prisoner's Dilemma can give rise to cooperation as an
equilibrium, but defection remains an equilibrium as well, and this ambiguity
is difficult to resolve. The numerous behavioral experiments investigating the
Prisoner's Dilemma highlight that players often cooperate, but the level of
cooperation varies significantly with the specifics of the experimental
setting. We
present the first computational model of human behavior in repeated Prisoner's
Dilemma games that unifies the diversity of experimental observations in a
systematic and quantitatively reliable manner. Our model relies on data we
integrated from many experiments, comprising 168,386 individual decisions. The
computational model is composed of two pieces: the first predicts the
first-period action using solely the structural game parameters, while the
second predicts dynamic actions using both game parameters and history of play.
Our model is extremely successful not merely at fitting the data, but in
predicting behavior at multiple scales in experimental designs not used for
calibration, using only information about the game structure. We demonstrate
the power of our approach through a simulation analysis revealing how to best
promote human cooperation.
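A minimal sketch of the two-part structure described in the abstract: one model for the first-period action from structural game parameters only, and one for dynamic actions that also conditions on the history of play. The logistic functional form, feature names, and coefficients below are illustrative assumptions, not the authors' fitted model:

```python
import math

def predict_first_period(payoffs, coefs):
    """Predict P(cooperate) in period 1 from structural game parameters only.
    `payoffs` maps hypothetical parameter names to values; a logistic link
    is assumed for illustration."""
    z = coefs["intercept"] + sum(coefs[k] * v for k, v in payoffs.items())
    return 1.0 / (1.0 + math.exp(-z))

def predict_dynamic(payoffs, history, coefs):
    """Predict P(cooperate) in later periods from game parameters plus the
    history of play; `history` is a list of (own_action, partner_action)
    pairs encoded as 1 = cooperate, 0 = defect."""
    own_prev, partner_prev = history[-1]
    z = (coefs["intercept"]
         + sum(coefs[k] * v for k, v in payoffs.items())
         + coefs["own_prev"] * own_prev
         + coefs["partner_prev"] * partner_prev)
    return 1.0 / (1.0 + math.exp(-z))
```

With a positive weight on the partner's previous action, the dynamic model reproduces reciprocity: the predicted cooperation probability is higher after mutual cooperation than after being defected on.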
Predicting human cooperation in the Prisoner’s Dilemma using case-based decision theory
In this paper, we show that case-based decision theory (CBDT), proposed by Gilboa and Schmeidler (Q J Econ 110(3):605–639, 1995), can explain the aggregate dynamics of cooperation in the repeated Prisoner's Dilemma, as observed in the experiments performed by Camera and Casari (Am Econ Rev 99:979–1005, 2009). Moreover, we find that CBDT provides a better fit to the dynamics of cooperation than the existing Probit model, the first result of its kind. We also find that humans aspire to a payoff above the mutual defection outcome but below the mutual cooperation outcome, which suggests they hope, but are not confident, that cooperation can be achieved. Finally, our best-fitting parameters suggest that circumstances with more details are easier to recall. We make a prediction for future experiments: if the repeated PD were run for more periods, we would begin to see an increase in cooperation, most dramatically in the second treatment, where history is observed but identities are not. This is the first application of case-based decision theory to a strategic context and the first empirical test of CBDT in such a context. It is also the first application of bootstrapped standard errors to an agent-based model.
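The core of CBDT can be sketched as a similarity-weighted sum of past payoff experiences, measured against an aspiration level: an act looks attractive when similar past cases in which it was chosen paid above aspiration. The similarity function and all values below are illustrative assumptions, not the paper's estimates:

```python
def cbdt_value(act, memory, problem, similarity, aspiration):
    """Evaluate `act` by summing similarity-weighted payoffs, net of the
    aspiration level, over remembered cases in which `act` was chosen.
    `memory` is a list of (past_problem, action, payoff) cases."""
    return sum(similarity(problem, past) * (payoff - aspiration)
               for past, action, payoff in memory
               if action == act)

# Illustrative similarity: decays with how many periods apart two problems are.
def recency_similarity(problem, past):
    return 1.0 / (1.0 + abs(problem["period"] - past["period"]))
```

With an aspiration level between the defection and cooperation payoffs, as the paper estimates for human subjects, past cooperation payoffs score positively and past defection payoffs negatively, pushing choice toward cooperation.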
Predicting human decision making in psychological tasks with recurrent neural networks
Unlike traditional time series, the action sequences of human decision making
usually involve many cognitive processes such as beliefs, desires, intentions
and theory of mind, i.e. what others are thinking. This makes human decision
making difficult to predict agnostically of the underlying psychological
mechanisms. We propose to use a recurrent neural network
architecture based on long short-term memory networks (LSTM) to predict the
time series of the actions taken by the human subjects at each step of their
decision making, the first application of such methods in this research domain.
In this study, we collate human data from 8 published studies of the
Iterated Prisoner's Dilemma comprising 168,386 individual decisions and
postprocess them into 8,257 behavioral trajectories of 9 actions each for both
players. Similarly, we collate 617 trajectories of 95 actions from 10 different
published studies of Iowa Gambling Task experiments with healthy human
subjects. We train our prediction networks on the behavioral data from these
published psychological experiments of human decision making, and demonstrate a
clear advantage over the state-of-the-art methods in predicting human decision
making trajectories in both single-agent scenarios such as the Iowa Gambling
Task and multi-agent scenarios such as the Iterated Prisoner's Dilemma. In the
prediction, we observe that the weights of the top performers tend to have a
wider distribution and a larger bias in the LSTM networks, which suggests
possible interpretations for the distribution of strategies adopted by each
group.
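As a rough illustration of the architecture class involved (not the authors' trained model), a single LSTM cell step over one-hot-encoded actions, followed by a logistic readout of the final hidden state, can be sketched as:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: x is the current input (e.g. a one-hot action
    encoding), (h, c) the previous hidden and cell states, and W/U/b the
    stacked gate weights. Gate layout: [input, forget, cell, output]."""
    n = h.shape[0]
    z = W @ x + U @ h + b                 # stacked pre-activations, shape (4n,)
    i = 1 / (1 + np.exp(-z[:n]))          # input gate
    f = 1 / (1 + np.exp(-z[n:2*n]))       # forget gate
    g = np.tanh(z[2*n:3*n])               # candidate cell state
    o = 1 / (1 + np.exp(-z[3*n:]))        # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def predict_next_action(trajectory, W, U, b, W_out):
    """Run the cell over a trajectory of action vectors and map the final
    hidden state to P(cooperate) with a logistic readout."""
    n = b.shape[0] // 4
    h, c = np.zeros(n), np.zeros(n)
    for x in trajectory:
        h, c = lstm_step(x, h, c, W, U, b)
    return float(1 / (1 + np.exp(-(W_out @ h))))
```

In the setting described above, each behavioral trajectory of 9 actions would be fed through such a recurrence step by step, with the readout predicting the next decision.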
Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior
Studies of the Prisoner's Dilemma mainly treat the choice to cooperate or
defect as an atomic action. We propose to study online learning algorithm
behavior in the Iterated Prisoner's Dilemma (IPD) game, where we explore the
full spectrum of reinforcement learning agents: multi-armed bandits, contextual
bandits, and reinforcement learning. We evaluate them in a tournament of
iterated prisoner's dilemma where multiple agents compete in a sequential
fashion. This allows us to analyze the dynamics of policies learned by multiple
self-interested, independent, reward-driven agents, and also to study the
capacity of these algorithms to fit human behavior. Results suggest that
considering only the current situation to make a decision is the worst approach
in this kind of social dilemma game. Multiple discoveries about online learning
behaviors and clinical validations are reported.
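A minimal sketch of the stateless end of that spectrum: an ε-greedy multi-armed bandit that treats cooperate/defect as two arms, here playing against tit-for-tat under standard PD payoffs. All parameter values are illustrative, not the paper's tournament setup:

```python
import random

# Row player's payoff for (own move, opponent move); standard PD values.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

class EpsilonGreedyBandit:
    """Stateless learner: ignores history, keeps a running mean reward per arm."""
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {"C": 0, "D": 0}
        self.values = {"C": 0.0, "D": 0.0}

    def act(self):
        if random.random() < self.epsilon:
            return random.choice(["C", "D"])
        return max(self.values, key=self.values.get)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def play_vs_tit_for_tat(agent, rounds=200, seed=0):
    random.seed(seed)
    opponent_move = "C"              # tit-for-tat opens with cooperation
    total = 0
    for _ in range(rounds):
        move = agent.act()
        reward = PAYOFF[(move, opponent_move)]
        agent.update(move, reward)
        total += reward
        opponent_move = move         # tit-for-tat copies our last move
    return total
```

Because the bandit conditions on nothing, it cannot represent the fact that defection provokes retaliation next round, which is one way to see why history-blind decision rules fare poorly in sequential social dilemmas.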
Correlation neglect and case-based decisions
In most theories of choice under uncertainty, decision-makers are assumed to evaluate acts in terms of subjective values attributed to consequences and probabilities assigned to events. Case-based decision theory (CBDT), proposed by Gilboa and Schmeidler, is fundamentally different, and in the tradition of reinforcement learning models. It has no state space and no concept of probability. An agent evaluates each available act in terms of the consequences he has experienced through choosing that act in previous decision problems that he perceives to be similar to his current problem. Gilboa and Schmeidler present CBDT as a complement to expected utility theory (EUT), applicable only when the state space is unknown. Accordingly, most experimental tests of CBDT have used problems for which EUT makes no predictions. In contrast, we test the conjecture that case-based reasoning may also be used when relevant probabilities can be derived by Bayesian inference from observations of random processes, and that such reasoning may induce violations of EUT. Our experiment elicits participants’ valuations of a lottery after observing realisations of the lottery being valued and realisations of another lottery. Depending on the treatment, participants know that the payoffs from the two lotteries are independent, positively correlated, or negatively correlated. We find no evidence of correlation neglect indicative of case-based reasoning. However, in the negative correlation treatment, valuations cannot be explained by Bayesian reasoning, while stated qualitative judgements about chances of winning can.
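The Bayesian benchmark invoked above is straightforward: after observing realisations of a lottery, a decision-maker can update a Beta prior over the win probability and value the lottery at its posterior expected payoff. A sketch under a uniform-prior assumption (illustrative, not the paper's elicitation procedure):

```python
from fractions import Fraction

def beta_posterior_mean(wins, losses, alpha=1, beta=1):
    """Posterior mean of the win probability under a Beta(alpha, beta) prior
    after observing `wins` successes and `losses` failures (Beta-Bernoulli).
    alpha = beta = 1 is the uniform prior."""
    return Fraction(alpha + wins, alpha + beta + wins + losses)

def expected_value(prize, wins, losses):
    """Bayesian valuation of a lottery paying `prize` on a win, 0 otherwise."""
    return prize * beta_posterior_mean(wins, losses)
```

Under independence, realisations of the other lottery carry no information and should leave this valuation unchanged; correlation neglect would mean treating them as informative (or uninformative) regardless of the announced correlation structure.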
Case-based Reasoning and Dynamic Choice Modeling
Estimating discrete choices under uncertainty typically relies on assumptions of expected utility theory. We build on the dynamic choice modeling literature by using a nonlinear case-based reasoning approach that is grounded in cognitive processes and forms expectations by comparing the similarity between past problems and the current problem faced by a decision maker. This study provides a proof of concept of a behavioral model of location choice, applied to recreational fishers’ location choice behavior in Connecticut. We find the case-based decision model does well in explaining the observed data and provides value in explaining the dynamic value of attributes.
Estimating Case-Based Learning
We propose a framework in order to econometrically estimate case-based learning and apply it to empirical data from twelve 2 × 2 mixed-strategy equilibrium experiments. Case-based learning allows agents to explicitly incorporate information available to the experimental subjects in a simple, compact, and arguably natural way. We compare the estimates of case-based learning to other learning models (reinforcement learning and self-tuned experience weighted attraction learning) using in-sample and out-of-sample measures. We find evidence that case-based learning explains these data better than the other models based on both in-sample and out-of-sample measures. Additionally, the case-based specification estimates how factors determine the salience of past experiences for the agents. We find that, in constant sum games, opposing players’ behavior is more important than recency and, in non-constant sum games, the reverse is true.
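The salience idea in the last abstract can be sketched as a weighted similarity over past rounds, with the recency and opponent-behavior weights left as free parameters to be estimated, and a logit rule mapping case-based attractions to choice probabilities. Names and functional forms below are illustrative assumptions:

```python
import math

def salience(t_now, case, w_recency, w_opponent, current_opp_action):
    """Similarity of a past case to the current decision problem.
    A case is (t, own_action, opp_action, payoff). w_recency and
    w_opponent are the weights the abstract proposes estimating."""
    t, _, opp_action, _ = case
    recency = math.exp(-w_recency * (t_now - t))
    opponent = math.exp(-w_opponent * (0 if opp_action == current_opp_action else 1))
    return recency * opponent

def attractions(t_now, memory, actions, w_recency, w_opponent, current_opp_action):
    """Case-based attraction of each action: salience-weighted payoff sums."""
    return {a: sum(salience(t_now, case, w_recency, w_opponent,
                            current_opp_action) * case[3]
                   for case in memory if case[1] == a)
            for a in actions}

def choice_probs(A, temperature=1.0):
    """Logit (softmax) choice probabilities over attractions."""
    m = max(A.values())
    exps = {a: math.exp((v - m) / temperature) for a, v in A.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}
```

Estimation would then pick the weights (and temperature) maximizing the likelihood of observed choices; a large fitted w_recency relative to w_opponent corresponds to the "recency matters more" finding in non-constant-sum games.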
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
We are currently unable to specify human goals and societal values in a way
that reliably directs AI behavior. Law-making and legal interpretation form a
computational engine that converts opaque human values into legible directives.
"Law Informs Code" is the research agenda embedding legal knowledge and
reasoning in AI. Similar to how parties to a legal contract cannot foresee
every potential contingency of their future relationship, and legislators
cannot predict all the circumstances under which their proposed bills will be
applied, we cannot ex ante specify rules that provably direct good AI behavior.
Legal theory and practice have developed arrays of tools to address these
specification problems. For instance, legal standards allow humans to develop
shared understandings and adapt them to novel situations. In contrast to more
prosaic uses of the law (e.g., as a deterrent of bad behavior through the
threat of sanction), Law Informs Code leverages law as an expression of how
humans communicate their goals and of what society values.
We describe how data generated by legal processes (methods of law-making,
statutory interpretation, contract drafting, applications of legal standards,
legal reasoning, etc.) can facilitate the robust specification of inherently
vague human goals. This increases human-AI alignment and the local usefulness
of AI. Toward society-AI alignment, we present a framework for understanding
law as the applied philosophy of multi-agent alignment. Although law is partly
a reflection of historically contingent political power - and thus not a
perfect aggregation of citizen preferences - if properly parsed, its
distillation offers the most legitimate computational comprehension of societal
values available. If law eventually informs powerful AI, engaging in the
deliberative political process to improve law takes on even more meaning.
Forthcoming in Northwestern Journal of Technology and Intellectual Property, Volume 2
Interactions in Information Spread
Since the development of writing 5000 years ago, human-generated data has been
produced at an ever-increasing pace. Classical archival methods aimed at easing
information retrieval. Nowadays, archiving alone is no longer enough. The
amount of data generated daily is beyond human comprehension and calls for new
information retrieval strategies. Instead of referencing every single data
piece, as in traditional archival techniques, a more relevant approach is to
understand the overall ideas conveyed in data flows. To spot such general
tendencies, a precise comprehension of the underlying data generation
mechanisms is required. In the rich literature tackling this problem, the
question of information interaction remains nearly unexplored. First, we
investigate the frequency of such interactions. Building on recent advances in
Stochastic Block Modelling, we explore the role of interactions in several
social networks. We find that interactions are rare in these datasets. Then, we
ask how interactions evolve over time: earlier data pieces should not have an
everlasting influence on subsequent data generation mechanisms. We model this
using advances in dynamic network inference. We conclude that
interactions are brief. Finally, we design a framework that jointly models rare
and brief interactions based on Dirichlet-Hawkes Processes. We argue that this
new class of models is well suited to modelling brief and sparse interactions.
We conduct a large-scale application on Reddit and find that interactions play
a minor role in this dataset. From a broader perspective, our work results in a
collection of highly flexible models and in a rethinking of core concepts of
machine learning. Consequently, we open a range of novel perspectives both in
terms of real-world applications and in terms of technical contributions to
machine learning.
PhD thesis defended on 2022/09/1
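As background for the Dirichlet-Hawkes framework above, the self-exciting ingredient is an ordinary Hawkes process, whose conditional intensity is a baseline rate plus exponentially decaying kicks from past events (the Dirichlet part, which clusters events into topics, is omitted; parameter values are illustrative):

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity lambda(t) = mu + sum over past events t_i < t
    of alpha * exp(-beta * (t - t_i)). Each past event briefly raises the
    rate of future events; beta controls how fast that influence fades."""
    return mu + sum(alpha * math.exp(-beta * (t - t_i))
                    for t_i in events if t_i < t)
```

The "interactions are brief" conclusion corresponds to a large decay rate: with a big beta, an event's excitation is all but gone shortly after it occurs.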