Strategic Communication Between Prospect Theoretic Agents over a Gaussian Test Channel
In this paper, we model a Stackelberg game in a simple Gaussian test channel
where a human transmitter (leader) communicates a source message to a human
receiver (follower). We model human decision making using prospect theory
models proposed for continuous decision spaces. Assuming that the value
function is the squared distortion at both the transmitter and the receiver, we
analyze the effects of the weight functions at both the transmitter and the
receiver on optimal communication strategies, namely encoding at the
transmitter and decoding at the receiver, in the Stackelberg sense. We show
that the optimal strategies for the behavioral agents in the Stackelberg sense
are identical to those designed for unbiased agents. At the same time, we also
show that the prospect-theoretic distortions at both the transmitter and the
receiver are both larger than the expected distortion, thus making behavioral
agents less contented than unbiased agents. Consequently, the presence of
cognitive biases increases the need for transmission power in order to achieve
a given distortion at both transmitter and receiver.
Comment: 6 pages, 3 figures, Accepted to MILCOM-2017, Corrections made in the new version
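The effect of a probability weight function on the perceived distortion can be illustrated with a small numerical sketch. Here a Prelec weighting function (a common prospect-theoretic choice; the paper's exact weight functions may differ) is applied to the empirical distribution of the squared error of standard Gaussian noise, via a Choquet-style integral. All parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def prelec_weight(p, alpha=0.7):
    """Prelec probability weighting function w(p) = exp(-(-ln p)^alpha)."""
    p = np.clip(p, 1e-12, 1.0)
    return np.exp(-(-np.log(p)) ** alpha)

def behavioral_distortion(samples, alpha=0.7):
    """Choquet-style expectation of a nonnegative distortion under weighted
    decumulative probabilities:

        E_w[X] = sum_i x_(i) * (w(P[X >= x_(i)]) - w(P[X >= x_(i+1)]))

    for the sorted samples x_(1) <= ... <= x_(n), with P[X >= x_(n+1)] = 0.
    """
    x = np.sort(samples)
    n = len(x)
    dec = (n - np.arange(n)) / n          # empirical P[X >= x_(i)]
    w = prelec_weight(dec, alpha)
    w_next = np.append(w[1:], 0.0)
    return float(np.sum(x * (w - w_next)))

rng = np.random.default_rng(0)
sq_err = rng.normal(0.0, 1.0, 100_000) ** 2   # squared distortion samples
print(behavioral_distortion(sq_err, 1.0))     # alpha = 1 recovers the sample mean, close to 1.0
print(behavioral_distortion(sq_err, 0.7))     # rare large errors are overweighted
```

With alpha = 1 the weighting is the identity and the ordinary expected distortion is recovered; with alpha < 1 the tail of the squared-error distribution receives extra weight, which is the mechanism behind a behavioral agent perceiving more distortion than an unbiased one.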
von Neumann-Morgenstern and Savage Theorems for Causal Decision Making
Causal thinking and decision making under uncertainty are fundamental aspects
of intelligent reasoning. Decision making under uncertainty has been well
studied when information is considered at the associative (probabilistic)
level. The classical Theorems of von Neumann-Morgenstern and Savage provide a
formal criterion for rational choice using purely associative information.
Causal inference often yields uncertainty about the exact causal structure, so
we consider what kinds of decisions are possible in those conditions. In this
work, we consider decision problems in which available actions and consequences
are causally connected. After recalling a previous causal decision making
result, which relies on a known causal model, we consider the case in which the
causal mechanism that controls some environment is unknown to a rational
decision maker. In this setting we state and prove a causal version of Savage's
Theorem, which we then use to develop a notion of causal games with its
respective causal Nash equilibrium. These results highlight the importance of
causal models in decision making and the variety of potential applications.Comment: Submitted to Journal of Causal Inferenc
Preference purification and the inner rational agent: A critique of the conventional wisdom of behavioural welfare economics
Neoclassical economics assumes that individuals have stable and context-independent preferences, and uses preference-satisfaction as a normative criterion. By calling this assumption into question, behavioural findings cause fundamental problems for normative economics. A common response to these problems is to treat deviations from conventional rational-choice theory as mistakes, and to try to reconstruct the preferences that individuals would have acted on, had they reasoned correctly. We argue that this preference purification approach implicitly uses a dualistic model of the human being, in which an inner rational agent is trapped in an outer psychological shell. This model is psychologically and philosophically problematic
Bayesian Decision Theory and Stochastic Independence
Stochastic independence has a complex status in probability theory. It is not part of the definition of a probability measure, but it is nonetheless an essential property for the mathematical development of this theory. Bayesian decision theorists such as Savage can be criticized for being silent about stochastic independence. From their current preference axioms, they can derive no more than the definitional properties of a probability measure. In a new framework of twofold uncertainty, we introduce preference axioms that entail not only these definitional properties, but also the stochastic independence of the two sources of uncertainty. This goes some way towards filling a curious lacuna in Bayesian decision theory
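The target property of the new axioms can be made concrete with a toy twofold state space: when the joint measure over the two sources of uncertainty factorizes into its marginals, every event about the first source is stochastically independent of every event about the second. The sources and numbers below are hypothetical; the code checks the definitional identity P(A ∩ B) = P(A)P(B).

```python
from itertools import product

# Two sources of uncertainty with their own marginals (hypothetical numbers).
p_first = {"a1": 0.3, "a2": 0.7}
p_second = {"b1": 0.5, "b2": 0.5}

# Twofold states under a product measure: the joint probability factorizes.
joint = {(a, b): p_first[a] * p_second[b]
         for a, b in product(p_first, p_second)}

def prob(event):
    """Probability of a set of twofold states."""
    return sum(joint[s] for s in event)

# Events that each depend on only one source.
A = {s for s in joint if s[0] == "a1"}   # about the first source
B = {s for s in joint if s[1] == "b1"}   # about the second source
print(abs(prob(A & B) - prob(A) * prob(B)) < 1e-12)   # independence holds
```

The paper's contribution is to derive this product structure from preference axioms rather than to stipulate it, which is what fills the lacuna left by Savage-style frameworks.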
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
Reinforcement learning (RL) agents have traditionally been tasked with
maximizing the value function of a Markov decision process (MDP), either in
continuous settings, with fixed discount factor γ < 1, or in episodic
settings, with γ = 1. While this has proven effective for specific tasks
with well-defined objectives (e.g., games), it has never been established that
fixed discounting is suitable for general purpose use (e.g., as a model of
human preferences). This paper characterizes rationality in sequential decision
making using a set of seven axioms and arrives at a form of discounting that
generalizes traditional fixed discounting. In particular, our framework admits
a state-action dependent "discount" factor that is not constrained to be less
than 1, so long as there is eventual long run discounting. Although this
broadens the range of possible preference structures in continuous settings, we
show that there exists a unique "optimizing MDP" with fixed γ < 1 whose
optimal value function matches the true utility of the optimal policy, and we
quantify the difference between value and utility for suboptimal policies. Our
work can be seen as providing a normative justification for (a slight
generalization of) Martha White's RL task formalism (2017) and other recent
departures from the traditional RL, and is relevant to task specification in
RL, inverse RL and preference-based RL.
Comment: 8 pages + 1 page supplement. In proceedings of AAAI 2019. Slides,
poster and bibtex available at
https://silviupitis.com/#rethinking-the-discount-factor-in-reinforcement-learning-a-decision-theoretic-approac
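The state-action dependent discount can be sketched as a backward recursion over a trajectory: each step contributes its reward plus its own discount factor times the remainder, so per-step factors may exceed 1 as long as their running products eventually shrink. The reward and discount values below are illustrative, not taken from the paper.

```python
def discounted_return(rewards, gammas):
    """Return of a trajectory with a per-step discount factor.

    Computes G_0 via the backward recursion G_t = r_t + g_t * G_{t+1},
    i.e. G_0 = r_0 + g_0 * (r_1 + g_1 * (r_2 + ...)).
    """
    G = 0.0
    for r, g in zip(reversed(rewards), reversed(gammas)):
        G = r + g * G
    return G

rewards = [1.0, 1.0, 1.0, 1.0]
print(discounted_return(rewards, [0.9] * 4))             # fixed discounting
print(discounted_return(rewards, [1.1, 1.1, 0.5, 0.5]))  # transient g > 1, then long-run discounting
```

Fixed discounting is recovered as the special case where every entry of `gammas` is the same constant; the second call shows a preference structure that no fixed γ can express.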
Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory
Classical game-theoretic approaches for multi-agent systems in both the
forward policy design problem and the inverse reward learning problem often
make strong rationality assumptions: agents perfectly maximize expected
utilities under uncertainty. Such assumptions, however, deviate substantially
from observed human behaviors, such as satisficing with sub-optimal decisions,
risk seeking, and loss aversion. In this paper, we investigate the
problem of bounded risk-sensitive Markov Game (BRSMG) and its inverse reward
learning problem for modeling human realistic behaviors and learning human
behavioral models. Drawing on iterative reasoning models and cumulative
prospect theory, we assume that humans have bounded intelligence and maximize
risk-sensitive utilities in BRSMGs. Convergence analyses for both the forward
policy design and the inverse reward learning problems are established under
the BRSMG framework. We validate the proposed forward policy design and inverse
reward learning algorithms in a navigation scenario. The results show that the
behaviors of agents demonstrate both risk-averse and risk-seeking
characteristics. Moreover, in the inverse reward learning task, the proposed
bounded risk-sensitive inverse learning algorithm outperforms a baseline
risk-neutral inverse learning algorithm by effectively recovering not only more
accurate reward values but also the intelligence levels and the risk-measure
parameters given demonstrations of agents' interactive behaviors.
Comment: Accepted by 2021 AAAI Conference on Artificial Intelligence
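The cumulative-prospect-theory ingredient can be sketched for a discrete gamble: outcomes are ranked from the extremes inward, decision weights are differences of a weighted cumulative probability, and losses are scaled by a loss-aversion coefficient. The sketch below uses the standard Tversky-Kahneman functional forms with textbook parameter values; the paper's exact risk measure and parameters may differ, and a single weighting function is used for gains and losses for simplicity.

```python
import numpy as np

def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman weighting w(p) = p^g / (p^g + (1-p)^g)^(1/g)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def cpt_value(outcomes, probs, alpha=0.88, lam=2.25, gamma=0.61):
    """Cumulative prospect theory value of a discrete gamble (reference 0).

    Gains are ranked best-first and losses worst-first; each outcome's
    decision weight is a difference of weighted cumulative probabilities,
    and losses are scaled by the loss-aversion coefficient lam.
    """
    outcomes = np.asarray(outcomes, dtype=float)
    probs = np.asarray(probs, dtype=float)
    value = 0.0
    for sign in (+1, -1):
        mask = outcomes > 0 if sign > 0 else outcomes < 0
        x, p = outcomes[mask], probs[mask]
        order = np.argsort(-sign * x)        # gains descending, losses ascending
        x, p = x[order], p[order]
        cum = np.cumsum(p)
        w = tk_weight(cum, gamma) - tk_weight(cum - p, gamma)
        v = np.abs(x) ** alpha * (1.0 if sign > 0 else -lam)
        value += float(np.sum(w * v))
    return value

# A fair coin flip between +100 and -100 has zero expected value, but a
# negative CPT value: loss aversion makes the symmetric gamble unattractive.
print(cpt_value([100.0, -100.0], [0.5, 0.5]))
```

This is the kind of risk-sensitive utility the BRSMG agents are assumed to maximize, and the parameters (alpha, lam, gamma) are examples of the risk-measure quantities the inverse learning algorithm recovers from demonstrations.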