1,665 research outputs found
Efficient interval scoring rules
Scoring rules that elicit an entire belief distribution through the elicitation of point beliefs are time-consuming and demand considerable cognitive effort. Moreover, the results are valid only when agents are risk-neutral or when one uses probabilistic rules. We investigate a class of rules in which the agent has to choose an interval and is rewarded (deterministically) on the basis of the chosen interval and the realization of the random variable. We formulate an efficiency criterion for such rules and present a specific interval scoring rule. For single-peaked beliefs, our rule gives information about both the location and the dispersion of the belief distribution. These results hold for all concave utility functions. Keywords: belief elicitation, scoring rules, subjective probabilities
Efficient POMDP Forward Search by Predicting the Posterior Belief Distribution
Online forward-search techniques have demonstrated promising results for solving problems in partially observable environments. These techniques depend on the ability to efficiently search and evaluate the set of beliefs reachable from the current belief. However, enumerating or sampling action-observation sequences to compute the reachable beliefs is computationally demanding; coupled with the need to satisfy real-time constraints, existing online solvers can only search to a limited depth. In this paper, we propose that policies can be generated directly from the distribution of the agent's posterior belief. When the underlying state distribution is Gaussian, and the observation function is an exponential family distribution, we can calculate this distribution of beliefs without enumerating the possible observations. This property not only enables us to plan in problems with large observation spaces, but also allows us to search deeper by considering policies composed of multi-step action sequences. We present the Posterior Belief Distribution (PBD) algorithm, an efficient forward-search POMDP planner for continuous domains, demonstrating that better policies are generated when we can perform deeper forward search.
Free Entropy Minimizing Persuasion in a Predictor-Corrector Dynamic
Persuasion is the process of changing an agent's belief distribution from a
given (or estimated) prior to a desired posterior. A common assumption in the
acceptance of information or misinformation as fact is that the
(mis)information must be consistent with or familiar to the individual who
accepts it. We model the process as a control problem in which the state is
given by a (time-varying) belief distribution following a predictor-corrector
dynamic. Persuasion is modeled as the corrector control signal with the
performance index defined using the Fisher-Rao information metric, reflecting a
fundamental cost associated with altering the agent's belief distribution. To
compensate for the fact that information production arises naturally from the
predictor dynamic (i.e., expected beliefs change), we modify the Fisher-Rao
metric to account only for information generated by the control signal. The
resulting optimal control problem produces non-geodesic paths through
distribution space that are compared to the geodesic paths found using the
standard free entropy minimizing Fisher metric in several example belief
models: a Kalman Filter, a Boltzmann distribution and a joint Kalman/Boltzmann
belief system. Comment: 16 pages, 7 figures
Adaptive social learning
This paper investigates the learning foundations of economic models of social learning. We pursue the prevalent idea in economics that rational play is the outcome of a dynamic process of adaptation. Our learning approach offers us the possibility to clarify when and why the prevalent rational (equilibrium) view of social learning is likely to capture observed regularities in the field. In particular, it enables us to address the issue of individual and interactive knowledge. We argue that knowledge about the private belief distribution is unlikely to be shared in most social learning contexts. Absent this mutual knowledge, we show that the long-run outcome of the adaptive process favors non-Bayesian rational play. Keywords: social learning; informational herding; adaptation; analogies; non-Bayesian updating