Inferences from prior-based loss functions
Inferences that arise from loss functions determined by the prior are
considered, and it is shown that these lead to limiting Bayes rules that are
closely connected with likelihood. The procedures obtained via these loss
functions are invariant under reparameterizations and are Bayesian unbiased or
limits of Bayesian unbiased inferences. These inferences serve as
well-supported alternatives to MAP-based inferences.
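One way to see the likelihood connection (a sketch under simplifying assumptions, not the paper's derivation): in a discretized parameter space, take a 0-1 loss weighted inversely by the prior,
\[
L(\theta, a) = \frac{\mathbb{I}(a \neq \theta)}{\pi(\theta)} .
\]
The posterior expected loss of an estimate $a$ is then
\[
\mathbb{E}\!\left[L(\theta, a) \mid x\right]
= \sum_{\theta \neq a} \frac{\pi(\theta \mid x)}{\pi(\theta)}
= C - \frac{\pi(a \mid x)}{\pi(a)},
\]
where $C = \sum_{\theta} \pi(\theta \mid x)/\pi(\theta)$ does not depend on $a$. Minimizing the expected loss therefore maximizes the ratio $\pi(a \mid x)/\pi(a) = f(x \mid a)/m(x) \propto f(x \mid a)$, i.e., the likelihood. Because the ratio of posterior to prior density is invariant under reparameterization (the Jacobians cancel), the resulting rule inherits the invariance noted in the abstract, unlike the MAP estimate.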
Evidence for surprise minimization over value maximization in choice behavior
Classical economic models are predicated on the idea that the ultimate aim of choice is to maximize utility or reward. In contrast, an alternative perspective highlights the fact that adaptive behavior requires agents to model their environment and minimize surprise about the states they frequent. We propose that choice behavior can be more accurately accounted for by surprise minimization compared to reward or utility maximization alone. Minimizing surprise makes a prediction at variance with expected utility models; namely, that in addition to attaining valuable states, agents attempt to maximize the entropy over outcomes and thus 'keep their options open'. We tested this prediction using a simple binary choice paradigm and show that human decision-making is better explained by surprise minimization compared to utility maximization. Furthermore, we replicated this entropy-seeking behavior in a control task with no explicit utilities. These findings highlight a limitation of purely economic motivations in explaining choice behavior and instead emphasize the importance of belief-based motivations.
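The entropy-seeking prediction can be illustrated with a toy example (a hypothetical sketch, not the paper's paradigm or model): two options with equal expected reward are tied under pure utility maximization, but an agent whose objective includes an outcome-entropy bonus prefers the more uncertain option. The bonus weight `beta` is an assumed free parameter introduced here for illustration.

```python
import numpy as np

def expected_utility(probs, rewards):
    """Classical objective: probability-weighted average reward."""
    return float(np.dot(probs, rewards))

def entropy(probs):
    """Shannon entropy of the outcome distribution (in nats)."""
    probs = np.asarray(probs, dtype=float)
    return float(-np.sum(probs * np.log(probs + 1e-12)))

# Option A: one certain outcome; Option B: two equiprobable outcomes.
p_a, r_a = [1.0], [1.0]
p_b, r_b = [0.5, 0.5], [0.5, 1.5]

ua, ub = expected_utility(p_a, r_a), expected_utility(p_b, r_b)

# Equal expected utility, so utility maximization alone cannot separate them;
# an entropy bonus (weight beta, an illustrative assumption) favors Option B.
beta = 0.5
score_a = ua + beta * entropy(p_a)
score_b = ub + beta * entropy(p_b)
```

Here `score_b` exceeds `score_a` purely because Option B "keeps options open", which is the qualitative signature the abstract says distinguishes surprise minimization from utility maximization.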
VIME: Variational Information Maximizing Exploration
Scalable and effective exploration remains a key challenge in reinforcement
learning (RL). While there are methods with optimality guarantees in the
setting of discrete state and action spaces, these methods cannot be applied in
high-dimensional deep RL scenarios. As such, most contemporary RL relies on
simple heuristics such as epsilon-greedy exploration or adding Gaussian noise
to the controls. This paper introduces Variational Information Maximizing
Exploration (VIME), an exploration strategy based on maximization of
information gain about the agent's belief of environment dynamics. We propose a
practical implementation, using variational inference in Bayesian neural
networks which efficiently handles continuous state and action spaces. VIME
modifies the MDP reward function, and can be applied with several different
underlying RL algorithms. We demonstrate that VIME achieves significantly
better performance compared to heuristic exploration methods across a variety
of continuous control tasks and algorithms, including tasks with very sparse
rewards.
Comment: Published in Advances in Neural Information Processing Systems 29
(NIPS), pages 1109-111
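The reward modification described above can be sketched as follows (a simplified illustration under assumed diagonal-Gaussian beliefs, not the paper's full Bayesian-neural-network implementation): the intrinsic bonus is the KL divergence between the agent's belief over dynamics parameters after and before an observed transition, i.e., the information gain, scaled by a hyperparameter `eta`.

```python
import numpy as np

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between diagonal Gaussians with means mu and variances var."""
    mu_q, var_q = np.asarray(mu_q, float), np.asarray(var_q, float)
    mu_p, var_p = np.asarray(mu_p, float), np.asarray(var_p, float)
    return float(0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    ))

def augmented_reward(r_ext, mu_post, var_post, mu_prior, var_prior, eta=0.1):
    """Extrinsic reward plus eta-weighted information gain about the dynamics.

    eta trades off exploitation against exploration; it is a free
    hyperparameter in this sketch.
    """
    bonus = kl_diag_gaussians(mu_post, var_post, mu_prior, var_prior)
    return r_ext + eta * bonus
```

A transition that leaves the belief unchanged contributes no bonus, while a surprising transition that shifts the posterior yields a strictly positive bonus, so the agent is driven toward transitions that are informative about the environment dynamics.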
The Dopaminergic Midbrain Encodes the Expected Certainty about Desired Outcomes
Dopamine plays a key role in learning; however, its exact function in decision making and choice remains unclear. Recently, we proposed a generic model based on active (Bayesian) inference wherein dopamine encodes the precision of beliefs about optimal policies. Put simply, dopamine discharges reflect the confidence that a chosen policy will lead to desired outcomes. We designed a novel task to test this hypothesis, where subjects played a "limited offer" game in a functional magnetic resonance imaging experiment. Subjects had to decide how long to wait for a high offer before accepting a low offer, with the risk of losing everything if they waited too long. Bayesian model comparison showed that behavior strongly supported active inference, based on surprise minimization, over classical utility maximization schemes. Furthermore, midbrain activity, encompassing dopamine projection neurons, was accurately predicted by trial-by-trial variations in model-based estimates of precision. Our findings demonstrate that human subjects infer both optimal policies and the precision of those inferences, and thus support the notion that humans perform hierarchical probabilistic Bayesian inference. In other words, subjects have to infer both what they should do as well as how confident they are in their choices, where confidence may be encoded by dopaminergic firing.
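The role of precision can be illustrated with a minimal sketch (an illustrative simplification, not the authors' fitted model): in active-inference formulations, the posterior over policies is a softmax of negative expected free energy `G`, with precision `gamma` acting as an inverse temperature. Higher `gamma` concentrates belief on the best policy, which is what "confidence that a chosen policy will lead to desired outcomes" means operationally.

```python
import numpy as np

def policy_posterior(G, gamma):
    """Softmax over -gamma * G, where G is expected free energy per policy.

    Lower G marks a better policy; gamma (precision) sharpens or flattens
    the distribution.
    """
    z = -gamma * np.asarray(G, dtype=float)
    z -= z.max()               # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

G = [1.0, 2.0, 3.0]            # policy 0 has the lowest expected free energy
low = policy_posterior(G, gamma=0.5)   # diffuse beliefs: low confidence
high = policy_posterior(G, gamma=5.0)  # sharp beliefs: high confidence
```

Trial-by-trial changes in `gamma` are the model-based quantity the abstract reports as predicting midbrain activity; this sketch only shows how precision reshapes the policy distribution.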
Shackle versus Savage: non-probabilistic alternatives to subjective probability theory in the 1950s
G.L.S. Shackle’s rejection of the probability tradition stemming from Knight's definition of uncertainty was a crucial episode in the development of modern decision theory. A set of methodological statements characterizing Shackle’s stance, long abandoned, especially after Savage’s Foundations, has been rediscovered and is at the basis of current non-expected utility theories, in particular of the non-additive probability approach to decision making. This paper examines the discussion between Shackle and his critics in the 1950s. Drawing on Shackle’s papers housed at Cambridge University Library as well as on printed matter, we show that some critics correctly understood two aspects of Shackle’s theory which are of the utmost importance in our view: the non-additive character of the theory and the possibility of interpreting Shackle’s ascendancy functions as a specific distortion of the weighting function of the decision maker. It is argued that Shackle neither completely understood the criticisms nor appropriately developed the suggestions put forward by scholars like Kenneth Arrow, Ward Edwards, and Nicholas Georgescu-Roegen. Had he succeeded in doing so, we contend, his theory might have been a more satisfactory alternative to Savage’s theory than it actually was.
Keywords: uncertainty, decision theory, non-additive measures
What Is A Number? Re-Thinking Derrida's Concept of Infinity
Iterability, the repetition which alters the idealization it reproduces, is the engine of deconstructive movement. The fact that all experience is transformative-dissimulative in its essence does not, however, mean that the momentum of change is the same for all situations. Derrida adapts Husserl's distinction between a bound and a free ideality to draw up a contrast between mechanical mathematical calculation, whose in-principle infinite enumerability is supposedly meaningless, empty of content, and therefore not in itself subject to alteration through contextual change, and idealities such as spoken or written language which are directly animated by a meaning-to-say and are thus immediately affected by context. Derrida associates the dangers of cultural stagnation, paralysis and irresponsibility with the emptiness of programmatic, mechanical, formulaic thinking. This paper endeavors to show that enumerative calculation is not context-independent in itself but is instead immediately infused with alteration, thereby making incoherent Derrida's claim to distinguish between a free and bound ideality. Along with the presumed formal basis of numeric infinitization, Derrida's non-dialectical distinction between forms of mechanical or programmatic thinking (the Same) and truly inventive experience (the absolute Other) loses its justification. In the place of a distinction between bound and free idealities is proposed a distinction between two poles of novelty: the first form of novel experience would be characterized by affectivities of unintelligibility, confusion and vacuity, and the second by affectivities of anticipatory continuity and intimacy.