Lotteries and justification
The lottery paradox shows that the following three individually highly plausible theses are jointly incompatible: (i) highly probable propositions are justifiably believable, (ii) justified believability is closed under conjunction introduction, (iii) known contradictions are not justifiably believable. This paper argues that a satisfactory solution to the lottery paradox must reject (i), since versions of the paradox can be generated without appeal to either (ii) or (iii), and proposes a new solution to the paradox in terms of a novel account of justified believability.
What else justification could be
According to a captivating picture, epistemic justification is essentially a matter of epistemic or evidential likelihood. While certain problems for this view are well known, it is motivated by a very natural thought—if justification can fall short of epistemic certainty, then what else could it possibly be? In this paper I shall develop an alternative way of thinking about epistemic justification. On this conception, the difference between justification and likelihood turns out to be akin to the more widely recognised difference between ceteris paribus laws and brute statistical generalisations. I go on to discuss, in light of this suggestion, issues such as classical and lottery-driven scepticism as well as the lottery and preface paradoxes
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
Reinforcement learning (RL) agents have traditionally been tasked with
maximizing the value function of a Markov decision process (MDP), either in
continuous settings, with fixed discount factor γ < 1, or in episodic
settings, with γ = 1. While this has proven effective for specific tasks
with well-defined objectives (e.g., games), it has never been established that
fixed discounting is suitable for general purpose use (e.g., as a model of
human preferences). This paper characterizes rationality in sequential decision
making using a set of seven axioms and arrives at a form of discounting that
generalizes traditional fixed discounting. In particular, our framework admits
a state-action dependent "discount" factor that is not constrained to be less
than 1, so long as there is eventual long run discounting. Although this
broadens the range of possible preference structures in continuous settings, we
show that there exists a unique "optimizing MDP" with fixed γ < 1 whose
optimal value function matches the true utility of the optimal policy, and we
quantify the difference between value and utility for suboptimal policies. Our
work can be seen as providing a normative justification for (a slight
generalization of) Martha White's RL task formalism (2017) and other recent
departures from the traditional RL, and is relevant to task specification in
RL, inverse RL and preference-based RL.
Comment: 8 pages + 1 page supplement. In proceedings of AAAI 2019. Slides,
poster and bibtex available at
https://silviupitis.com/#rethinking-the-discount-factor-in-reinforcement-learning-a-decision-theoretic-approac
Four arguments for denying that lottery beliefs are justified
A ‘lottery belief’ is a belief that a particular ticket has lost a large, fair lottery, based on nothing more than the odds against it winning. The lottery paradox brings out a tension between the idea that lottery beliefs are justified and the idea that one can always justifiably believe the deductive consequences of things that one justifiably believes – what is sometimes called the principle of closure. Many philosophers have treated the lottery paradox as an argument against the second idea – but I make a case here that it is the first idea that should be given up. As I shall show, there are a number of independent arguments for denying that lottery beliefs are justified.
Rationality and dynamic consistency under risk and uncertainty
For choice with deterministic consequences, the standard rationality hypothesis is ordinality - i.e., maximization of a weak preference ordering. For choice under risk (resp. uncertainty), preferences are assumed to be represented by the objectively (resp. subjectively) expected value of a von Neumann–Morgenstern utility function. For choice under risk, this implies a key independence axiom; under uncertainty, it implies some version of Savage's sure thing principle. This chapter investigates the extent to which ordinality, independence, and the sure thing principle can be derived from more fundamental axioms concerning behaviour in decision trees. Following Cubitt (1996), these principles include dynamic consistency, separability, and reduction of sequential choice, which can be derived in turn from one consequentialist hypothesis applied to continuation subtrees as well as entire decision trees. Examples of behavior violating these principles are also reviewed, as are possible explanations of why such violations are often observed in experiments.
Being in a Position to Know is the Norm of Assertion
This paper defends a new norm of assertion: Assert that p only if you are in a position to know that p. We test the norm by judging its performance in explaining three phenomena that appear jointly inexplicable at first: Moorean paradoxes, lottery propositions, and selfless assertions. The norm succeeds by tethering unassertability to unknowability while untethering belief from assertion. The PtK‐norm foregrounds the public nature of assertion as a practice that can be other‐regarding, allowing asserters to act in the best interests of their audience when psychological pressures would otherwise prevent them from communicating the knowable truth
Why people choose negative expected return assets - an empirical examination of a utility theoretic explanation
Using a theoretical extension of the Friedman and Savage (1948) utility function developed in Bhattacharyya (2003), we predict that for financial assets with negative expected returns, expected return will be a declining and convex function of skewness. Using a sample of U.S. state lottery games, we find that our theoretical conclusions are supported by the data. Our results have external validity as they also hold for an alternative and more aggregated sample of lottery game data.