
    Lotteries and justification

    The lottery paradox shows that the following three individually highly plausible theses are jointly incompatible: (i) highly probable propositions are justifiably believable, (ii) justified believability is closed under conjunction introduction, (iii) known contradictions are not justifiably believable. This paper argues that a satisfactory solution to the lottery paradox must reject (i) as versions of the paradox can be generated without appeal to either (ii) or (iii) and proposes a new solution to the paradox in terms of a novel account of justified believability
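The tension between theses (i) to (iii) can be made concrete with a little arithmetic. The sketch below is only an illustration, not taken from the paper: it assumes a fair lottery with exactly one winning ticket and treats "highly probable" as "above a fixed threshold". The function name `lottery_paradox` and the threshold value are hypothetical choices.

```python
from fractions import Fraction


def lottery_paradox(n_tickets, threshold):
    """Compare single-ticket and conjunction probabilities.

    Assumption (not from the source): a fair lottery with exactly one
    winning ticket, and 'highly probable' read as 'probability > threshold'.
    """
    # Probability that any one given ticket loses.
    p_single_loses = Fraction(n_tickets - 1, n_tickets)
    # Exactly one ticket wins, so "every ticket loses" is impossible.
    p_all_lose = Fraction(0)
    return {
        "p_single_loses": p_single_loses,
        "each_above_threshold": p_single_loses > threshold,
        "p_all_lose": p_all_lose,
        "conjunction_above_threshold": p_all_lose > threshold,
    }


result = lottery_paradox(1000, Fraction(99, 100))
```

With 1000 tickets and a threshold of 0.99, each claim "ticket k loses" clears the threshold, yet their conjunction has probability zero. Thesis (i) licenses each conjunct and thesis (ii) licenses the conjunction, while the setup of the lottery tells us that conjunction is false, which is where thesis (iii) comes in.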

    Lotteries, Possibility and Skepticism

    What else justification could be

    According to a captivating picture, epistemic justification is essentially a matter of epistemic or evidential likelihood. While certain problems for this view are well known, it is motivated by a very natural thought—if justification can fall short of epistemic certainty, then what else could it possibly be? In this paper I shall develop an alternative way of thinking about epistemic justification. On this conception, the difference between justification and likelihood turns out to be akin to the more widely recognised difference between ceteris paribus laws and brute statistical generalisations. I go on to discuss, in light of this suggestion, issues such as classical and lottery-driven scepticism as well as the lottery and preface paradoxes

    Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach

    Reinforcement learning (RL) agents have traditionally been tasked with maximizing the value function of a Markov decision process (MDP), either in continuous settings, with fixed discount factor $\gamma < 1$, or in episodic settings, with $\gamma = 1$. While this has proven effective for specific tasks with well-defined objectives (e.g., games), it has never been established that fixed discounting is suitable for general purpose use (e.g., as a model of human preferences). This paper characterizes rationality in sequential decision making using a set of seven axioms and arrives at a form of discounting that generalizes traditional fixed discounting. In particular, our framework admits a state-action dependent "discount" factor that is not constrained to be less than 1, so long as there is eventual long run discounting. Although this broadens the range of possible preference structures in continuous settings, we show that there exists a unique "optimizing MDP" with fixed $\gamma < 1$ whose optimal value function matches the true utility of the optimal policy, and we quantify the difference between value and utility for suboptimal policies. Our work can be seen as providing a normative justification for (a slight generalization of) Martha White's RL task formalism (2017) and other recent departures from traditional RL, and is relevant to task specification in RL, inverse RL and preference-based RL.
    Comment: 8 pages + 1 page supplement. In proceedings of AAAI 2019. Slides, poster and bibtex available at https://silviupitis.com/#rethinking-the-discount-factor-in-reinforcement-learning-a-decision-theoretic-approac
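To make the idea of a state-action dependent "discount" concrete, here is a minimal sketch. It is one plausible reading of the abstract, not the paper's formalism: each step's reward is weighted by the product of the discount factors of all earlier steps. The function name `generalized_return` and the sample numbers are hypothetical.

```python
def generalized_return(rewards, discounts):
    """Return of a finite trajectory under per-step discount factors.

    Assumption (not from the source): the reward at step t is weighted by
    the product of the discount factors attached to steps 0..t-1, where
    discounts[k] stands for the factor of the state-action pair at step k.
    """
    total = 0.0
    weight = 1.0  # running product of earlier discount factors
    for reward, discount in zip(rewards, discounts):
        total += weight * reward
        weight *= discount
    return total


# Fixed discounting is the special case where every factor is the same.
fixed = generalized_return([1.0, 1.0, 1.0], [0.5, 0.5, 0.5])

# A factor above 1 at one step, followed by factors below 1.
mixed = generalized_return([1.0, 1.0, 1.0], [1.2, 0.5, 0.5])
```

In the second call a single factor exceeds 1, yet the running weight still shrinks over later steps, which loosely mirrors the abstract's condition of eventual long run discounting.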

    Four arguments for denying that lottery beliefs are justified

    A ‘lottery belief’ is a belief that a particular ticket has lost a large, fair lottery, based on nothing more than the odds against it winning. The lottery paradox brings out a tension between the idea that lottery beliefs are justified and the idea that one can always justifiably believe the deductive consequences of things that one justifiably believes – what is sometimes called the principle of closure. Many philosophers have treated the lottery paradox as an argument against the second idea – but I make a case here that it is the first idea that should be given up. As I shall show, there are a number of independent arguments for denying that lottery beliefs are justified

    Rationality and dynamic consistency under risk and uncertainty

    For choice with deterministic consequences, the standard rationality hypothesis is ordinality - i.e., maximization of a weak preference ordering. For choice under risk (resp. uncertainty), preferences are assumed to be represented by the objectively (resp. subjectively) expected value of a von Neumann–Morgenstern utility function. For choice under risk, this implies a key independence axiom; under uncertainty, it implies some version of Savage's sure thing principle. This chapter investigates the extent to which ordinality, independence, and the sure thing principle can be derived from more fundamental axioms concerning behaviour in decision trees. Following Cubitt (1996), these principles include dynamic consistency, separability, and reduction of sequential choice, which can be derived in turn from one consequentialist hypothesis applied to continuation subtrees as well as entire decision trees. Examples of behavior violating these principles are also reviewed, as are possible explanations of why such violations are often observed in experiments

    Being in a Position to Know is the Norm of Assertion

    This paper defends a new norm of assertion: Assert that p only if you are in a position to know that p. We test the norm by judging its performance in explaining three phenomena that appear jointly inexplicable at first: Moorean paradoxes, lottery propositions, and selfless assertions. The norm succeeds by tethering unassertability to unknowability while untethering belief from assertion. The PtK‐norm foregrounds the public nature of assertion as a practice that can be other‐regarding, allowing asserters to act in the best interests of their audience when psychological pressures would otherwise prevent them from communicating the knowable truth

    Why people choose negative expected return assets - an empirical examination of a utility theoretic explanation

    Using a theoretical extension of the Friedman and Savage (1948) utility function developed in Bhattacharyya (2003), we predict that for financial assets with negative expected returns, expected return will be a declining and convex function of skewness. Using a sample of U.S. state lottery games, we find that our theoretical conclusions are supported by the data. Our results have external validity as they also hold for an alternative and more aggregated sample of lottery game data.