1,720 research outputs found

    Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach

    Reinforcement learning (RL) agents have traditionally been tasked with maximizing the value function of a Markov decision process (MDP), either in continuous settings, with fixed discount factor γ < 1, or in episodic settings, with γ = 1. While this has proven effective for specific tasks with well-defined objectives (e.g., games), it has never been established that fixed discounting is suitable for general purpose use (e.g., as a model of human preferences). This paper characterizes rationality in sequential decision making using a set of seven axioms and arrives at a form of discounting that generalizes traditional fixed discounting. In particular, our framework admits a state-action dependent "discount" factor that is not constrained to be less than 1, so long as there is eventual long run discounting. Although this broadens the range of possible preference structures in continuous settings, we show that there exists a unique "optimizing MDP" with fixed γ < 1 whose optimal value function matches the true utility of the optimal policy, and we quantify the difference between value and utility for suboptimal policies. Our work can be seen as providing a normative justification for (a slight generalization of) Martha White's RL task formalism (2017) and other recent departures from traditional RL, and is relevant to task specification in RL, inverse RL and preference-based RL.
    Comment: 8 pages + 1 page supplement. In proceedings of AAAI 2019. Slides, poster and bibtex available at https://silviupitis.com/#rethinking-the-discount-factor-in-reinforcement-learning-a-decision-theoretic-approac

    Status and Incentives

    The paper introduces status as reflecting an agent's claim to recognition in her work. It is a scarce resource: increasing an agent's status requires that another agent's status be decreased. Higher status agents are more willing to exert effort in exchange for money; better-paid agents would exert a higher effort in exchange for an improved status. Results are consistent with actual management practices: (i) egalitarianism is desirable in a static context; (ii) in a long-term work relationship, juniors' compensation is delayed; past performance is rewarded by pay increases along with an improved status within the organization's hierarchy.
    Keywords: repeated moral hazard, internal labor markets, social status

    The preponderance of decision in a new managerial function of information – decision

    Decision takes precedence over information in a new central function of management defined as information-decision. We believe that a compromise sequence of functions (prognosis of the product or service, organization, information-decision, stimulation, and control) responds better to the new managerial conditions. Deadlocks frequently occur in modelling decisions, especially owing to a lack of information (quality of the data, equations, degree of accuracy, etc.), but we believe that opting for a better decision, sometimes even instead of better information, ultimately yields an optimal solution in the short term.
    Keywords: managerial information and decision, mathematical hope, prudence, moderate, superoptimistic, equilibrium and regrets rule, decision trees

    Ambiguous correlation

    Many decisions are made in environments where outcomes are determined by the realization of multiple random events. A decision maker may be uncertain how these events are related. We identify and experimentally substantiate behavior that intuitively reflects a lack of confidence in their joint distribution. Our findings suggest a dimension of ambiguity which is different from that in the classical distinction between risk and "Knightian uncertainty".

    Intertemporal substitution and recursive smooth ambiguity preferences

    In this paper, we establish an axiomatically founded generalized recursive smooth ambiguity model that allows for a separation among intertemporal substitution, risk aversion, and ambiguity aversion. We axiomatize this model using two approaches: the second-order act approach à la Klibanoff, Marinacci, and Mukerji (2005) and the two-stage randomization approach à la Seo (2009). We characterize risk attitude and ambiguity attitude within these two approaches. We then discuss our model's application in asset pricing. Our recursive preference model nests some popular models in the literature as special cases.
    Keywords: ambiguity, ambiguity aversion, risk aversion, intertemporal substitution, model uncertainty, recursive utility, dynamic consistency

    Quantum Probabilities as Behavioral Probabilities

    We demonstrate that behavioral probabilities of human decision makers share many common features with quantum probabilities. This does not imply that humans are quantum objects, but just shows that the mathematics of quantum theory is applicable to the description of human decision making. The applicability of quantum rules for describing decision making is connected with the nontrivial process of making decisions in the case of composite prospects under uncertainty. Such a process involves deliberations of a decision maker when making a choice. In addition to the evaluation of the utilities of considered prospects, real decision makers also appreciate their respective attractiveness. Therefore, human choice is not based solely on the utility of prospects, but includes the necessity of resolving the utility-attraction duality. In order to justify that human consciousness really functions similarly to the rules of quantum theory, we develop an approach defining human behavioral probabilities as the probabilities determined by quantum rules. We show that quantum behavioral probabilities of humans not merely explain qualitatively how human decisions are made, but also predict quantitative values of the behavioral probabilities. Analyzing a large set of empirical data, we find good quantitative agreement between theoretical predictions and observed experimental data.
    Comment: LaTeX file, 32 pages

    Personality Psychology and Economics

    This paper explores the power of personality traits both as predictors and as causes of academic and economic success, health, and criminal activity. Measured personality is interpreted as a construct derived from an economic model of preferences, constraints, and information. Evidence is reviewed about the "situational specificity" of personality traits and preferences. An extreme version of the situationist view claims that there are no stable personality traits or preference parameters that persons carry across different situations. Those who hold this view claim that personality psychology has little relevance for economics. The biological and evolutionary origins of personality traits are explored. Personality measurement systems and relationships among the measures used by psychologists are examined. The predictive power of personality measures is compared with the predictive power of measures of cognition captured by IQ and achievement tests. For many outcomes, personality measures are just as predictive as cognitive measures, even after controlling for family background and cognition. Moreover, standard measures of cognition are heavily influenced by personality traits and incentives. Measured personality traits are positively correlated over the life cycle. However, they are not fixed and can be altered by experience and investment. Intervention studies, along with studies in biology and neuroscience, establish a causal basis for the observed effect of personality traits on economic and social outcomes. Personality traits are more malleable over the life cycle compared to cognition, which becomes highly rank stable around age 10. Interventions that change personality are promising avenues for addressing poverty and disadvantage.
    Keywords: personality, behavioral economics, cognitive traits, wages, economic success, human development, person-situation debate