    Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

    Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
    Comment: updated version, incorporating conference feedback

    The motivation and pleasure dimension of negative symptoms: neural substrates and behavioral outputs.

    A range of emotional and motivational impairments have long been clinically documented in people with schizophrenia, and there has been a resurgence of interest in understanding the psychological and neural mechanisms of the so-called "negative symptoms" in schizophrenia, given their lack of treatment responsiveness and their role in constraining function and life satisfaction in this illness. Negative symptoms comprise two domains, with the first covering diminished motivation and pleasure across a range of life domains and the second covering diminished verbal and non-verbal expression and communicative output. In this review, we focus on four aspects of the motivation/pleasure domain, providing a brief review of the behavioral and neural underpinnings of this domain. First, we cover liking or in-the-moment pleasure: immediate responses to pleasurable stimuli. Second, we cover anticipatory pleasure or wanting, which involves prediction of a forthcoming enjoyable outcome (reward) and feeling pleasure in anticipation of that outcome. Third, we address motivation, which comprises effort computation (figuring out how much effort is needed to achieve a desired outcome), planning, and behavioral response. Finally, we cover the maintenance of emotional states and behavioral responses. Throughout, we consider the behavioral manifestations and brain representations of these four aspects of motivation/pleasure deficits in schizophrenia. We conclude with directions for future research as well as implications for treatment.

    Identity, Dignity and Taboos: Beliefs as Assets

    We analyze social and economic phenomena involving beliefs which people value and invest in, for affective or functional reasons. Individuals are at times uncertain about their own deep values and infer them from their past choices, which then come to define who they are. Identity investments increase when information is scarce or when a greater endowment of some asset (wealth, career, family, culture) raises the stakes on viewing it as valuable (escalating commitments). Taboos against transactions or the mere contemplation of tradeoffs arise to protect fragile beliefs about the priceless value of certain assets (life, freedom, love, faith) or things one would never do. Whether such behaviors are welfare-enhancing or reducing depends on whether beliefs are sought for a functional value (sense of direction, self-discipline) or for mental consumption motives (self-esteem, anticipatory feelings). Escalating commitments can thus lead to a hedonic treadmill, and competing identities cause dysfunctional failures to invest in high-return activities (education, adapting to globalization, assimilation), or even the destruction of productive assets. In social interactions, norm violations elicit a forceful response (exclusion, harassment) when they threaten a strongly held identity, but further erode morale when it was initially weak. Concerns for pride, dignity or wishful thinking lead to the inefficient breakdown of Coasian bargaining even under symmetric information, as partners seek to self-enhance and shift blame by turning down insultingly low offers.
    Keywords: identity, self-serving beliefs, self-image, memory, wishful thinking, anticipatory utility, self control, hedonic treadmill, inefficient bargaining, taboos, religion

    Imaginary relish and exquisite torture: The elaborated intrusion theory of desire

    The authors argue that human desire involves conscious cognition that has strong affective connotation and is potentially involved in the determination of appetitive behavior rather than being epiphenomenal to it. Intrusive thoughts about appetitive targets are triggered automatically by external or physiological cues and by cognitive associates. When intrusions elicit significant pleasure or relief, cognitive elaboration usually ensues. Elaboration competes with concurrent cognitive tasks through retrieval of target-related information and its retention in working memory. Sensory images are especially important products of intrusion and elaboration because they simulate the sensory and emotional qualities of target acquisition. Desire images are momentarily rewarding but amplify awareness of somatic and emotional deficits. Effects of desires on behavior are moderated by competing incentives, target availability, and skills. The theory provides a coherent account of existing data and suggests new directions for research and treatment.

    Optimal Expectations

    This paper introduces a tractable, structural model of subjective beliefs. Since agents that plan for the future care about expected future utility flows, current felicity can be increased by believing that better outcomes are more likely. On the other hand, expectations that are biased towards optimism worsen decision making, leading to poorer realized outcomes on average. Optimal expectations balance these forces by maximizing the total well-being of an agent over time. We apply our framework of optimal expectations to three different economic settings. First, in a portfolio choice problem, agents overestimate the return of their investment and underdiversify. In general equilibrium, agents' prior beliefs are endogenously heterogeneous, leading to gambling. Second, in a consumption-saving problem with stochastic income, agents are both overconfident and overoptimistic, and consume more than implied by rational beliefs early in life. Third, in choosing when to undertake a single task with an uncertain cost, agents exhibit several features of procrastination, including regret, intertemporal preference reversal, and a greater readiness to accept commitment.
    Keywords: expectations formation, beliefs, overconfidence

    Overlapping neural systems represent cognitive effort and reward anticipation

    Anticipating a potential benefit and how difficult it will be to obtain it are valuable skills in a constantly changing environment. In the human brain, the anticipation of reward is encoded by the Anterior Cingulate Cortex (ACC) and Striatum. Naturally, potential rewards have an incentive quality, resulting in a motivational effect improving performance. Recently it has been proposed that an upcoming task requiring effort induces an anticipation mechanism similar to that induced by reward, relying on the same cortico-limbic network. However, this overlapping anticipatory activity for reward and effort has only been investigated in a perceptual task. Whether this generalizes to high-level cognitive tasks remains to be investigated. To this end, an fMRI experiment was designed to investigate anticipation of reward and effort in cognitive tasks. A mental arithmetic task was implemented, manipulating effort (difficulty), reward, and delay in reward delivery to control for temporal confounds. The goal was to test for the motivational effect induced by the expectation of bigger reward and higher effort. The results showed that the activation elicited by an upcoming difficult task overlapped with higher reward prospect in the ACC and in the striatum, thus highlighting a pivotal role of this circuit in sustaining motivated behavior.

    Ideology

    I develop a model of ideologies as collectively sustained (yet individually rational) distortions in beliefs concerning the proper scope of governments versus markets. In processing and interpreting signals of the efficacy of public and market provision of education, health insurance, pensions, etc., individuals optimally trade off the value of remaining hopeful about their future prospects (or their children’s) versus the costs of misinformed decisions. Because these future outcomes also depend on whether other citizens respond to unpleasant facts with realism or denial, endogenous social cognitions emerge. Thus, an equilibrium in which people acknowledge the limitations of interventionism coexists with one in which they remain obstinately blind to them, embracing a statist ideology and voting for an excessively large government. Conversely, an equilibrium associated with appropriate public responses to market failures coexists with one dominated by a laissez-faire ideology and blind faith in the invisible hand. With public-sector capital, this interplay of beliefs and institutions leads to history-dependent dynamics. The model also explains why societies find it desirable to set up constitutional protections for dissenting views, even when ex-post everyone would prefer to ignore unwelcome news.
    Keywords: ideology, statism, laissez-faire, cognitive dissonance, wishful thinking, institutions, political economy, psychology