125 research outputs found

    Better safe than sorry: Risky function exploitation through safe optimization

    Exploration-exploitation of functions, that is, learning and optimizing a mapping between inputs and expected outputs, is ubiquitous in many real-world situations. These situations sometimes require us to avoid certain outcomes at all costs, for example because they are poisonous, harmful, or otherwise dangerous. We test participants' behavior in scenarios in which they have to find the optimum of a function while at the same time avoiding outputs below a certain threshold. In two experiments, we find that Safe-Optimization, a Gaussian Process-based exploration-exploitation algorithm, describes participants' behavior well and that participants seem to care firstly whether a point is safe and then try to pick the optimal point from all such safe points. This means that their trade-off between exploration and exploitation can be seen as an intelligent, approximate, and homeostasis-driven strategy. Comment: 6 pages, submitted to Cognitive Science Conference.
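
The two-stage rule described in this abstract (first check whether a point is safe, then pick the best of the safe points) can be sketched roughly as follows. This is only an illustration of that rule, not the Safe-Optimization algorithm itself and not the authors' model: the `Candidate` type, the `beta` confidence multiplier, and the example numbers are all hypothetical, and the posterior means and standard deviations are assumed to come from some Gaussian Process fit that is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    x: float     # input location
    mean: float  # assumed posterior mean of the output at x
    sd: float    # assumed posterior standard deviation at x


def safe_then_optimal(candidates: List[Candidate],
                      threshold: float,
                      beta: float = 2.0) -> Optional[Candidate]:
    """Keep points whose lower confidence bound clears the threshold,
    then return the one with the highest expected output."""
    safe = [c for c in candidates if c.mean - beta * c.sd >= threshold]
    if not safe:
        return None  # no point is believed to be safe
    return max(safe, key=lambda c: c.mean)


# Hypothetical posterior summaries at three inputs.
candidates = [
    Candidate(x=0.0, mean=5.0, sd=0.5),  # lower bound 4.0: safe
    Candidate(x=1.0, mean=9.0, sd=4.0),  # lower bound 1.0: unsafe
    Candidate(x=2.0, mean=6.0, sd=1.0),  # lower bound 4.0: safe
]
choice = safe_then_optimal(candidates, threshold=3.0)
```

In this toy case the point with the highest mean is rejected because its lower bound falls under the threshold, so the rule settles on the best of the remaining safe points.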

    Serotonin, Inhibition, and Negative Mood

    Pavlovian predictions of future aversive outcomes lead to behavioral inhibition, suppression, and withdrawal. There is considerable evidence for the involvement of serotonin in both the learning of these predictions and the inhibitory consequences that ensue, although less for a causal relationship between the two. In the context of a highly simplified model of chains of affectively charged thoughts, we interpret the combined effects of serotonin in terms of pruning a tree of possible decisions (i.e., eliminating those choices that have low or negative expected outcomes). We show how a drop in behavioral inhibition, putatively resulting from an experimentally or psychiatrically influenced drop in serotonin, could result in unexpectedly large negative prediction errors and a significant aversive shift in reinforcement statistics. We suggest an interpretation of this finding that helps dissolve an apparent contradiction: inhibition of serotonin reuptake is the first-line treatment of depression, yet serotonin itself is most strongly linked with aversive rather than appetitive outcomes and predictions.

    Disentangling the Roles of Approach, Activation and Valence in Instrumental and Pavlovian Responding

    Hard-wired, Pavlovian, responses elicited by predictions of rewards and punishments exert significant benevolent and malevolent influences over instrumentally-appropriate actions. These influences come in two main groups, defined along anatomical, pharmacological, behavioural and functional lines. Investigations of the influences have so far concentrated on the groups as a whole; here we take the critical step of looking inside each group, using a detailed reinforcement learning model to distinguish effects to do with value, specific actions, and general activation or inhibition. We show a high degree of sophistication in Pavlovian influences, with appetitive Pavlovian stimuli specifically promoting approach and inhibiting withdrawal, and aversive Pavlovian stimuli promoting withdrawal and inhibiting approach. These influences account for differences in the instrumental performance of approach and withdrawal behaviours. Finally, although losses are as informative as gains, we find that subjects neglect losses in their instrumental learning. Our findings argue for a view of the Pavlovian system as a constraint or prior, facilitating learning by alleviating computational costs that come with increased flexibility

    Action Dominates Valence in Anticipatory Representations in the Human Striatum and Dopaminergic Midbrain

    The acquisition of reward and the avoidance of punishment could logically be contingent on either emitting or withholding particular actions. However, the separate pathways in the striatum for go and no-go appear to violate this independence, instead coupling affect and effect. Respect for this interdependence has biased many studies of reward and punishment, so potential action-outcome valence interactions during anticipatory phases remain unexplored. In a functional magnetic resonance imaging study with healthy human volunteers, we manipulated subjects' requirement to emit or withhold an action independent from subsequent receipt of reward or avoidance of punishment. During anticipation, in the striatum and a lateral region within the substantia nigra/ventral tegmental area (SN/VTA), action representations dominated over valence representations. Moreover, we did not observe any representation associated with different state values through accumulation of outcomes, challenging a conventional and dominant association between these areas and state value representations. In contrast, a more medial sector of the SN/VTA responded preferentially to valence, with opposite signs depending on whether action was anticipated to be emitted or withheld. This dominant influence of action requires an enriched notion of opponency between reward and punishment.

    Susceptibility to interference between Pavlovian and instrumental control predisposes risky alcohol use developmental trajectory from ages 18 to 24

    Pavlovian cues can influence ongoing instrumental behaviour via Pavlovian-to-instrumental transfer (PIT) processes. While appetitive Pavlovian cues tend to promote instrumental approach, they are detrimental when avoidance behaviour is required, and vice versa for aversive cues. We recently reported that susceptibility to interference between Pavlovian and instrumental control assessed via a PIT task was associated with risky alcohol use at age 18. We now investigated whether such susceptibility also predicts drinking trajectories until age 24, based on AUDIT (Alcohol Use Disorders Identification Test) consumption and binge drinking (gramme alcohol/drinking occasion) scores. The interference PIT effect, assessed at ages 18 and 21 during fMRI, was characterized by increased error rates (ER) and enhanced neural responses in the ventral striatum (VS), the lateral and dorsomedial prefrontal cortices (dmPFC) during conflict, that is, when an instrumental approach was required in the presence of an aversive Pavlovian cue or vice versa. We found that a stronger VS response during conflict at age 18 was associated with a higher starting point of both drinking trajectories but predicted a decrease in binge drinking. At age 21, high ER and enhanced neural responses in the dmPFC were associated with increasing AUDIT-C scores over the next 3 years until age 24. Overall, susceptibility to interference between Pavlovian and instrumental control might be viewed as a predisposing mechanism towards hazardous alcohol use during young adulthood, and the identified high-risk group may profit from targeted interventions

    Low predictive power of clinical features for relapse prediction after antidepressant discontinuation in a naturalistic setting

    The risk of relapse after antidepressant medication (ADM) discontinuation is high. Predictors of relapse could guide clinical decision-making, but are yet to be established. We assessed demographic and clinical variables in a longitudinal observational study before antidepressant discontinuation. State-dependent variables were re-assessed either after discontinuation or before discontinuation after a waiting period. Relapse was assessed during 6 months after discontinuation. We applied logistic general linear models in combination with least absolute shrinkage and selection operator and elastic nets to avoid overfitting in order to identify predictors of relapse and estimated their generalisability using cross-validation. The final sample included 104 patients (age: 34.86 (11.1), 77% female) and 57 healthy controls (age: 34.12 (10.6), 70% female). 36% of the patients experienced a relapse. Treatment by a general practitioner increased the risk of relapse. Although within-sample statistical analyses suggested reasonable sensitivity and specificity, out-of-sample prediction of relapse was at chance level. Residual symptoms increased with discontinuation, but did not relate to relapse. Demographic and standard clinical variables appear to carry little predictive power and therefore are of limited use for patients and clinicians in guiding clinical decision-making

    Processing speed enhances model-based over model-free reinforcement learning in the presence of high working memory functioning

    Theories of decision-making and its neural substrates have long assumed the existence of two distinct and competing valuation systems, variously described as goal-directed versus habitual, or, more recently and based on statistical arguments, as model-free versus model-based reinforcement-learning. Though both have been shown to control choices, the cognitive abilities associated with these systems are under ongoing investigation. Here we examine the link to cognitive abilities, and find that individual differences in processing speed covary with a shift from model-free to model-based choice control in the presence of above-average working memory function. This suggests shared cognitive and neural processes; provides a bridge between literatures on intelligence and valuation; and may guide the development of process models of different valuation components. Furthermore, it provides a rationale for individual differences in the tendency to deploy valuation systems, which may be important for understanding the manifold neuropsychiatric diseases associated with malfunctions of valuation

    Go and No-go Learning in Reward and Punishment: Interactions between Affect and Effect

    Decision-making invokes two fundamental axes of control: affect or valence, spanning reward and punishment, and effect or action, spanning invigoration and inhibition. We studied the acquisition of instrumental responding in healthy human volunteers in a task in which we orthogonalized action requirements and outcome valence. Subjects were much more successful in learning active choices in rewarded conditions, and passive choices in punished conditions. Using computational reinforcement-learning models, we teased apart contributions from putatively instrumental and Pavlovian components in the generation of the observed asymmetry during learning. Moreover, using model-based fMRI, we showed that BOLD signals in striatum and substantia nigra/ventral tegmental area (SN/VTA) correlated with instrumentally learnt action values, but with opposite signs for go and no-go choices. Finally, we showed that successful instrumental learning depends on engagement of bilateral inferior frontal gyrus. Our behavioral and computational data showed that instrumental learning is contingent on overcoming inherent and plastic Pavlovian biases, while our neuronal data showed this learning is linked to unique patterns of brain activity in regions implicated in action and inhibition respectively
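
One way to picture how an instrumental and a Pavlovian component might combine in a go/no-go choice is the sketch below. It is a minimal illustration under stated assumptions, not the model reported in the paper: the function names, the `go_bias` and `pavlovian_weight` parameters, and the simple delta-rule update are all hypothetical choices made for this example.

```python
import math


def go_probability(q_go: float, q_nogo: float, state_value: float,
                   go_bias: float = 0.0,
                   pavlovian_weight: float = 0.0) -> float:
    """Probability of emitting 'go' under a two-option softmax.

    The go weight adds a fixed bias and a Pavlovian term that scales
    with the learned state value, so rewarded states push toward
    action and punished states push toward inhibition.
    """
    w_go = q_go + go_bias + pavlovian_weight * state_value
    w_nogo = q_nogo
    # A two-option softmax reduces to a logistic of the weight difference.
    return 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))


def update_q(q: float, reward: float, learning_rate: float = 0.1) -> float:
    """Delta-rule update of an instrumental action value."""
    return q + learning_rate * (reward - q)


# With equal instrumental values, only the Pavlovian term shifts choice.
p_neutral = go_probability(0.0, 0.0, state_value=0.0, pavlovian_weight=0.5)
p_reward = go_probability(0.0, 0.0, state_value=1.0, pavlovian_weight=0.5)
p_punish = go_probability(0.0, 0.0, state_value=-1.0, pavlovian_weight=0.5)
```

Under these assumptions, a positive state value raises the probability of going and a negative one lowers it, which is the direction of the asymmetry the abstract describes between rewarded and punished conditions.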