140 research outputs found

    Reinforcement learning or active inference?

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.

    Organizational factors and depression management in community-based primary care settings

    Background: Evidence-based quality improvement models for depression have not been fully implemented in routine primary care settings. To date, few studies have examined the organizational factors associated with depression management in real-world primary care practice. To successfully implement quality improvement models for depression, there must be a better understanding of the relevant organizational structure and processes of the primary care setting. The objective of this study is to describe these organizational features of routine primary care practice, and the organization of depression care, using survey questions derived from an evidence-based framework. Methods: We used this framework to implement a survey of 27 practices comprising 49 unique offices within a large primary care practice network in western Pennsylvania. Survey questions addressed practice structure (e.g., human resources, leadership, information technology (IT) infrastructure, and external incentives) and process features (e.g., staff performance, degree of integrated depression care, and IT performance). Results: The results of our survey demonstrated substantial variation across the practice network of organizational factors pertinent to implementation of evidence-based depression management. Notably, quality improvement capability and IT infrastructure were widespread, but specific application to depression care differed between practices, as did coordination and communication tasks surrounding depression treatment. Conclusions: The primary care practices in the network that we surveyed are at differing stages in their organization and implementation of evidence-based depression management. Practical surveys such as this may serve to better direct implementation of these quality improvement strategies for depression by improving understanding of the organizational barriers and facilitators that exist within both practices and practice networks. In addition, survey information can inform efforts of individual primary care practices in customizing intervention strategies to improve depression management.

    Part-time hospitalisation and stigma experiences: a study in contemporary psychiatric hospitals

    Background: Because numerous studies have revealed the negative consequences of stigmatisation, this study explores the determinants of stigma experiences. In particular, it examines whether or not part-time hospitalisation in contemporary psychiatric hospitals is associated with fewer stigma experiences than full-time hospitalisation. Methods: Survey data on 378 clients of 42 wards from 8 psychiatric hospitals are used to compare full-time clients, part-time clients and clients receiving part-time care as aftercare on three dimensions of stigma experiences, while controlling for symptoms, diagnosis and clients' background characteristics. Results: The results reveal that part-time clients without previous full-time hospitalisation report less social rejection than clients who receive full-time hospitalisation. In contrast, clients receiving part-time treatment as aftercare do not differ significantly from full-time clients concerning social rejection. No significant results for the other stigma dimensions were found. Conclusion: Concerning social rejection, immediate part-time hospitalisation could be recommended as a means of destigmatisation for clients of contemporary psychiatric hospitals.

    An Imperfect Dopaminergic Error Signal Can Drive Temporal-Difference Learning

    An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavior adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards.
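
For orientation, the classical discrete-time TD algorithm that this abstract uses as its baseline can be sketched as tabular TD(0). This is the textbook algorithm only, not the spiking actor-critic model described above. The random-walk chain, the function name `td0_chain`, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are assumptions chosen for illustration, with a single sparse positive reward at the right-hand exit.

```python
import random


def td0_chain(n_states=5, episodes=3000, alpha=0.05, gamma=0.9, seed=0):
    """Tabular TD(0) value estimation on a hypothetical random-walk chain.

    The agent starts in the middle state and steps left or right with equal
    probability. Leaving the chain on the right yields reward 1; leaving on
    the left yields reward 0. Both exits end the episode.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states
    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                # Left exit: terminal, no reward.
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                # Right exit: terminal, sparse positive reward.
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD error: how much better or worse the outcome was than predicted.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for index, value in enumerate(td0_chain()):
        print(f"state {index}: estimated value {value:.3f}")
```

The `td_error` term is the quantity that the phasic dopaminergic signal is hypothesised to resemble; states closer to the rewarded exit should end up with higher estimated values.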

    Model-based parametric study of frontostriatal abnormalities in schizophrenia patients

    Background: Several studies have suggested that prefrontal cortex (PFC) activity and dopamine (DA) release in the striatum are inversely related. One would attribute this relationship primarily to the circuitry comprising the glutamatergic projection from the PFC to the striatum and the GABAergic projection from the striatum to the midbrain DA nucleus. However, this circuitry has not yet been characterized satisfactorily, so no quantitative analysis has been made of the activities of the PFC and the striatum or of DA release in the striatum. Methods: In this study, a system dynamics model of the corticostriatal system with dopaminergic innervations is constructed to describe the relationships between the activities of the PFC and the striatum and the DA release in the striatum. By feeding published receptor imaging data from schizophrenia patients and healthy subjects into this model, this article analyzes the effects of striatal D2 receptor activation on the balance of activity and neurotransmission in the frontostriatal system of schizophrenic patients in comparison with healthy controls. Results: The model predicts that the suppressive effect of D2 receptors at the terminals of the glutamatergic afferents to the striatum from the PFC enhances the hypofrontality-induced elevation of striatal DA release by at most 83%. The occupancy-based estimate of the 'optimum' D2 receptor occupancy by antipsychotic drugs is 52%. This study further predicts that patients with lower PFC activity tend to show greater improvement of positive symptoms following antipsychotic medication. Conclusion: This model-based parametric study should be useful for system-level analysis of brains with psychiatric diseases. It should be able to make reliable predictions of clinical outcome once sufficient data are available.

    Deconstructing Insight: EEG Correlates of Insightful Problem Solving

    Background: The phenomenon of cognitive insight lies at the core of numerous discoveries. Behavioral research indicates four salient features of insightful problem solving: (i) mental impasse, followed by (ii) restructuring of the problem representation, which leads to (iii) a deeper understanding of the problem, and finally culminates in (iv) an “Aha!” feeling of suddenness and obviousness of the solution. However, until now no efforts have been made to investigate the neural mechanisms of these constituent features of insight in a unified framework. Methodology/Principal Findings: In an electroencephalographic study using verbal remote associate problems, we identified neural correlates of these four features of insightful problem solving. Hints were provided for unsolved problems or after mental impasse. Subjective ratings of the restructuring process and the feeling of suddenness were obtained on a trial-by-trial basis. A negative correlation was found between these two ratings, indicating that sudden insightful solutions, where restructuring is a key feature, involve automatic, subconscious recombination of information. Electroencephalogram signals were analyzed in the space×time×frequency domain with a nonparametric cluster randomization test. First, we found strong gamma band responses at parieto-occipital regions which we interpreted as (i) an adjustment of selective attention (leading to a mental impasse or to a correct solution depending on the gamma band power level) and (ii) encoding and retrieval processes for the emergence of spontaneous new solutions. Secondly, we observed an increased upper alpha band response in right temporal regions (suggesting active suppression of weakly activated solution-relevant information) for initially unsuccessful trials that after hint presentation led to a correct solution. Finally, for trials with high restructuring, decreased alpha power (suggesting greater cortical excitation) was observed in the right prefrontal area. Conclusions/Significance: Our results provide a first account of cognitive insight by dissociating its constituent components and potential neural correlates.

    Optogenetic Mimicry of the Transient Activation of Dopamine Neurons by Natural Reward Is Sufficient for Operant Reinforcement

    Activation of dopamine receptors in forebrain regions, for minutes or longer, is known to be sufficient for positive reinforcement of stimuli and actions. However, the firing rate of dopamine neurons is increased for only about 200 milliseconds following natural reward events that are better than expected, a response which has been described as a “reward prediction error” (RPE). Although RPE drives reinforcement learning (RL) in computational models, it has not been possible to directly test whether the transient dopamine signal actually drives RL. Here we have performed optical stimulation of genetically targeted ventral tegmental area (VTA) dopamine neurons expressing Channelrhodopsin-2 (ChR2) in mice. We mimicked the transient activation of dopamine neurons that occurs in response to natural reward by applying a light pulse of 200 ms in VTA. When a single light pulse followed each self-initiated nose poke, it was sufficient in itself to cause operant reinforcement. Furthermore, when optical stimulation was delivered in separate sessions according to a predetermined pattern, it increased locomotion and contralateral rotations, behaviors that are known to result from activation of dopamine neurons. All three of the optically induced operant and locomotor behaviors were tightly correlated with the number of VTA dopamine neurons that expressed ChR2, providing additional evidence that the behavioral responses were caused by activation of dopamine neurons. These results provide strong evidence that the transient activation of dopamine neurons provides a functional reward signal that drives learning, in support of RL theories of dopamine function.

    Coordinated Activity of Ventral Tegmental Neurons Adapts to Appetitive and Aversive Learning

    Our understanding of how value-related information is encoded in the ventral tegmental area (VTA) is based mainly on the responses of individual putative dopamine neurons. In contrast to cortical areas, the nature of coordinated interactions between groups of VTA neurons during motivated behavior is largely unknown. These interactions can strongly affect information processing, highlighting the importance of investigating network level activity. We recorded the activity of multiple single units and local field potentials (LFP) in the VTA during a task in which rats learned to associate novel stimuli with different outcomes. We found that coordinated activity of VTA units with either putative dopamine or GABA waveforms was influenced differently by rewarding versus aversive outcomes. Specifically, after learning, stimuli paired with a rewarding outcome increased the correlation in activity levels between unit pairs whereas stimuli paired with an aversive outcome decreased the correlation. Paired single unit responses also became more redundant after learning. These response patterns flexibly tracked the reversal of contingencies, suggesting that learning is associated with changing correlations and enhanced functional connectivity between VTA neurons. Analysis of LFP recorded simultaneously with unit activity showed an increase in the power of theta oscillations when stimuli predicted reward but not an aversive outcome. With learning, a higher proportion of putative GABA units were phase locked to the theta oscillations than putative dopamine units. These patterns also adapted when task contingencies were changed. Taken together, these data demonstrate that VTA neurons organize flexibly as functional networks to support appetitive and aversive learning.

    Advances in estimation by the item sum technique using auxiliary information in complex surveys

    To collect sensitive data, survey statisticians have designed many strategies to reduce nonresponse rates and social desirability response bias. In recent years, the item count technique (ICT) has gained considerable popularity and credibility as an alternative mode of indirect questioning survey, and several variants of this technique have been proposed as new needs and challenges arise. The item sum technique (IST), which was introduced by Chaudhuri and Christofides (2013) and Trappmann et al. (2014), is one such variant, used to estimate the mean of a sensitive quantitative variable. In this approach, sampled units are asked to respond to a two-list of items containing a sensitive question related to the study variable and various innocuous, nonsensitive, questions. To the best of our knowledge, very few theoretical and applied papers have addressed the IST. In this article, therefore, we present certain methodological advances as a contribution to appraising the use of the IST in real-world surveys. In particular, we employ a generic sampling design to examine the problem of how to improve the estimates of the sensitive mean when auxiliary information on the population under study is available and is used at the design and estimation stages. A Horvitz-Thompson type estimator and a calibration type estimator are proposed and their efficiency is evaluated by means of an extensive simulation study. Using simulation experiments, we show that estimates obtained by the IST are nearly equivalent to those obtained using “true data” and that in general they outperform the estimates provided by a competitive randomized response method. Moreover, the variance estimation may be considered satisfactory. These results open up new perspectives for academics, researchers and survey practitioners, and could justify the use of the IST as a valid alternative to traditional direct questioning survey modes.
    Ministerio de Economía y Competitividad of Spain; Ministerio de Educacion, Cultura y Deporte; project PRIN-SURWE
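
As a rough illustration of the estimation idea in this abstract, a Horvitz-Thompson type estimator weights each sampled answer by the inverse of its inclusion probability. The sketch below is a generic version under stated assumptions, not the estimator proposed in the paper: it assumes a known population size, known inclusion probabilities, and the usual item sum set-up in which one sample answers a long list (sensitive item plus innocuous items, reported as a sum) and another answers a short list (innocuous items only). The function names `ht_mean` and `ist_mean` are hypothetical, and no auxiliary information or calibration is used.

```python
def ht_mean(values, inclusion_probs, population_size):
    """Horvitz-Thompson estimate of a population mean.

    Each observed value is divided by its inclusion probability, so units
    that were unlikely to be sampled count for more.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0.0 or p > 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    return weighted_total / population_size


def ist_mean(long_answers, long_probs, short_answers, short_probs,
             population_size):
    """Illustrative item-sum estimate of a sensitive mean (assumed set-up).

    The long-list sample reports sensitive + innocuous totals, the short-list
    sample reports innocuous totals only; the difference of the two
    Horvitz-Thompson means estimates the mean of the sensitive variable.
    """
    long_mean = ht_mean(long_answers, long_probs, population_size)
    short_mean = ht_mean(short_answers, short_probs, population_size)
    return long_mean - short_mean


if __name__ == "__main__":
    # Toy numbers: population of 4, each unit sampled with probability 0.5.
    estimate = ist_mean([7.0, 9.0], [0.5, 0.5], [4.0, 6.0], [0.5, 0.5], 4)
    print(f"estimated sensitive mean: {estimate}")
```

Because respondents only ever report a sum, no individual answer to the sensitive question is revealed, which is the property that motivates indirect questioning in the first place.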

    Effort-related functions of nucleus accumbens dopamine and associated forebrain circuits

    Background: Over the last several years, it has become apparent that there are critical problems with the hypothesis that brain dopamine (DA) systems, particularly in the nucleus accumbens, directly mediate the rewarding or primary motivational characteristics of natural stimuli such as food. Hypotheses related to DA function are undergoing a substantial restructuring, such that the classic emphasis on hedonia and primary reward is giving way to diverse lines of research that focus on aspects of instrumental learning, reward prediction, incentive motivation, and behavioral activation. Objective: The present review discusses dopaminergic involvement in behavioral activation and, in particular, emphasizes the effort-related functions of nucleus accumbens DA and associated forebrain circuitry. Results: The effects of accumbens DA depletions on food-seeking behavior are critically dependent upon the work requirements of the task. Lever pressing schedules that have minimal work requirements are largely unaffected by accumbens DA depletions, whereas reinforcement schedules that have high work (e.g., ratio) requirements are substantially impaired by accumbens DA depletions. Moreover, interference with accumbens DA transmission exerts a powerful influence over effort-related decision making. Rats with accumbens DA depletions reallocate their instrumental behavior away from food-reinforced tasks that have high response requirements, and instead, these rats select a less-effortful type of food-seeking behavior. Conclusions: Along with prefrontal cortex and the amygdala, nucleus accumbens is a component of the brain circuitry regulating effort-related functions. Studies of the brain systems regulating effort-based processes may have implications for understanding drug abuse, as well as energy-related disorders such as psychomotor slowing, fatigue, or anergia in depression.