Neural Dynamics Underlying Impaired Autonomic and Conditioned Responses Following Amygdala and Orbitofrontal Lesions
A neural model is presented that explains how outcome-specific learning modulates affect, decision-making and Pavlovian conditioned approach responses. The model addresses how brain regions responsible for affective learning and habit learning interact, and answers a central question: What are the relative contributions of the amygdala and orbitofrontal cortex to emotion and behavior? In the model, the amygdala calculates outcome value while the orbitofrontal cortex influences attention and conditioned responding by assigning value information to stimuli. Model simulations replicate autonomic, electrophysiological, and behavioral data associated with three tasks commonly used to assay these phenomena: food consumption, Pavlovian conditioning, and visual discrimination. Interactions of the basal ganglia and amygdala with sensory and orbitofrontal cortices enable the model to replicate the complex pattern of spared and impaired behavioral and emotional capacities seen following lesions of the amygdala and orbitofrontal cortex. National Science Foundation (SBE-0354378; IIS-97-20333); Office of Naval Research (N00014-01-1-0624); Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Institutes of Health (R29-DC02952
Habits without values
Habits form a crucial component of behavior. In recent years, key computational models have conceptualized habits as arising from model-free reinforcement learning (RL) mechanisms, which typically select between available actions based on the future value expected to result from each. Traditionally, however, habits have been understood as behaviors that can be triggered directly by a stimulus, without requiring the animal to evaluate expected outcomes. Here, we develop a computational model instantiating this traditional view, in which habits develop through the direct strengthening of recently taken actions rather than through the encoding of outcomes. We demonstrate that this model accounts for key behavioral manifestations of habits, including insensitivity to outcome devaluation and contingency degradation, as well as the effects of reinforcement schedule on the rate of habit formation. The model also explains the prevalent observation of perseveration in repeated-choice tasks as an additional behavioral manifestation of the habit system. We suggest that mapping habitual behaviors onto value-free mechanisms provides a parsimonious account of existing behavioral and neural data. This mapping may provide a new foundation for building robust and comprehensive models of the interaction of habits with other, more goal-directed types of behaviors and help to better guide research into the neural mechanisms underlying control of instrumental behavior more generally
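The value-free idea in this abstract can be illustrated with a minimal sketch. It assumes a simple exponential update in which the habit strength of the action just taken moves toward 1 and all others move toward 0. The function name `update_habits`, the learning rate `alpha`, and the update rule itself are illustrative assumptions, not the paper's exact equations.

```python
def update_habits(habits, chosen, alpha=0.1):
    """Strengthen the action just taken and weaken the rest.

    No reward or outcome value enters the update (assumed rule).
    """
    return [h + alpha * ((1.0 if i == chosen else 0.0) - h)
            for i, h in enumerate(habits)]


# Repeating action 0 builds its habit strength regardless of outcomes.
habits = [0.0, 0.0]
for _ in range(50):
    habits = update_habits(habits, chosen=0)
```

Because the update never reads an outcome, devaluing the reward would leave these strengths unchanged, which corresponds to the insensitivity to outcome devaluation described in the abstract.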
Investigating Habits: Strategies, Technologies and Models
Understanding habits at a biological level requires a combination of behavioral observations and measures of ongoing neural activity. Theoretical frameworks as well as definitions of habitual behaviors emerging from classic behavioral research have been enriched by new approaches taking account of the identification of brain regions and circuits related to habitual behavior. Together, this combination of experimental and theoretical work has provided key insights into how brain circuits underlying action-learning and action-selection are organized, and how a balance between behavioral flexibility and fixity is achieved. New methods to monitor and manipulate neural activity in real time are allowing us to have a first look under the hood of a habit as it is formed and expressed. Here we discuss ideas emerging from such approaches. We pay special attention to the unexpected findings that have arisen from our own experiments suggesting that habitual behaviors likely require the simultaneous activity of multiple distinct components, or operators, seen as responsible for the contrasting dynamics of neural activity in both cortico-limbic and sensorimotor circuits recorded concurrently during different stages of habit learning. The neural dynamics identified thus far do not fully meet expectations derived from traditional models of the structure of habits, and the behavioral measures of habits that we have made also are not fully aligned with these models. We explore these new clues as opportunities to refine an understanding of habits
Theory-based Habit Modeling for Enhancing Behavior Prediction
Psychological theories of habit posit that when a strong habit is formed
through behavioral repetition, it can trigger behavior automatically in the
same environment. Given the reciprocal relationship between habit and behavior,
changing lifestyle behaviors (e.g., toothbrushing) is largely a task of
breaking old habits and creating new and healthy ones. Thus, representing
users' habit strengths can be very useful for behavior change support systems
(BCSS), for example, to predict behavior or to decide when an intervention
reaches its intended effect. However, habit strength is not directly observable
and existing self-report measures are taxing for users. In this paper, built on
recent computational models of habit formation, we propose a method to enable
intelligent systems to compute habit strength based on observable behavior. The
hypothesized advantage of using computed habit strength for behavior prediction
was tested using data from two intervention studies, where we trained
participants to brush their teeth twice a day for three weeks and monitored
their behaviors using accelerometers. Through hierarchical cross-validation, we
found that for the task of predicting future brushing behavior, computed habit
strength clearly outperformed self-reported habit strength (in both studies)
and was also superior to models based on past behavior frequency (in the larger
second study). Our findings provide initial support for our theory-based
approach of modeling user habits and encourage the use of habit computation to
deliver personalized and adaptive interventions
Habit formation limits growth in teacher effectiveness: A review of converging evidence from neuroscience and social science
Teachers become rapidly more effective during the early years of their career but tend to improve increasingly slowly thereafter. This article reviews and synthesises converging evidence from neuroscience, psychology, economics and education suggesting that teachers' rate of growth slows because their practice becomes habitual. First, we review evidence suggesting that teaching is highly conducive to habit formation and that teachers display characteristic features of habitual behaviour. Next, we review empirical findings that performance asymptotes, as seen in teachers' learning curves, coincide with the reallocation of behaviour regulation to neural circuits governing habitual behaviour. Finally, original data is presented showing that teachers' behaviour becomes automatic around the time that teacher effectiveness begins to level off. Collectively, this evidence implies that professional development should involve repeated practice in realistic settings in order to overwrite and upgrade existing habits
Differential Dynamics of Activity Changes in Dorsolateral and Dorsomedial Striatal Loops during Learning
The basal ganglia are implicated in a remarkable range of functions influencing emotion and cognition as well as motor behavior. Current models of basal ganglia function hypothesize that parallel limbic, associative, and motor cortico-basal ganglia loops contribute to this diverse set of functions, but little is yet known about how these loops operate and how their activities evolve during learning. To address these issues, we recorded simultaneously in sensorimotor and associative regions of the striatum as rats learned different versions of a conditional T-maze task. We found highly contrasting patterns of activity in these regions during task performance and found that these different patterns of structured activity developed concurrently, but with sharply different dynamics. Based on the region-specific dynamics of these patterns across learning, we suggest a working model whereby dorsomedial associative loops can modulate the access of dorsolateral sensorimotor loops to the control of action. National Institutes of Health (U.S.) (MH60379); United States. Office of Naval Research (N000140410208); Stanley H. and Sheila G. Sydney Fund; European Union (Grant 201716); McGovern Institute for Brain Research at MIT (Fellowship
Exploring model-based and model-free reinforcement learning in obsessive-compulsive disorder
SUMMARY: Obsessive-compulsive disorder (OCD) is a common, severe and disabling
neuropsychiatric illness for which current treatments are ineffective in a large
number of cases. The most widely used instrument for assessing the severity of
obsessive-compulsive symptoms is the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS),
which was recently revised (Y-BOCS-II). However, its construct validity (both
divergent and convergent) has been reported as moderate, and its criterion validity
for the diagnosis of OCD has never been tested. In the first chapter of this thesis
I tested, for the first time, the criterion validity of the Y-BOCS-II and showed
that a cut-off of 13 (total score) achieves the best balance between sensitivity
and specificity for the diagnosis of OCD. However, I confirmed that its divergent
validity is far from excellent. This last finding led me to look for other
potential markers of OCD.
Several abnormalities have been demonstrated in patients with OCD using
neuropsychological tasks or neuroimaging techniques. However, there is still no
consistent marker for this disorder that can effectively discriminate patients who
suffer from OCD, that is sensitive to change after therapeutic interventions, and
that can be mapped to brain circuits or function. One approach followed in recent
years considers OCD to be characterized by dysfunction in the brain systems
responsible for action learning. Sequential decision tasks have recently emerged as
an important and sophisticated instrument for studying action learning in humans
through the reinforcement learning (RL) approach. According to the theory underlying
RL, actions can be learned in two distinct ways: a model-based system works by
building an internal model of the dynamics of the environment and uses that model to
plan future behavioral trajectories, as opposed to a model-free system, which works
by storing the estimated value of recently taken actions and updating those
estimates by trial and error. Sequential decision tasks have been used to establish
associations between dysfunction of the brain's RL systems and some neuropsychiatric
disorders, such as OCD, for which an imbalance between the model-based and
model-free systems has been described. Evidence obtained with one of these
sequential decision tasks, the two-step task, suggests that patients with OCD have a
deficit in the model-based system. However, in this particular paradigm, individuals
receive detailed information about the structure of the task before performing it.
It is therefore unclear how the two main RL systems interact when individuals learn
exclusively through interaction with the environment, and how explicit information
affects RL strategies. In the second chapter of this thesis, I developed a new
sequential decision task that makes it possible not only to quantify the use of
model-based and model-free RL strategies, but also to distinguish the impact of
explicit knowledge of the task structure from the impact of experience with the
task. The results of applying the task in healthy individuals show that action
choice is initially controlled by model-free learning, with model-based learning
emerging only in a minority of individuals after significant experience with the
task, and not emerging at all in individuals with OCD, who instead showed a tendency
to increase their use of model-free RL with experience. When explicit information
about the task structure was given, a dramatic increase in the use of model-based
learning was observed, both in healthy volunteers and in both clinical groups.
Explicit information decreased the use of the model-free learning system in healthy
volunteers and in patients with mood and anxiety disorders, but this decrease was
not statistically significant in the group of patients with OCD. In addition, after
the instructions, in all groups the updating of the value of actions learned through
the model-free system became more influenced by the value of the states reached and
less influenced by trial outcomes. Another effect of explicit information about task
structure in healthy individuals was to make choices more perseverative, which is
consistent with a change in exploration strategy. These results help to clarify the
profile of RL strategy use in patients with OCD, who show nonspecific deficits in
model-based learning and more specific findings of greater use of model-free
learning, in both cases before obtaining information about the structure of the task.
Finally, as the literature has not yet reached consensus on the interaction between
a putative model-based RL system and a model-free RL system in human brain circuits,
I developed a functional magnetic resonance imaging protocol to assess sequential
action choice with and without instructions. Preliminary results in healthy
individuals suggest that the reduced two-step task makes it possible to separate
behavior that uses predominantly model-free learning (before instructions) from
behavior that uses predominantly model-based learning (after instructions), in the
same individual, task structure and environment. Analysis of the functional imaging
data suggests that explicit knowledge about the task structure modifies neuronal
activity in the paracingulate cortex (medial prefrontal cortex) during the
transition from the first to the second step of the task. Future aims include the
use of multivariate analysis techniques to explore the brain's representation of
task states and the application of this functional magnetic resonance imaging
protocol in clinical populations.
ABSTRACT: Obsessive-compulsive disorder (OCD) is a common, chronic and disabling
neuropsychiatric condition for which current treatments are ineffective in a large
proportion of cases. The gold-standard instrument to assess the severity of OCD
symptoms is the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS), which was
recently revised (Y-BOCS-II). However, its construct validity has been reported as
moderate and its criterion-related validity for the diagnosis of OCD has never been
tested. In the first chapter of this dissertation, I tested, for the first time, criterion-related
validity of the Y-BOCS-II and demonstrated that a cut-off of 13 (total score) attains the
best balance between sensitivity and specificity for the diagnosis of OCD. However, I
confirmed that its divergent validity is far from excellent. This last finding led me to
search for other potential markers of OCD.
Several abnormalities have been demonstrated in OCD patients in studies
using neuropsychological and neuroimaging approaches, but we still lack a consistent
marker for the disorder which is able to discriminate patients with OCD from healthy
subjects or from patients with other mental disorders, which is sensitive to treatment-induced changes, and which can be mapped to brain circuits or function. An approach
which has been followed over the last decade is considering OCD as a disorder of
action learning systems of the brain. Sequential decision tasks have recently emerged
as an influential and sophisticated tool to investigate action learning in humans through
the reinforcement learning (RL) framework. According to the RL framework, actions
can be learned in two different ways: model-based control works by learning a model
of the dynamics of the environment and later using that model to plan future behavioral
trajectories, while model-free control works by storing the estimated value of recently
taken actions and updating these estimates by trial-and-error. Sequential decision
tasks have been used to assess associations between dysfunction in RL control
systems and certain behavioral disorders, such as OCD, where an imbalance between
model-based and model-free RL has been hypothesized. In fact, using the most
commonly applied sequential decision task, the two-step task, evidence has been
produced suggesting that OCD patients have a deficit in model-based learning.
However, in this specific paradigm, subjects typically receive detailed information
about task structure prior to performing the task. Thus, it remains unclear how different
RL systems contribute when subjects learn exclusively from experience, and how
explicit information about task structure modifies RL strategy. To address these
questions, I created a sequential decision task requiring minimal prior instruction, the
reduced two-step task. I assessed performance both prior to and after delivering
explicit information on task structure, in healthy volunteers, patients with OCD and
patients with other mood and anxiety disorders. Initially model-free control dominated,
with model-based control emerging only in a minority of subjects after significant task
experience, and not at all in patients with OCD, who had instead a tendency to
increase their use of model-free control. Once explicit information about task structure
was provided, a dramatic increase in the use of model-based RL was observed, similarly across healthy volunteers and both patient groups, including OCD. The
debriefing also significantly decreased the use of model-free RL in healthy volunteers
and in patients with mood and anxiety disorders, but not in OCD patients. Additionally,
after instructions, model-free action value updates were influenced more by state
values and less by trial outcomes, in all groups, and subject choices became more
perseverative in healthy subjects, consistent with changes in exploration strategy.
These results help in clarifying the RL profile for patients with OCD, with unspecific
findings of deficient model-based control, and more specific findings of enhanced
model-free control, in both cases prior to information about task structure.
Finally, as the literature has not yet reached consensus on how model-free and model-based RL systems interact in human brain circuits, I developed a functional magnetic
resonance imaging (fMRI) protocol to assess uninstructed and instructed sequential
action choice. Preliminary results in healthy subjects suggest that the fMRI version of
the reduced two-step task makes it possible to separate predominantly model-free control (before
instructions) from predominantly model-based control (after instructions), in the same
subject, task structure and environment. Across all sessions, choice events were
associated with increased blood-oxygen-level-dependent (BOLD) activity in the left
precentral gyrus and reward events were associated with increased BOLD activity in
the ventral striatum. I found that explicit knowledge about task structure modifies
BOLD activity in the paracingulate cortex (medial
prefrontal cortex) during the transition from the first to the second step of the task.
Future directions include using multivariate pattern analysis techniques to explore how
the brain represents state space in sequential decision tasks and applying the current
fMRI protocol in clinical populations
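The distinction between model-free and model-based control drawn in this abstract can be sketched in a few lines. This is a toy illustration, not the thesis's fitted model: the function names, the learning rate `lr`, and the 0.7/0.3 transition probabilities are all assumptions chosen for the example.

```python
def model_free_update(q, action, reward, lr=0.5):
    """Trial-and-error learning: move the cached value of the action
    just taken toward the observed reward (assumed delta rule)."""
    q = dict(q)
    q[action] += lr * (reward - q[action])
    return q


def model_based_values(transitions, state_values):
    """Planning: expected value of each first-step action, computed
    from a model of which second-step state each action leads to."""
    return {action: sum(p * state_values[s] for s, p in probs.items())
            for action, probs in transitions.items()}


# Hypothetical two-step structure: each action leads mostly to one state.
transitions = {"left": {"A": 0.7, "B": 0.3}, "right": {"A": 0.3, "B": 0.7}}
state_values = {"A": 1.0, "B": 0.0}

mb = model_based_values(transitions, state_values)
mf = model_free_update({"left": 0.0, "right": 0.0}, "left", reward=1.0)
```

In this sketch the model-based values change as soon as `transitions` or `state_values` change, whereas the model-free values change only for actions that were actually taken and rewarded.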
Hierarchical models of goal-directed and automatic actions
Decision-making processes behind instrumental actions can be divided into two categories: goal-directed actions, and automatic actions. The structure of automatic actions, their interaction with goal-directed actions, and their behavioral and computational properties are the topics of the current thesis. We conceptualize the structure of automatic actions as sequences of actions that form a single response unit and are integrated within goal-directed processes in a hierarchical manner. We represent this hypothesis using the computational framework of reinforcement learning and develop a new normative computational model for the acquisition of action sequences, and their hierarchical interaction with goal-directed processes. We develop a neurally plausible hypothesis for the role of neuromodulator dopamine as a teaching signal for the acquisition of action sequences. We further explore the predictions of the proposed model in a two-stage decision-making task in humans and we show that the proposed model has higher explanatory power than its alternatives. Finally, we translate the two-stage decision-making task to an experimental protocol in rats and show that, similar to humans, rats also use action sequences and engage in hierarchical decision-making. The results provide a new theoretical and experimental paradigm for conceptualizing and measuring the operation and interaction of goal-directed and automatic actions