Disentangling the involvement of primary motor cortex in value-based reinforcement learning and value-based decision making.

Abstract

When one makes the decision to act in the physical world, the neural activity in primary motor cortex (M1) encodes the competition between potential action choices. Traditional approaches have viewed this activity as reflecting the unfolding of the outcome of a decision process taking place upstream. However, a recently emerging theoretical framework posits that motor neural structures directly contribute to the decision process. We recently tested this hypothesis (Zenon et al., 2015, Brain Stimulation) by using continuous theta burst stimulation (cTBS) to alter activity in M1 while participants performed a task that required them to select between two fingers of the right hand based on the color of a stimulus (green or red, explicit instruction). Importantly, this finger choice was biased such that, to earn more money, the subjects also had to take into account the shape of the stimulus (circle or square, undisclosed manipulation). The motor response thus depended, on the one hand, on a perceptual decision process, interpreting the color of the stimulus according to instructed rules and, on the other hand, on a value-based decision process relying on reinforcement learning. Interestingly, cTBS over M1 modified the extent to which the value-based process influenced the subjects' decisions, whereas it had no effect on their ability to make a choice based on perceptual evidence. However, in that study, cTBS was applied at the very beginning of the experiment, before the subjects had learned the task. Hence, we cannot tell from that work whether the effect of M1 cTBS was due to an alteration of value-based reinforcement learning or of value-based decision making, which takes place once learning is complete. Here, we present a study in which we intend to use the same task but with cTBS applied at different times in order to assess the contribution of M1 to the two value-based processes (learning and decision making).
More precisely, the experiment will extend over three sessions held at 24-hour intervals. Each session will consist of six blocks, each lasting about 4 minutes. Pilot data suggest that the value-based process begins to effectively shape the subjects' decisions in the middle of the second session. Given this, cTBS over M1 will be applied either at the beginning of the first session (before learning) or at the beginning of the third session (after learning). This procedure will allow us to disentangle the involvement of M1 in value-based reinforcement learning and value-based decision making.
