
    Rapid reaching movements in tasks with competing targets.

    Top row illustrates experimental results in rapid reaching tasks with multiple potential targets [12, 22, 23] (images are reproduced with permission of the authors). When the target position is known prior to movement onset, reaches are made directly to that target (black and green traces in A); otherwise, reaches aim at an intermediate location before correcting in flight to the cued target (red and blue traces in A). The competition between the two reaching policies, which results in spatially averaged movements, is biased by the spatial distribution of the targets (B), by recent trial history (C), and by the number of targets presented in each visual field (D). The bottom row (E-H) illustrates the simulated reaching movements generated in tasks with multiple potential targets; each bottom panel corresponds to the reaching condition described in the panel above it.
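
    A minimal sketch of the mechanism this figure illustrates: when each potential target has its own policy and the executed command is a desirability-weighted mixture of those policies, the initial reach aims at an intermediate location and commits to the cued target once the competition is resolved. The dynamics, gains, and cue timing below are assumptions, not the paper's parameters.

```python
import numpy as np

def policy(x, target, gain=4.0):
    """Proportional controller steering the effector toward a single target."""
    return gain * (target - x)

def simulate_reach(targets, cue_index, cue_time=0.25, dt=0.01, T=1.0):
    x = np.zeros(2)                                 # hand starts at the origin
    weights = np.ones(len(targets)) / len(targets)  # equal desirability before the cue
    trajectory = [x.copy()]
    for step in range(int(T / dt)):
        if step * dt >= cue_time:                   # cue resolves the competition
            weights = np.eye(len(targets))[cue_index]
        # executed command = desirability-weighted mixture of the individual policies
        u = sum(w * policy(x, np.asarray(g, dtype=float)) for w, g in zip(weights, targets))
        x = x + dt * u
        trajectory.append(x.copy())
    return np.array(trajectory)

traj = simulate_reach(targets=[(-0.1, 0.3), (0.1, 0.3)], cue_index=1)
print(traj[10], traj[-1])   # early samples aim between the targets; the endpoint is the cued target
```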

    Sequential movements.

    A: Examples of simulated trajectories for continuously copying a pentagon. B: Time course of the relative desirability values of the 5 individual policies (i.e., 5 segments) in a successful trial of copying a pentagon. The line colors correspond to the segments of the pentagon as shown in the top panel. The shape was copied counterclockwise (as indicated by the arrow) starting from the gray vertex. Each horizontal discontinuous line indicates the completion time of copying the corresponding segment. Notice that the desirability of the current segment peaks immediately after the start of drawing that segment and falls gradually, whereas the desirability of the following segment starts rising while the current segment is still being copied. Because of this, consecutive segments compete for action selection, frequently producing error trials, as illustrated in panel C. Finally, panels D and E depict examples of simulated trajectories for continuously copying an equilateral triangle and a square, respectively, counterclockwise starting from the bottom right vertex.
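
    A toy sketch of why these errors arise: if the desirability of the following segment rises before the current segment is complete, the mixed command starts pulling toward the next vertex early and can cut the corner. The time courses and gains below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def segment_weights(t, handoff=0.4, sharpness=10.0):
    """Desirability of the current vs. the following segment policy over time."""
    w_next = 1.0 / (1.0 + np.exp(-sharpness * (t - handoff)))
    return 1.0 - w_next, w_next

current_vertex = np.array([1.0, 0.0])   # end of the segment being copied
next_vertex = np.array([1.0, 1.0])      # end of the following segment
x = np.zeros(2)
closest = np.inf
for step in range(100):
    w_cur, w_nxt = segment_weights(step / 100.0)
    # mixture of the two segment policies: the early rise of w_nxt pulls the pen
    # toward the next vertex before the current one is reached (corner cutting)
    u = w_cur * (current_vertex - x) + w_nxt * (next_vertex - x)
    x = x + 0.05 * u
    closest = min(closest, float(np.linalg.norm(current_vertex - x)))
print("closest approach to the current vertex:", round(closest, 3))
```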

    Saccadic movements in tasks with competing targets.

    A: Simulated saccadic movements for pairs of targets with 30° (gray traces) and 90° (black traces) target separation. B: The method used to visualize the relative desirability function of two competing saccadic policies (see the results section for more details). C: Heat map of the relative desirability, at different states, of the policy that saccades to the left target, for 30° target separation. Red and blue regions correspond to high and low desirability states, respectively. Black traces correspond to averaged trajectories in single-target trials. Notice the strong competition between the two saccadic policies (greenish areas). D: Similar to panel C, but for 90° target separation. In this case, the targets are located in areas with no competition between the two policies (red and blue regions). E: Examples of saccadic movements (left column) with the corresponding time course of the relative desirability of the two policies (right column). The first two rows illustrate characteristic examples at 30° target separation, in which the competition results primarily in saccade averaging (top panel) and less frequently in correct movements (middle panel). The bottom row shows a characteristic example at 90° target separation, in which the competition is resolved almost immediately after saccade onset, producing almost no errors. F: Percentage of simulated averaging saccades for different degrees of target separation (red line); the green, blue, and cyan lines give the percentage of averaging saccades performed by 3 monkeys [24].
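
    One plausible way to reproduce such a map, as a sketch under assumptions rather than the paper's exact construction: evaluate, over a grid of states, the probability that each saccadic policy has the lowest cost-to-go, with the cost-to-go approximated by squared distance to the target and the probability by a softmax.

```python
import numpy as np

def relative_desirability_map(targets, beta=8.0, n=101):
    """Softmax over negative costs-to-go (squared distance) on a grid of states."""
    xs = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(xs, xs)
    costs = np.stack([(X - tx) ** 2 + (Y - ty) ** 2 for tx, ty in targets])
    e = np.exp(-beta * costs)
    return e / e.sum(axis=0)        # shape (num_targets, n, n): P(policy i is cheapest)

sep = np.deg2rad(30.0)              # 30 degree target separation around straight ahead
targets = [(np.cos(np.pi / 2 - sep / 2), np.sin(np.pi / 2 - sep / 2)),
           (np.cos(np.pi / 2 + sep / 2), np.sin(np.pi / 2 + sep / 2))]
D = relative_desirability_map(targets)
# at the starting state (grid centre) the two policies are in strong competition
print(round(float(D[1, 50, 50]), 2))    # ~0.5 for the left-target policy
```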

    Encoding the order of policies in sequential movements.

    A: Probability distribution of the time to arrive at vertex j, starting from the original state at time t = 0 and visiting all the preceding vertices. Each color codes the segments and the vertices of the pentagon as shown in the right inset. The pentagon is copied counterclockwise (as indicated by the arrow) starting from the purple vertex at t = 0. The gray trajectories illustrate examples from the 100 reaches generated to estimate the probability distribution of the time to arrive at vertex k given that we started from vertex k - 1, P(τ_arrive^k | τ_arrive^(k-1)). B: Probability distribution P(vertex = j | x_t), which describes the probability of copying the segment defined by the two successive vertices j - 1 and j at state x_t. This distribution is estimated at time t = 0; upon arrival at the next vertex, we condition on completion of that segment and re-evaluate P(vertex = j | x_t) for the remaining vertices.
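
    A hedged sketch of how such distributions could be combined: if per-segment travel times are estimated independently (e.g., from the 100 simulated reaches), the arrival-time distribution at a vertex is the convolution of the preceding segment-time distributions, and the probability of currently copying a given segment follows from the chance that its start vertex has been reached but its end vertex has not. All numbers below are made up for illustration.

```python
import numpy as np

def gaussian_pmf(mean, sd, n_bins=200, dt=0.01):
    """Discretized travel-time distribution for one segment (illustrative)."""
    t = np.arange(n_bins) * dt
    p = np.exp(-0.5 * ((t - mean) / sd) ** 2)
    return p / p.sum()

# per-segment travel-time distributions, e.g. estimated from simulated reaches
segment_times = [gaussian_pmf(0.30, 0.05) for _ in range(5)]

# arrival-time distribution at each vertex = convolution of preceding segment times
arrival = []
acc = np.zeros(len(segment_times[0]))
acc[0] = 1.0                                    # the movement starts at t = 0
for seg in segment_times:
    acc = np.convolve(acc, seg)[: len(seg)]
    acc = acc / acc.sum()
    arrival.append(acc)                         # arrival[j] = arrival time at vertex j + 1

def p_current_segment(j, t_idx):
    """P(currently copying segment j, which starts at vertex j and ends at vertex j + 1)."""
    reached_start = 1.0 if j == 0 else float(np.cumsum(arrival[j - 1])[t_idx])
    reached_end = float(np.cumsum(arrival[j])[t_idx])
    return reached_start - reached_end

print([round(p_current_segment(j, 50), 3) for j in range(5)])   # at t = 0.5 s
```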

    The architectural organization of the theory.

    The theory consists of multiple stochastic optimal control schemes, each attached to a particular goal currently present in the field. We illustrate the architecture using the hypothetical scenario of a soccer game, in which the player possessing the ball is presented with 3 alternative options (i.e., 3 teammates) located at different distances from the current state x_t. In this situation, the control schemes related to these options are triggered and generate 3 action plans (u_1 = π_1(x_t), u_2 = π_2(x_t) and u_3 = π_3(x_t)) to pursue each of the individual options. At each time t, the desirability of each policy in terms of action cost and goods value is computed separately, and the two are then combined into an overall desirability. The action cost of each policy is the cost-to-go of the remaining actions that would occur if the policy were followed from the current state x_t to the target. These action costs are converted into a relative desirability that characterizes the probability that implementing this policy will have the lowest cost relative to the alternative policies. Similarly, the goods value attached to each policy is evaluated in goods-space and is converted into a relative desirability that characterizes the probability that implementing that policy (i.e., selecting goal i) will result in the highest reward compared to the alternative options, from the current state x_t. These two desirabilities are combined into what we call the "relative desirability" value, which reflects the degree to which the individual policy π_i is desirable to follow, at the given time and state, with respect to the other available policies. The overall policy that the player follows is a time-varying weighted mixture of the individual policies, using the relative desirability values as the weighting factors. Because relative desirability is time- and state-dependent, the weighted mixture of policies produces a range of behavior from "winner-take-all" (i.e., pass the ball) to "spatial averaging" (i.e., keep the ball and delay your decision).
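
    A minimal sketch of this combination, assuming softmax forms for both terms (illustrative, not the paper's exact equations): each option's cost-to-go yields the probability of being cheapest, its goods value yields the probability of being most rewarding, and the executed command is the desirability-weighted mixture of the individual policies.

```python
import numpy as np

def softmin(costs, beta=1.0):
    """P(option i has the lowest cost) under a softmax approximation."""
    e = np.exp(-beta * np.asarray(costs, dtype=float))
    return e / e.sum()

def softmax(values, beta=1.0):
    """P(option i has the highest value) under a softmax approximation."""
    e = np.exp(beta * np.asarray(values, dtype=float))
    return e / e.sum()

def relative_desirability(costs_to_go, goods_values):
    """Combine action-cost and goods-value desirabilities into one weight per option."""
    d = softmin(costs_to_go) * softmax(goods_values)
    return d / d.sum()

def mixed_command(x, targets, costs_to_go, goods_values, gain=3.0):
    """Desirability-weighted mixture of the individual policies u_i = pi_i(x)."""
    w = relative_desirability(costs_to_go, goods_values)
    policies = [gain * (np.asarray(g, dtype=float) - x) for g in targets]
    return sum(wi * ui for wi, ui in zip(w, policies)), w

# Soccer example: 3 teammates at different distances and with different payoffs
x = np.zeros(2)
teammates = [(5.0, 2.0), (10.0, -3.0), (2.0, 8.0)]
costs = [np.linalg.norm(np.asarray(g) - x) for g in teammates]   # effort proxy
values = [1.0, 2.0, 0.5]                                         # reward proxy
u, w = mixed_command(x, teammates, costs, values)
print("desirabilities:", np.round(w, 2), "command:", np.round(u, 2))
```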

    Characteristic example of the simulated model activity during an effector choice task with three targets.

    Neuronal activity of the DNFs that plan saccade (upper row) and reaching (bottom row) movements during a "cued-saccade" trial (note the red cue), in which the context cue is presented prior to target onset. The competition between the effectors is resolved shortly after the context cue is presented. By the time the locations of the targets are shown, the framework has already selected the effector (i.e., the eye in this trial), and the competition between the targets is resolved quickly, resulting in a direct saccadic movement to the selected target (right panel).

    Simulated neural activity, movement time and approach direction of reaching trajectories, in free-choice and cued-reaching trials in an effector choice task.

    A: Time course of the average activity (20 trials) of the two populations of neurons tuned to the selected (solid black line) and non-selected (discontinuous black line) targets prior to movement onset, from the DNFs that plan reaching (green) and saccade (red) movements in the "free-choice" sessions. Data are shown only for trials in which reaches were selected. Notice that the framework first selects which effector to use to perform the task and then chooses the target. The average activity of the saccade DNF for the selected and non-selected targets overlaps. B: Similar to panel A, but for the "cued-reaching" sessions. The competition between the effectors is resolved almost immediately after cue onset. C: Mean movement (i.e., response) time from 20 reaching trajectories in a free-choice task (i.e., the model is free to choose whether to perform a hand or an eye movement to acquire each target) and in a cued task, in which the model was instructed to perform reaches. The error bars are ± standard error. The movement time in the free-choice trials was significantly lower than in the cued-reaching trials (two-sample t-test, p < 10^-7). D: Mean approach direction of 20 reaching movements over the first 50 time-steps in the free-choice task and the cued-reaching task. The error bars are ± standard error. An approach direction of 0 deg. indicates that the initial reaching movement was made towards the intermediate location between the two targets. Notice that the free-choice trials are characterized by straight reaching movements to the selected target, whereas the cued-reaching trials are dominated by curved reaching movements to the selected target (two-sample t-test, p < 10^-4).

    Model architecture.

    The core component of the model is the motor plan formation field, which dynamically integrates information from disparate sources. It receives excitatory inputs (green lines) from: i) the spatial sensory input field, which encodes the angular representation of the alternative goals; ii) the goods-value field, which encodes the expected benefits of moving in a particular direction; and iii) the context cue field, which represents information related to the contextual requirements of the task. The motor plan formation field also receives inhibitory input (red line) from the action cost field, which encodes the action cost (e.g., effort) required to move in a particular direction. All this information is integrated by the motor plan formation field into an evolving assessment of the "desirability" of the alternative options. Each neuron in the motor plan formation field is linked with a motor control schema that generates a direction-specific policy π_j to move in the preferred direction of that neuron. The output activity of the motor plan formation field weights the influence of each individual policy on the final action plan (see "Model architecture" in the results section for more details).
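
    A minimal sketch of such a field, assuming a standard Amari-style dynamic neural field (the kernel, gains, and inputs below are illustrative, not the paper's parameters): the field sums the excitatory inputs (spatial targets, goods value, context cue), subtracts the action-cost input, and its sigmoided output provides the weights on the direction-specific policies.

```python
import numpy as np

N = 181                                  # one neuron per movement direction (0..180 deg)
theta = np.linspace(0, np.pi, N)

def bump(center_deg, width_deg=10.0, amp=1.0):
    """Gaussian-tuned input centred on a movement direction."""
    d = np.degrees(theta) - center_deg
    return amp * np.exp(-0.5 * (d / width_deg) ** 2)

def lateral_kernel(exc_w=1.5, inh=0.4, sigma_deg=8.0):
    """Local excitation plus global inhibition between field neurons."""
    d = np.subtract.outer(np.degrees(theta), np.degrees(theta))
    return exc_w * np.exp(-0.5 * (d / sigma_deg) ** 2) - inh

def simulate_field(spatial, goods, context, action_cost, tau=10.0, steps=300, h=-2.0):
    u = np.full(N, h)                    # field activation at the resting level h
    W = lateral_kernel()
    for _ in range(steps):
        f = 1.0 / (1.0 + np.exp(-u))     # sigmoidal output of the field
        inp = spatial + goods + context - action_cost
        u += (-u + h + W @ f / N + inp) / tau
    return 1.0 / (1.0 + np.exp(-u))      # output used to weight the policies

out = simulate_field(spatial=bump(60) + bump(120),
                     goods=bump(60, amp=0.5),
                     context=np.zeros(N),
                     action_cost=bump(120, amp=0.8))
print("preferred direction (deg):", round(float(np.degrees(theta[np.argmax(out)])), 1))
```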

    History of training on effector cues.

    Plots A-D show the connection weights from neurons representing each cue (i.e., red and green) to the saccade (A-B) and reach (C-D) motor plan formation DNFs. There are 50 neurons selective for each cue, and each motor plan formation field has 181 neurons, yielding four 50×181 connection weight matrices. Each matrix has been averaged over the cue-selective neurons on each trial to show the mean connection weight to each motor plan formation field as training progresses. A: Mean connection weights from neurons representing the red cue (cue 1) to neurons in the saccade motor plan formation DNF from trials 1 to 500. B: Mean connection weights from green cue (cue 2) neurons to the saccade DNF. C: Mean connection weights from red cue neurons to the reach motor plan formation DNF. D: Mean connection weights from green cue neurons to the reach motor plan formation DNF. E: Success of each trial during training (0 = unsuccessful, 1 = successful).
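
    The caption does not spell out the update rule; purely as a loose illustration, a reward-modulated Hebbian step of the following form would produce weights that strengthen with trial success, which is the kind of history these plots track. The rule, rates, and activities below are assumptions, not the paper's learning scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cue, n_field = 50, 181                         # cue-selective neurons, field neurons
W_red_to_saccade = np.zeros((n_cue, n_field))    # one of the four 50x181 matrices

def update(W, cue_activity, field_activity, reward, lr=0.05, decay=0.01):
    """Reward-gated Hebbian step: dW = lr * r * pre * post - decay * W."""
    dW = lr * reward * np.outer(cue_activity, field_activity) - decay * W
    return W + dW

for trial in range(500):
    cue = rng.random(n_cue)                      # red-cue population activity (placeholder)
    field = rng.random(n_field)                  # saccade-field activity (placeholder)
    reward = 1.0 if rng.random() < 0.7 else 0.0  # success of the trial
    W_red_to_saccade = update(W_red_to_saccade, cue, field, reward)

print("mean connection weight after training:", round(float(W_red_to_saccade.mean()), 3))
```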

    History of training on reward contingency.

    A: Expected reward for target directions in an egocentric reference frame from trials 1 to 500. The model was presented with two targets on each trial, initialized with equal expected reward. Reward was received for reaching or making a saccade to the left target. B: Success of each trial during training (0 = unsuccessful, 1 = successful).
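
    As an illustration of how such an expected-reward estimate could evolve over trials, a simple delta-rule update (the learning rate and setup are assumptions, not the paper's values) drives the rewarded direction toward 1 and the unrewarded one toward 0:

```python
# Equal initialization of the two targets, then a delta-rule update per trial.
expected_reward = {"left": 0.5, "right": 0.5}
alpha = 0.1                                      # learning rate (assumed)

for trial in range(500):
    for target in expected_reward:
        r = 1.0 if target == "left" else 0.0     # only the left target is rewarded
        expected_reward[target] += alpha * (r - expected_reward[target])

print(expected_reward)   # converges toward 1.0 (left) and 0.0 (right)
```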