175 research outputs found

    Learning to reach by reinforcement learning using a receptive field based function approximation approach with continuous actions

    Reinforcement learning methods can be used in robotics applications, especially for specific target-oriented problems such as the reward-based recalibration of goal-directed actions. To this end, relatively large and continuous state-action spaces still need to be handled efficiently. The goal of this paper is thus to develop a novel, rather simple method that uses reinforcement learning with function approximation in conjunction with different reward strategies for solving such problems. To test our method, we use a four-degree-of-freedom reaching problem in 3D space, simulated by a two-joint robot arm system with two DOF each. Function approximation is based on 4D, overlapping kernels (receptive fields), and the state-action space contains about 10,000 of these. Different types of reward structures are compared, for example reward-on-touching-only against reward-on-approach. Furthermore, forbidden joint configurations are punished. A continuous action space is used. In spite of a rather large number of states and the continuous action space, these reward/punishment strategies allow the system to find a good solution, usually within about 20 trials. The efficiency of our method demonstrated in this test scenario suggests that it might be possible to use it on a real robot for problems where mixed rewards can be defined in situations where other types of learning might be difficult.
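The receptive-field idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the Gaussian kernel shape, the regular 10 x 10 x 10 x 10 grid (chosen only because it yields 10,000 kernels), the kernel width, the learning rate, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

def make_centers(n_per_dim=10, dims=4):
    """Regular grid of kernel centers in the unit hypercube (assumed layout)."""
    axis = np.linspace(0.0, 1.0, n_per_dim)
    return np.array(list(itertools.product(axis, repeat=dims)))

def activations(x, centers, width=0.1):
    """Normalized Gaussian activation of every kernel for one 4D input x."""
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    act = np.exp(-sq_dist / (2.0 * width ** 2))
    return act / act.sum()

def predict(weights, x, centers, width=0.1):
    """Approximated value: activation-weighted sum of the kernel weights."""
    return float(weights @ activations(x, centers, width))

def update(weights, x, target, centers, lr=0.5, width=0.1):
    """Move the weights of the active kernels toward the target value."""
    phi = activations(x, centers, width)
    error = target - weights @ phi
    return weights + lr * error * phi
```

Because the kernels overlap, a single update also shifts the estimate for neighboring inputs, which is one plausible reason such a scheme could generalize from few trials.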

    Eye-Hand Coordination during Dynamic Visuomotor Rotations

    Background: For many technology-driven visuomotor tasks such as tele-surgery, human operators face situations in which the frames of reference for vision and action are misaligned and need to be compensated in order to perform the tasks with the necessary precision. The cognitive mechanisms for the selection of appropriate frames of reference are still not fully understood. This study investigated the effect of changing visual and kinesthetic frames of reference during wrist pointing, simulating activities typical for tele-operations. Methods: Using a robotic manipulandum, subjects had to perform center-out pointing movements to visual targets presented on a computer screen, by coordinating wrist flexion/extension with abduction/adduction. We compared movements in which the frames of reference were aligned (unperturbed condition) with movements performed under different combinations of visual/kinesthetic dynamic perturbations. The visual frame of reference was centered to the computer screen, while the kinesthetic frame was centered around the wrist joint. Both frames changed their orientation dynamically (angular velocity = 36°/s) with respect to the head-centered frame of reference (the eyes). Perturbations were either unimodal (visual or kinesthetic) or bimodal (visual+kinesthetic). As expected, pointing performance was best in the unperturbed condition. The spatial pointing error dramatically worsened during both unimodal and most bimodal conditions. However, in the bimodal condition in which both disturbances were in phase, adaptation was very fast and kinematic performance indicators approached the values of the unperturbed condition. Conclusions: This result suggests that subjects learned to exploit an “affordance” made available by the invariant phase relation between the visual and kinesthetic frames. It seems that after detecting such invariance, subjects used the kinesthetic input as an informative signal rather than a disturbance, in order to compensate the visual rotation without going through the lengthy process of building an internal adaptation model. Practical implications are discussed as regards the design of advanced, high-performance man-machine interfaces.

    The Inactivation Principle: Mathematical Solutions Minimizing the Absolute Work and Biological Implications for the Planning of Arm Movements

    An important question in the literature focusing on motor control is to determine which laws drive biological limb movements. This question has prompted numerous investigations analyzing arm movements in both humans and monkeys. Many theories assume that among all possible movements the one actually performed satisfies an optimality criterion. In the framework of optimal control theory, a first approach is to choose a cost function and test whether the proposed model fits with experimental data. A second approach (generally considered as the more difficult) is to infer the cost function from behavioral data. The cost proposed here includes a term called the absolute work of forces, reflecting the mechanical energy expenditure. Contrary to most investigations studying optimality principles of arm movements, this model has the particularity of using a cost function that is not smooth. First, a mathematical theory related to both direct and inverse optimal control approaches is presented. The first theoretical result is the Inactivation Principle, according to which minimizing a term similar to the absolute work implies simultaneous inactivation of agonistic and antagonistic muscles acting on a single joint, near the time of peak velocity. The second theoretical result is that, conversely, the presence of non-smoothness in the cost function is a necessary condition for the existence of such inactivation. Second, during an experimental study, participants were asked to perform fast vertical arm movements with one, two, and three degrees of freedom. Observed trajectories, velocity profiles, and final postures were accurately simulated by the model. In accordance, electromyographic signals showed brief simultaneous inactivation of opposing muscles during movements. Thus, assuming that human movements are optimal with respect to a certain integral cost, the minimization of an absolute-work-like cost is supported by experimental observations. Such types of optimality criteria may be applied to a large range of biological movements.

    Attentive Learning of Sequential Handwriting Movements: A Neural Network Model

    Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409, N00014-92-J-1309); National Science Foundation (IRI-97-20333); National Institutes of Health (I-R29-DC02952-01)

    Evidence for Composite Cost Functions in Arm Movement Planning: An Inverse Optimal Control Approach

    An important issue in motor control is understanding the basic principles underlying the accomplishment of natural movements. According to optimal control theory, the problem can be stated in these terms: what cost function do we optimize to coordinate the many more degrees of freedom than necessary to fulfill a specific motor goal? This question has not received a final answer yet, since what is optimized partly depends on the requirements of the task. Many cost functions were proposed in the past, and most of them were found to be in agreement with experimental data. Therefore, the actual principles on which the brain relies to achieve a certain motor behavior are still unclear. Existing results might suggest that movements are not the results of the minimization of single but rather of composite cost functions. In order to better clarify this last point, we consider an innovative experimental paradigm characterized by arm reaching with target redundancy. Within this framework, we make use of an inverse optimal control technique to automatically infer the (combination of) optimality criteria that best fit the experimental data. Results show that the subjects exhibited a consistent behavior during each experimental condition, even though the target point was not prescribed in advance. Inverse and direct optimal control together reveal that the average arm trajectories were best replicated when optimizing the combination of two cost functions, nominally a mix between the absolute work of torques and the integrated squared joint acceleration. Our results thus support the cost combination hypothesis and demonstrate that the recorded movements were closely linked to the combination of two complementary functions related to mechanical energy expenditure and joint-level smoothness
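As a hedged illustration of what a composite cost of this kind could look like, the two criteria named in the abstract above can be written as a weighted sum. The convex weighting with a single parameter $\alpha$, the symbols $\tau_i$ (joint torque), $\theta_i$ (joint angle), and $T$ (movement duration), and the absence of any normalization are assumptions here, not the paper's stated formulation.

```latex
C_\alpha
  = \alpha \int_0^T \sum_i \left| \tau_i(t)\,\dot{\theta}_i(t) \right| \, dt
  \;+\; (1-\alpha) \int_0^T \sum_i \ddot{\theta}_i(t)^2 \, dt ,
\qquad 0 \le \alpha \le 1 .
```

The first integral is the absolute work of torques (mechanical energy expenditure) and the second is the integrated squared joint acceleration (joint-level smoothness). Under this reading, inverse optimal control amounts to estimating the weighting that best reproduces the recorded trajectories.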

    Performance breakdown effects dissociate from error detection effects in typing

    Mistakes in skilled performance are often observed to be slower than correct actions. This error slowing has been associated with cognitive control processes involved in performance monitoring and error detection. A limited literature on skilled actions, however, suggests that preerror actions may also be slower than accurate actions. This contrasts with findings from unskilled, discrete trial tasks, where preerror performance is usually faster than accurate performance. We tested three predictions about error-related behavioural changes in continuous typing performance. We asked participants to type 100 sentences without visual feedback. We found that (a) performance before errors was no different in speed than that before correct key-presses, (b) error and posterror key-presses were slower than matched correct key-presses, and (c) errors were preceded by greater variability in speed than were matched correct key-presses. Our results suggest that errors are preceded by a behavioural signature, which may indicate breakdown of fluid cognition, and that the effects of error detection on performance (error and posterror slowing) can be dissociated from breakdown effects (preerror increase in variability). © 2013 The Experimental Psychology Society

    Synchrony of hand-foot coupled movements: is it attained by mutual feedback entrainment or by independent linkage of each limb to a common rhythm generator?

    BACKGROUND: Synchrony of coupled oscillations of ipsilateral hand and foot may be achieved by controlling the interlimb phase difference through a crossed kinaesthetic feedback between the two limbs, or by an independent linkage of each limb cycle to a common clock signal. These alternative models may be experimentally challenged by comparing the behaviour of the two limbs when they oscillate following an external time giver, either alone or coupled together. RESULTS: Ten subjects oscillated their right hand and foot both alone and coupled (iso- or antidirectionally), paced by a metronome. Wrist and ankle angular position and electromyograms (EMG) from the respective flexor and extensor muscles were recorded. Three phase delays were measured: i) the clk-mov delay, between the clock (metronome beat) and the oscillation peak; ii) the neur (neural) delay, between the clock and the motoneurone excitatory input, as inferred from the EMG onset; and iii) the mech (mechanical) delay between the EMG onset and the corresponding point of the limb oscillation. During uncoupled oscillations (0.4 Hz to 3.0 Hz), the mech delay increased from -7° to -111° (hand) and from -4° to -83° (foot). In contrast, the clk-mov delay remained constant and close to zero in either limb, since a progressive advance of the motoneurone activation on the pacing beat (neur advance) compensated for the increasing mech delay. Adding an inertial load to either extremity induced a frequency-dependent increase of the limb mechanical delay that could not be completely compensated by the increase of the neural phase advance, resulting in a frequency-dependent increment of clk-mov delay of the hampered limb. When limb oscillations were iso- or antidirectionally coupled, either in the loaded or unloaded condition, the three delays did not significantly change with respect to values measured when limbs were moved separately. CONCLUSION: The absence of any significant effect of limb coupling on the measured delays suggests that during hand-foot oscillations, both iso- and antidirectionally coupled, each limb is synchronised to the common rhythm generator by a "private" position control, with no need for a crossed feedback interaction between limbs.
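The relation between the three delays described above can be sketched in a few lines. This is only a reading of the abstract, not the authors' analysis code: the additive relation (clk-mov as the sum of the neur and mech phases), the sign convention (negative for a lag, positive for an advance), and both function names are assumptions.

```python
def time_to_phase_deg(delay_s, freq_hz):
    """Express a time delay (seconds) as phase (degrees) at the oscillation frequency."""
    return 360.0 * freq_hz * delay_s

def clk_mov_delay(neur_deg, mech_deg):
    """Clock-to-movement phase, assumed to be the neural phase plus the mechanical phase."""
    return neur_deg + mech_deg

# A neural advance of +111 degrees exactly offsets a mechanical lag of -111 degrees,
# leaving the movement in phase with the metronome.
print(clk_mov_delay(111.0, -111.0))  # 0.0
```

Under this reading, an inertial load that increases the mechanical lag faster than the neural advance can grow would leave a nonzero clk-mov phase, as the abstract reports for the loaded limb.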

    The Temporal Structure of Vertical Arm Movements

    The present study investigates how the CNS deals with the omnipresent force of gravity during arm motor planning. Previous studies have reported direction-dependent kinematic differences in the vertical plane; notably, acceleration duration was greater during a downward than an upward arm movement. Although the analysis of acceleration and deceleration phases has made it possible to explore the integration of gravity force, further investigation is necessary to conclude whether feedforward or feedback control processes are at the origin of this incorporation. We considered that a more detailed analysis of the temporal features of vertical arm movements could provide additional information about gravity force integration into motor planning. Eight subjects performed single-joint vertical arm movements (45° rotation around the shoulder joint) in two opposite directions (upwards and downwards) and at three different speeds (slow, natural and fast). We calculated different parameters of hand acceleration profiles: movement duration (MD), duration to peak acceleration (D PA), duration from peak acceleration to peak velocity (D PA-PV), duration from peak velocity to peak deceleration (D PV-PD), duration from peak deceleration to the movement end (D PD-End), acceleration duration (AD), deceleration duration (DD), peak acceleration (PA), peak velocity (PV), and peak deceleration (PD). While movement durations and amplitudes were similar for upward and downward movements, the temporal structure of acceleration profiles differed between the two directions. More specifically, subjects performed upward movements faster than downward movements; these direction-dependent asymmetries appeared early in the movement (i.e., before PA) and lasted until the moment of PD. Additionally, PA and PV were greater for upward than downward movements. Movement speed also changed the temporal structure of acceleration profiles. The effect of speed and direction on the form of acceleration profiles is consistent with the premise that the CNS optimises motor commands with respect to both gravitational and inertial constraints.

    Comparison of the haptic and visual deviations in a parallelity task

    Deviations in both haptic and visual spatial experiments are thought to be caused by a biasing influence of an egocentric reference frame. The strength of this influence is strongly participant-dependent. By using a parallelity test, it is studied whether this strength is modality-independent. In both haptic and visual conditions, large, systematic and participant-dependent deviations were found. However, although the correlation between the haptic and visual deviations was significant, the explained variance due to a common factor was only 20%. Therefore, the degree to which a participant is “egocentric” depends on modality and possibly even more generally, on experimental condition
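To put the 20% figure in perspective, a short worked step helps. It assumes that "explained variance" here means the squared Pearson correlation $r^2$ between the haptic and visual deviations, which the abstract does not state explicitly.

```latex
r^2 = 0.20
\quad\Longrightarrow\quad
|r| = \sqrt{0.20} \approx 0.45 ,
\qquad
1 - r^2 = 0.80 .
```

Under that assumption, a correlation of roughly 0.45 can be statistically significant while still leaving about 80% of the variance in one modality unaccounted for by the other, which is the basis for the abstract's conclusion.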

    Fast and fine-tuned corrections when the target of a hand movement is displaced

    To study the strategy in responding to target displacements during fast goal-directed arm movements, we examined how quickly corrections are initiated and how vigorously they are executed. We perturbed the target position at various moments before and after movement initiation. Corrections to perturbations before the movement started were initiated with the same latency as corrections to perturbations during the movement. Subjects also responded as quickly to a second perturbation during the same reach, even if the perturbations were only separated by 60 ms. The magnitude of the correction was minimized with respect to the time remaining until the end of the movement. We conclude that despite being executed after a fixed latency, these fast corrections are not stereotyped responses but are suited to the circumstances