4 research outputs found

    Learning Utility Surfaces for Movement Selection

    Humanoid robots are highly redundant systems with respect to the tasks they are asked to perform. This redundancy manifests itself in the number of degrees of freedom of the robot exceeding the dimensionality of the task. Traditionally this redundancy has been utilised through optimal control in the null-space. Some cost function is defined that encodes secondary movement goals and movements are optimised with respect to this function, subject to fulfilment of task constraints.

    Controlling and Learning Constrained Motions for Manipulation in Contact

    Many practical tasks in robotic systems involving contact interaction with the environment, such as cleaning windows, writing or grasping, are inherently constrained, in that both the task and the environment impose constraints on the robot’s motion. While constraints from manipulation motions in contact represent a challenge when modelling and controlling such robotic systems, they might also be an opportunity, if exploited for decomposing complex controllers into simpler ones that are easier to design, implement, test and even learn from data. Modelling such systems requires incorporating these constraints in the robot’s dynamic model. In this thesis, I define the class of Task-based Constraints (TbCs) and prove that the forward dynamic models of a constrained system obtained through the Projected Dynamics (PD) and the Operational Space Formulation (OSF) are equivalent. Establishing such equivalence required: reformulating the PD constraint inertia matrix, generalising all its previous distinct algebraic variations; and generalising the OSF to rank deficient constraint Jacobian matrices. This generalization allows us to numerically handle redundant constraints and singular configurations, without having to use different controllers in the vicinity of such configurations. Furthermore, I show that we can recover both operational space control with constraints and the hybrid position/force control in the operational space from a multiple Task-based Constraint abstraction. I then propose a control and trajectory tracking approach for wiping the train cab front panels, using a velocity controlled robotic manipulator and a force/torque sensor attached to its end-effector, without using any surface model or vision-based surface detection. 
The control strategy consists of a hybrid position/force controller, adapted from the Operational Space Formulation, that aligns the cleaning tool with the surface normal, maintaining a setpoint normal force, while simultaneously moving along the surface. The trajectory tracking strategy consists of specifying and tracking a two-dimensional path that, when projected onto the train surface, corresponds to the desired pattern of motion. An experiment with the Baxter robot to wipe a highly curved surface with both spiral and raster scan motion patterns validates the approach. I also implemented the same approach in a scaled robot prototype, specifically designed to wipe a 1/8-scale version of a train cab front, using a raster scan pattern. Learning these types of control policies subject to constraints is a challenging problem. This thesis proposes a Constraint-aware Policy Learning (CaPL) method that solves the policy learning problem on redundant robots which execute a policy acting in the null-space of a constraint. This learning approach allows the generalization of learnt control policies across constraints that are unknown during the training phase. The CaPL method splits the combined problem of learning constraints and policies into two steps: first estimating the constraint, and then estimating an unconstrained policy using the remaining degrees of freedom. For a linear parametrization, there is a closed-form solution for the problem of estimating constraints based on Singular Value Decomposition (SVD). In this thesis, I propose another closed-form solution for constraint estimation for the TbC case, which includes estimating the task component without affecting the norm of the constraint matrix, based on Generalized Singular Value Decomposition (GSVD). I also discuss a metric for comparing the similarity of estimated constraints, which is useful to pre-process the trajectories recorded in the demonstrations.
An experiment that consists of learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface, using a force/torque based controller to achieve tool alignment, validates the CaPL method. Despite the differences between the training and validation scenarios, the learnt policy still provides the desired wiping motion.
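The abstract above mentions a closed-form, SVD-based solution for estimating constraints under a linear parametrization. The following is only a minimal sketch of that general idea, not the thesis's CaPL or GSVD method. It assumes a constant constraint matrix `A`, observed motions `u` that satisfy `A @ u = 0` exactly, and a known number of constraints; the function name `estimate_constraint` and the synthetic data are hypothetical.

```python
import numpy as np


def estimate_constraint(U, n_constraints):
    """Estimate an orthonormal basis for the constraint rows.

    U is an (n_samples, d) array of observed motions, each assumed to lie
    in the null-space of an unknown constant constraint matrix A.
    The directions the data never excites (smallest singular values)
    are taken as the estimated rows of A, up to sign and rotation.
    """
    _, _, Vt = np.linalg.svd(U, full_matrices=True)
    return Vt[-n_constraints:]


# Synthetic check (assumed setup): motion is forbidden along the z-axis.
rng = np.random.default_rng(0)
A_true = np.array([[0.0, 0.0, 1.0]])
N = np.eye(3) - np.linalg.pinv(A_true) @ A_true  # null-space projector of A_true
U = rng.normal(size=(50, 3)) @ N.T               # observations with A_true @ u = 0
A_hat = estimate_constraint(U, n_constraints=1)
```

The estimate is only defined up to sign (and, for several constraints, up to a rotation of the rows), so a comparison with the true constraint should look at the spanned subspace rather than at individual entries.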

    Learning Utility Surfaces for Movement Selection

    Humanoid robots are highly redundant systems with respect to the tasks they are asked to perform. This redundancy manifests itself in the number of degrees of freedom of the robot exceeding the dimensionality of the task. Traditionally this redundancy has been utilised through optimal control in the null-space. Some cost function is defined that encodes secondary movement goals and movements are optimised with respect to this function, subject to fulfilment of task constraints. Until now design of cost functions has been carried out on an ad-hoc basis and has required time-consuming hand-tuning to ensure that the desired (or acceptable) behaviour is realised. Here we present a novel approach for designing cost functions for optimal control in the null-space by exploiting recent advances in statistical machine learning. The behaviour of a (kinematically or dynamically controlled) mechanical system performing some task is observed and separated into task- and null-space components. The null-space component is then modelled as a first-order differential equation with the cost as the independent variable. Numerical solution of this equation provides training data for a statistical learning algorithm that is used to build an open-form model of the cost function. Results are presented in which the reconstructed function is used to replace that of the original control scheme and the resultant behaviour, for the same set of tasks, is compared.
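The separation of observed behaviour into task- and null-space components can be illustrated for the kinematic case. This is a hedged sketch using the standard pseudoinverse-based projection, which the abstract does not confirm as the authors' exact procedure; the function name `split_task_null` and the toy Jacobian are assumptions made for illustration.

```python
import numpy as np


def split_task_null(J, qdot):
    """Split a joint velocity into task-space and null-space parts.

    J is the (m, n) task Jacobian with m < n for a redundant system.
    The task part is the projection onto the row space of J; the
    null-space part produces no task-space motion (J @ null is zero).
    """
    P_task = np.linalg.pinv(J) @ J          # projector onto the row space of J
    P_null = np.eye(J.shape[1]) - P_task    # projector onto the null-space of J
    return P_task @ qdot, P_null @ qdot


# Toy example (assumed): a 3-joint system whose task uses the first two joints.
J = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
qdot = np.array([1.0, 2.0, 3.0])
task_part, null_part = split_task_null(J, qdot)
```

The two parts sum back to the original velocity, and only the null-space part would carry information about a secondary cost function under this decomposition.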