3,141 research outputs found

    A New Data Source for Inverse Dynamics Learning

    Full text link
    Modern robotics is gravitating toward increasingly collaborative human-robot interaction. Tools such as acceleration policies can naturally support the realization of reactive, adaptive, and compliant robots. These tools require us to model the system dynamics accurately -- a difficult task. The fundamental problem remains that simulation and reality diverge -- we do not know how to accurately change a robot's state. Thus, recent research on improving inverse dynamics models has focused on making use of machine learning techniques. Traditional learning techniques train on the actually realized accelerations instead of the policy's desired accelerations, which is an indirect data source. Here we show how an additional training signal -- measured at the desired accelerations -- can be derived from a feedback control signal. This effectively creates a second data source for learning inverse dynamics models. Furthermore, we show how both the traditional and this new data source can be used to train task-specific inverse dynamics models, either independently or combined. We analyze the use of both data sources in simulation and demonstrate their effectiveness on a real-world robotic platform. We show that our system incrementally improves the learned inverse dynamics model, and that when both data sources are combined it converges faster and more consistently.
    Comment: IROS 201
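    The two data sources can be made concrete with a short sketch. The following Python snippet is a hedged illustration, not the paper's implementation: it assumes a computed-torque style controller whose logs contain, for each time step, the state (q, qd), the desired and realized accelerations, and the total applied torque (learned feedforward plus feedback). All names are placeholders.

```python
import numpy as np

def collect_training_pairs(log):
    """Build the two data sources for inverse dynamics learning.

    log: iterable of dicts with keys q, qd, qdd_des, qdd_real, tau_applied,
         where tau_applied = model feedforward + feedback at execution time.
    """
    X_indirect, Y_indirect = [], []   # traditional source: realized accelerations
    X_direct, Y_direct = [], []       # new source: desired accelerations

    for s in log:
        # Data source 1 (indirect): the applied torque produced qdd_real, so it
        # is a valid inverse-dynamics target at the *realized* acceleration.
        X_indirect.append(np.hstack([s["q"], s["qd"], s["qdd_real"]]))
        Y_indirect.append(s["tau_applied"])

        # Data source 2 (direct): the feedback term corrects the feedforward
        # model at the *desired* acceleration, so feedforward + feedback is
        # taken as the torque that should realize qdd_des.
        X_direct.append(np.hstack([s["q"], s["qd"], s["qdd_des"]]))
        Y_direct.append(s["tau_applied"])

    return (np.array(X_indirect), np.array(Y_indirect),
            np.array(X_direct), np.array(Y_direct))
```

    Either dataset, or their union, can then be fed to whatever regressor serves as the inverse dynamics model, matching the abstract's observation that the two sources can be used independently or combined.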

    Adaptation and Learning for Manipulators and Machining

    Get PDF
    This thesis presents methods for improving the accuracy and efficiency of tasks performed using different kinds of industrial manipulators, with a focus on the application of machining. Industrial robots offer a flexible and cost-efficient alternative to machine tools for machining, but cannot achieve comparably high accuracy out of the box. This is mainly caused by non-ideal properties in the robot joints, such as backlash and compliance, in combination with the strong process forces that affect the robot during machining operations. In this thesis, three different approaches to improving the robotic machining accuracy are presented. First, a macro/micro-manipulator approach is considered, where an external compensation mechanism is used in combination with the robot to compensate high-frequency Cartesian errors. Two different milling scenarios are evaluated, where a significant increase in accuracy was obtained; the accuracy specification of 50 µm was reached in both scenarios. Because of the limited workspace and the higher bandwidth of the compensation mechanism compared to the robot, two different mid-ranging approaches for control of the relative position between the robot and the compensator are developed and evaluated. Second, modeling and identification of robot joints is considered. The proposed method relies on clamping the manipulator end effector and actuating the joints while measuring joint motor torque and motor position. The joint stiffness and backlash can subsequently be extracted from the measurements, to be used for compensation of the deflections that occur during machining. Third, a model-based iterative learning control (ILC) approach is proposed, where feedback is provided from three different sensors of varying investment costs. Using position measurements from an optical tracking system, an error decrease of up to 84 % was obtained. Measurements of end-effector forces yielded an error decrease of 55 %, and a force-estimation method based on joint motor torques decreased the error by 38 %. ILC methods are further investigated for a different kind of manipulator, a marine vibrator, for the application of marine seismic acquisition. A frequency-domain ILC strategy is proposed in order to attenuate undesired overtones and improve the tracking accuracy. The harmonics were suppressed after approximately 20 iterations of the ILC algorithm, and the absolute tracking error was reduced by a factor of approximately 50. The final problem considered in this thesis concerns increasing the efficiency of machining tasks by minimizing cycle times. A force-control approach is proposed to maximize the feed rate, and a learning algorithm is employed to plan the machining path for the case of machining in non-isotropic materials, such as wood. The cycle time was decreased by 14 % with the use of force control, and on average an additional 28 % decrease was achieved by use of the learning algorithm. Furthermore, by means of reinforcement learning, the path-planning algorithm is refined to provide optimal solutions and to incorporate an increased number of machining directions.
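    The ILC schemes above (model-based ILC for machining and frequency-domain ILC for the marine vibrator) share the same iteration structure. Below is a minimal, hedged sketch of a generic first-order ILC update; the gain and filter are illustrative placeholders, not the thesis's tuned values.

```python
import numpy as np

def ilc_update(u, e, learning_gain=0.5, q_filter=None):
    """One ILC iteration: u_{k+1} = Q(u_k + L * e_k).

    u: feedforward signal applied in the last trial (1-D array over time)
    e: measured tracking error from that trial (same length as u)
    """
    u_next = u + learning_gain * e          # L-filter: learn from the last error
    if q_filter is not None:                # Q-filter: low-pass for robustness
        u_next = np.convolve(u_next, q_filter, mode="same")
    return u_next

# Illustrative usage with a hypothetical run_trial() that executes the motion
# and returns the tracking error:
#   u = np.zeros(n_samples)
#   for k in range(20):
#       e = run_trial(u)
#       u = ilc_update(u, e, q_filter=np.ones(5) / 5)
```

    A frequency-domain variant, as used for the marine vibrator, would apply the same kind of update to the Fourier coefficients of u and e, which makes it natural to target specific overtones.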

    Model learning for trajectory tracking of robot manipulators

    Get PDF
    Model-based controllers have drastically improved robot performance, increasing task accuracy while reducing control effort. Nevertheless, all this relies on a very strong assumption: exact knowledge of the physical properties of both the robot and the environment that surrounds it. This assumption is often violated: modern robots are modeled only approximately and, more importantly, the environment is almost never static and fully known. Even for very simple systems, such as robot manipulators, these assumptions are too strong and must be relaxed. Many methods have been developed that exploit previous experience to refine the nominal model, ranging from classical identification techniques to more modern machine-learning-based approaches. The topic of this thesis is the investigation of these data-driven techniques in the context of robot control for trajectory tracking. The first two chapters provide preliminary knowledge on both model-based controllers, used in robotics to ensure precise trajectory tracking, and model learning techniques. The following three chapters present the novelties introduced by the author with respect to the state of the art: three works sharing the same premise (an inaccurate system model) and the same goal (accurate trajectory tracking control), but differing according to the specific platform of application (fully actuated, underactuated, and redundant robots). In all the considered architectures, an online learning scheme is introduced to correct the nominal feedback linearization control law. The method was first introduced for fully actuated systems, showing its efficacy in the accurate tracking of joint-space trajectories even with an inaccurate dynamic model. The main novelty of the technique is the use of kinematic information only, instead of torque measurements (which are generally very noisy), to retrieve and compensate the dynamic mismatch online. The method is then extended to underactuated robots. This architecture combines an online learning correction of the controller, acting on the actuated part of the system (the nominal partial feedback linearization), with an offline planning phase required to obtain a trajectory that is dynamically feasible also for the zero dynamics of the system. The scheme is iterative: after each trial, both phases are improved using the collected information and repeated until the task is achieved. Also in this case the method showed its capability, both in numerical simulations and in real experiments on a robotic platform. Finally, the method is applied to redundant systems: differently from before, the task here consists in the accurate tracking of a Cartesian end-effector trajectory. Although in principle very similar to the fully actuated case, the presence of redundancy drastically slows down the convergence of the learning machinery, worsening the performance. To cope with this, a redundancy resolution is proposed that exploits an approximation of the learning algorithm (Gaussian process regression) to locally maximize the information and thus select the most convenient self-motion for the system; moreover, all of this is realized by solving just a quadratic programming problem. Also in this case the method proved effective, realizing accurate online tracking while reducing both the control effort and the joint velocities, thereby obtaining a more natural behaviour. The thesis concludes with summary considerations on the proposed approach and with possible future directions of research.
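    The central control law described above (feedback linearization plus an online learned correction driven by kinematic information only) can be sketched as follows. This is a hedged reading of the abstract, not the thesis's exact scheme: the regressor, the gains, and the way a torque-level target is built from the acceleration error are all illustrative assumptions.

```python
import numpy as np

def fl_control_with_correction(q, qd, q_des, qd_des, qdd_des,
                               M_hat, n_hat, correction, Kp=100.0, Kd=20.0):
    """Feedback-linearization law with an additive learned correction term.

    Returns (commanded torque, outer-loop reference acceleration v).
    """
    e, ed = q_des - q, qd_des - qd
    v = qdd_des + Kp * e + Kd * ed                 # outer-loop reference acceleration
    tau_nominal = M_hat(q) @ v + n_hat(q, qd)      # nominal inverse dynamics
    return tau_nominal + correction(q, qd, v), v   # learned term compensates mismatch

def kinematic_update(model, q, qd, v_cmd, qdd_meas, M_hat):
    """Online update using kinematic data only (no torque measurements).

    One plausible target: map the acceleration error through the nominal
    inertia to obtain the torque the nominal model failed to provide.
    """
    target = M_hat(q) @ (v_cmd - qdd_meas)
    model.update(np.hstack([q, qd, v_cmd]), target)   # hypothetical online regressor
```

    For the underactuated and redundant cases described above, the same correction would act only on the actuated coordinates or be combined with a redundancy resolution, respectively.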

    Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

    Get PDF
    The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time.
    Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL
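    The abstract's key ingredient is a dynamics model whose mean is a parameterized black-box prior (for example, a simulator with tunable physical parameters) and whose residuals are learned from data. The sketch below is a hedged illustration with simplified choices (a coarse search over candidate prior parameters and a fixed-hyperparameter Gaussian-process-style residual), not the Black-DROPS implementation; all names are placeholders.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0, sf2=1.0):
    """Squared-exponential kernel between row-stacked inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell ** 2)

class PriorPlusResidualModel:
    """Predicted next state = prior(x, u, theta) + learned residual."""

    def __init__(self, prior, theta, noise=1e-4):
        self.prior, self.theta, self.noise = prior, theta, noise
        self.Z = None            # training inputs [x, u]
        self.alpha = None        # precomputed weights for residual prediction

    def fit(self, X, U, X_next, candidate_thetas):
        # (1) Pick the prior parameters that best explain the observed transitions.
        def mse(th):
            pred = np.array([self.prior(x, u, th) for x, u in zip(X, U)])
            return np.mean((X_next - pred) ** 2)
        self.theta = min(candidate_thetas, key=mse)
        # (2) Fit the residual model on what the tuned prior cannot explain.
        prior_pred = np.array([self.prior(x, u, self.theta) for x, u in zip(X, U)])
        R = X_next - prior_pred
        self.Z = np.hstack([X, U])
        K = rbf_kernel(self.Z, self.Z) + self.noise * np.eye(len(self.Z))
        self.alpha = np.linalg.solve(K, R)

    def predict(self, x, u):
        mean = self.prior(x, u, self.theta)
        if self.Z is None:
            return mean
        z = np.hstack([x, u])[None, :]
        return mean + np.ravel(rbf_kernel(z, self.Z) @ self.alpha)
```

    Black-DROPS itself tunes the prior parameters and the model hyperparameters jointly and optimizes the policy with a black-box optimizer against this model; the coarse parameter search here is only a stand-in to keep the sketch short.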

    CAD enabled trajectory optimization and accurate motion control for repetitive tasks

    Get PDF
    As machine users generally only define the start and end point of the movement, a large trajectory optimization potential arises for single-axis mechanisms performing repetitive tasks. However, a descriptive mathematical model of the mechanism needs to be defined in order to apply existing optimization techniques. This is usually done with complex methods like virtual work or Lagrange equations. In this paper, a generic technique is presented to optimize the design of point-to-point trajectories by extracting position-dependent properties with CAD motion simulations. The optimization problem is solved by a genetic algorithm. Nevertheless, the potential savings will only be achieved if the machine is capable of accurately following the optimized trajectory. Therefore, a feedforward motion controller is derived from the generic model, allowing the controller to be used for various settings and position profiles. Moreover, the theoretical savings are compared with experimental data from a physical set-up. The results quantitatively show that the savings potential is effectively achieved thanks to advanced torque feedforward, with a reduction of the maximum torque by 12.6% compared with a standard 1/3 profile.
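    A hedged sketch of the feedforward idea: for a single-axis mechanism, the position-dependent inertia and load torque sampled from CAD motion simulations can be stored as lookup tables and evaluated along the optimized profile. The model form and all numerical values below are illustrative assumptions, not the paper's identified parameters.

```python
import numpy as np

def feedforward_torque(theta, theta_d, theta_dd, theta_grid, J_table, g_table):
    """Feedforward torque for a 1-DOF mechanism with position-dependent inertia.

    theta_grid, J_table, g_table: samples of position, inertia J(theta), and
    load torque g(theta) extracted from CAD motion simulations.
    """
    J = np.interp(theta, theta_grid, J_table)
    dJ_dtheta = np.interp(theta, theta_grid, np.gradient(J_table, theta_grid))
    g = np.interp(theta, theta_grid, g_table)
    # Lagrangian model for kinetic energy 0.5 * J(theta) * theta_d**2:
    #   tau = J(theta)*theta_dd + 0.5*J'(theta)*theta_d**2 + g(theta)
    return J * theta_dd + 0.5 * dJ_dtheta * theta_d ** 2 + g
```

    In the same spirit, a genetic algorithm could score each candidate point-to-point profile by, for example, the peak or RMS value of this feedforward torque, which is how position-dependent CAD data becomes an optimization objective.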

    Sim-to-Real Learning of Robust Compliant Bipedal Locomotion on Torque Sensor-Less Gear-Driven Humanoid

    Full text link
    In deep reinforcement learning, sim-to-real is the mainstream approach, as training requires a large number of trials; however, it is challenging to transfer the trained policy because of the reality gap. In particular, the characteristics of the actuators in legged robots are known to have a considerable influence on the reality gap, and this is especially noticeable with high-reduction-ratio gears. Therefore, we propose a new simulation model of high-reduction-ratio gears to reduce the reality gap. The instability of bipedal locomotion causes the sim-to-real transfer to fail catastrophically, making system identification of the physical parameters of the simulation difficult. Thus, we also propose a system identification method that utilizes the failure experience. The realistic simulations obtained by these improvements allow the robot to acquire compliant bipedal locomotion through reinforcement learning. The effectiveness of the method is verified using an actual biped robot, ROBOTIS-OP3, and the sim-to-real transferred policy achieved stabilization of the robot under severe disturbances and walking on uneven terrain without force and torque sensors.
    Comment: 8 pages. An accompanying video is available at the following link: https://youtu.be/fZWQq9yAYe
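    One common way to capture the gear effects mentioned above in simulation is a flexible joint with backlash and friction between the motor side and the link side. The sketch below is a generic model of that kind, offered as a hedged illustration only; the structure and every parameter value are placeholders to be identified, and the paper's own gear model and failure-based identification are not reproduced here.

```python
import numpy as np

def gear_joint_torque(q_motor, q_link, qd_motor, qd_link,
                      ratio=200.0, k=80.0, d=0.5, backlash=0.002,
                      coulomb=0.2, viscous=0.05):
    """Torque transmitted to the link through a high-reduction-ratio gear.

    Models the transmission as a spring-damper with a dead zone (backlash)
    plus Coulomb/viscous friction on the link side. All values are placeholders.
    """
    # Deflection of the transmission seen on the link side.
    delta = q_motor / ratio - q_link
    delta = np.sign(delta) * max(abs(delta) - backlash / 2.0, 0.0)  # dead zone
    tau_spring = k * delta + d * (qd_motor / ratio - qd_link)
    tau_friction = coulomb * np.sign(qd_link) + viscous * qd_link
    return tau_spring - tau_friction
```

    In a system-identification loop, parameters such as k, backlash, and the friction terms would be tuned so that simulated rollouts reproduce the logged behaviour, including the failures observed during unsuccessful sim-to-real transfers.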