11 research outputs found

    Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks

    Full text link
    In order to robustly execute a task under environmental uncertainty, a robot needs to be able to reactively adapt to changes arising in its environment. Environment changes are usually reflected in deviations from expected sensory traces. These deviations can be used to drive the motion adaptation, and for this purpose a feedback model is required: it maps deviations in sensory traces to adaptations of the motion plan. In this paper, we develop a general data-driven framework for learning a feedback model from demonstrations. We utilize a variant of a radial basis function network structure, with movement phases as kernel centers, which can generally be applied to represent any feedback model for movement primitives. To demonstrate the effectiveness of our framework, we test it on the task of scraping on a tilt board, learning a reactive policy in the form of orientation adaptation based on deviations of tactile sensor traces. As a proof of concept of our method, we provide evaluations on an anthropomorphic robot. A video demonstrating our approach and its results can be seen at https://youtu.be/7Dx5imy1Kcw

    Comment: 8 pages, accepted to be published at the International Conference on Robotics and Automation (ICRA) 2018
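
    As a rough illustration of the kind of feedback model the abstract describes (a radial basis function network with movement phases as kernel centers, mapping sensor-trace deviations to a motion adaptation), consider the following minimal Python sketch. The class name, dimensions, kernel width, and the least-squares fit are assumptions made for illustration, not details taken from the paper.

        import numpy as np

        class PhaseRBFFeedback:
            """Feedback model sketch: phase-gated RBF features -> motion adaptation."""

            def __init__(self, n_kernels=25, n_sensors=6, n_outputs=3):
                self.centers = np.linspace(0.0, 1.0, n_kernels)  # movement phase in [0, 1]
                self.width = float(n_kernels) ** 2               # illustrative bandwidth
                self.W = np.zeros((n_outputs, n_kernels * n_sensors))

            def features(self, phase, sensor_deviation):
                # Gaussian kernels centered on movement phases...
                psi = np.exp(-self.width * (phase - self.centers) ** 2)
                psi /= psi.sum()
                # ...gated by the deviation of the sensor trace from its expectation.
                return np.outer(psi, sensor_deviation).ravel()

            def adaptation(self, phase, sensor_deviation):
                # Maps a sensory deviation to a motion-plan adaptation (e.g. orientation).
                return self.W @ self.features(phase, sensor_deviation)

            def fit(self, phases, deviations, targets):
                # Supervised fit from demonstrations via regularized least squares.
                Phi = np.stack([self.features(p, d) for p, d in zip(phases, deviations)])
                reg = 1e-6 * np.eye(Phi.shape[1])
                self.W = np.linalg.solve(Phi.T @ Phi + reg, Phi.T @ np.asarray(targets)).T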

    Planning surface cleaning tasks by learning uncertain drag actions outcomes

    Get PDF
    A method to perform cleaning tasks is presented where a robot manipulator autonomously grasps a textile and uses different dragging actions to clean a surface. Actions are imprecise, and probabilistic planning is used to select the best sequence of actions. The characterization of such actions is complex because the initial autonomous grasp of the textile introduces differences in the initial conditions that change the efficacy of the robot cleaning actions. We demonstrate that the action outcome probabilities can be learned very quickly while the task is being executed, so as to progressively improve robot performance. The learner adds only a little overhead to the system compared to the improvements obtained. Experiments with a real robot show that the most effective plan varies depending on the initial grasp, and that plans become better after only a few learning iterations.
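
    The fast, in-execution learning of action outcome probabilities that the abstract reports can be pictured as Dirichlet-style counting: each observed outcome immediately shifts the estimate the planner uses. A minimal Python sketch follows; the action and outcome names are invented for illustration, and the paper's planner and state model are not reproduced.

        from collections import defaultdict

        class OutcomeLearner:
            def __init__(self, prior=1.0):
                # counts[action][outcome] starts at a uniform pseudo-count (the prior).
                self.counts = defaultdict(lambda: defaultdict(lambda: prior))

            def update(self, action, observed_outcome):
                # A single observation already shifts the estimate: learning is fast.
                self.counts[action][observed_outcome] += 1.0

            def probability(self, action, outcome):
                c = self.counts[action][outcome]  # creates the prior entry if unseen
                total = sum(self.counts[action].values())
                return c / total

        learner = OutcomeLearner()
        learner.update("drag_left", "dirt_removed")
        learner.update("drag_left", "dirt_spread")
        print(learner.probability("drag_left", "dirt_removed"))  # 0.5 with this data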

    Development of lower-limb rehabilitation exercises using 3-PRS Parallel Robot and Dynamic Movement Primitives

    Full text link
    The design of rehabilitation exercises applied to sprained ankles requires extreme caution regarding the trajectories and the speed of the movements that will affect the patient. This paper presents a technique that allows a 3-PRS parallel robot to control such exercises, consisting of dorsi/plantar flexion and inversion/eversion ankle movements. The work includes a position control scheme for the parallel robot in order to follow a reference trajectory for each limb, with the possibility of stopping the exercise in mid-execution without loss of control. This stop may be triggered by the forces that the robot applies to the patient, acting as an alarm mechanism. The procedure introduced here is based on Dynamic Movement Primitives (DMPs). This work has been partially funded by FEDER-CICYT project DPI2017-84201-R, financed by Ministerio de Economía, Industria e Innovación (Spain).
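
    Since the exercise controller is built on DMPs, a minimal one-dimensional discrete DMP in Python may help fix ideas. It follows the standard transformation/canonical system of Ijspeert et al. (Neural Computation, 2013); the gains and the zero default forcing term are illustrative simplifications, not the paper's implementation.

        import numpy as np

        def dmp_rollout(y0, goal, tau=1.0, dt=0.001, alpha=25.0, beta=6.25,
                        alpha_x=8.0, forcing=lambda x: 0.0):
            x, y, z = 1.0, y0, 0.0  # canonical phase, position, scaled velocity
            trajectory = []
            for _ in range(int(tau / dt)):
                # Transformation system: a stable spring-damper pulled toward the
                # goal, shaped by the learned forcing term f(x).
                dz = (alpha * (beta * (goal - y) - z)
                      + forcing(x) * x * (goal - y0)) / tau
                dy = z / tau
                dx = -alpha_x * x / tau  # canonical system: phase replaces time
                z, y, x = z + dz * dt, y + dy * dt, x + dx * dt
                trajectory.append(y)
            return np.array(trajectory)

    One common way to realize the mid-execution stop described above is phase stopping: when the measured interaction forces exceed a threshold, the evolution of the phase x is halted, which freezes the reference trajectory while position control remains active.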

    Planning robot manipulation to clean planar surfaces

    Get PDF
    This paper presents a new approach to plan high-level manipulation actions for cleaning surfaces in household environments, like removing dirt from a table using a rag. Dragging actions can change the distribution of dirt in an unpredictable manner, and thus planning becomes challenging. We propose to define the problem using explicitly uncertain actions, and then plan the most effective sequence of actions in terms of time. However, some issues have to be tackled to plan efficiently with stochastic actions. States become hard to predict after executing a few actions, so replanning every few actions with newer perceptions gives the best results, and the trade-off between planning time and plan quality is also important. Finally, a learner is integrated to provide adaptation to changes, such as different rag grasps, robots, or cleaning surfaces. We demonstrate experimentally, using two different robot platforms, that planning is more advantageous than simple reactive strategies for accomplishing complex tasks, while still providing similar performance for easy tasks. We also performed experiments where the rag grasp was changed, and thus the behaviour of the dragging actions, showing that the learning capabilities allow the robot to double its performance with a new rag grasp after a few cleaning iterations. This work was supported by EU Project IntellAct FP7-ICT-2009-6-269959 and by the Spanish Ministry of Science and Innovation under project PAU+ DPI2011-27510. D. Martínez is also supported by the Spanish Ministry of Education, Culture and Sport via an FPU doctoral grant (FPU12-04173).
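
    The replan-every-few-actions strategy the abstract argues for can be summarized in a few lines. In the Python sketch below, plan, perceive, execute, and is_clean are placeholders for the paper's planner, perception, and robot interfaces, not real APIs, and the replanning period is an arbitrary choice.

        REPLAN_PERIOD = 3  # states are hard to predict more than a few actions ahead

        def clean_surface(plan, perceive, execute, is_clean):
            state = perceive()
            while not is_clean(state):
                actions = plan(state)                # planner over stochastic actions
                for action in actions[:REPLAN_PERIOD]:
                    execute(action)                  # outcome is uncertain...
                state = perceive()                   # ...so re-observe and replan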

    Towards robots with teleological action and language understanding

    Get PDF
    It is generally agreed that in order to achieve generalizable learning capabilities, robots need to be able to acquire compositional structures, whether in language or in action. However, in human development the capability to perceive compositional structure only evolves at a later stage. Before the capability to understand action and language in a structured, compositional way arises, infants learn in a holistic way, which enables them to interact in a socially adequate manner with their social and physical environment even with a very limited understanding of the world, e.g. trying to take part in games without knowing the exact rules. This capability endows them with an action production advantage which elicits corrective feedback from a tutor, thus reducing the search space of possible action interpretations tremendously. In accordance with findings from developmental psychology, we argue that this holistic way is in fact a teleological representation encoding a goal-directed perception of actions, facilitated through communicational frames. This observation leads to a range of consequences which need to be verified and analysed in further research. Here, we discuss two hypotheses for how this can be made accessible for action learning in robots: (1) we explore the idea that the teleological approach allows a kind of highly reduced one-shot learning, enabling the learner to perform a meaningful, although only partially "correct", action which can then be further refined through compositional approaches; (2) we discuss the possibility of transferring the concept of "conversational frames", as recurring interaction patterns, to the action domain, thus facilitating understanding of the meaning of a new action. We conclude that these capabilities need to be combined with more analytical compositional learning methods in order to achieve human-like learning performance.

    Supervised Learning and Reinforcement Learning of Feedback Models for Reactive Behaviors: Tactile Feedback Testbed

    Full text link
    Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if possible at all, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for reactive motion planning. Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning from demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task.

    Comment: Submitted to the International Journal of Robotics Research. Paper length is 21 pages (including references) with 12 figures. A video overview of the reinforcement learning experiment on the real robot can be seen at https://www.youtube.com/watch?v=WDq1rcupVM0. arXiv admin note: text overlap with arXiv:1710.08555
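
    The fine-tuning phase can be pictured as black-box policy search over the feedback-model parameters. The Python sketch below uses reward-weighted averaging of perturbed rollouts, in the spirit of methods such as PoWER; it is a stand-in under that assumption, not the paper's actual algorithm, and rollout_return is a placeholder for one real-system episode.

        import numpy as np

        def finetune(theta, rollout_return, n_iters=10, n_samples=8, sigma=0.05):
            """theta: 1-D parameter vector; rollout_return: params -> scalar return."""
            for _ in range(n_iters):
                eps = sigma * np.random.randn(n_samples, theta.size)
                # Few real-system interactions per iteration: one rollout per sample.
                returns = np.array([rollout_return(theta + e) for e in eps])
                w = np.exp(returns - returns.max())       # softmax-style weights
                theta = theta + (w[:, None] * eps).sum(axis=0) / w.sum()
            return theta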

    Learning Sequential Skills for Robot Manipulation Tasks

    Get PDF

    Coupling Movement Primitives: Interaction With the Environment and Bimanual Tasks

    Get PDF
    The framework of dynamic movement primitives (DMPs) has many favorable properties for the execution of robotic trajectories, such as indirect dependence on time, response to perturbations, and the ability to easily modulate the given trajectories, but the framework in its original form remains constrained to the kinematic aspect of the movement. In this paper, we bridge the gap to dynamic behavior by extending the framework with force/torque feedback. We propose and evaluate a modulation approach that allows interaction with objects and the environment. Through the proposed coupling of originally independent robotic trajectories, the approach also enables the execution of bimanual and tightly coupled cooperative tasks. We apply an iterative learning control algorithm to learn a coupling term, which is applied to the original trajectory in a feed-forward fashion and thus modifies the trajectory in accordance with the desired positions or external forces. A stability analysis and results of simulated and real-world experiments using two KUKA LWR arms for bimanual tasks and interaction with the environment are presented. By expanding on the framework of DMPs, we keep all of its favorable properties, which is demonstrated with temporal modulation and in a two-agent obstacle avoidance task.
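
    The iterative learning control (ILC) update for the coupling term can be stated compactly: after each trial, the coupling signal is corrected by the recorded force error and re-applied feed-forward in the next trial. The Python sketch below assumes a simple proportional ILC law and sampled force profiles; the gain and signal shapes are illustrative, not taken from the paper.

        import numpy as np

        def ilc_update(coupling, desired_force, measured_force, gain=0.5):
            """One ILC iteration: all arguments are arrays sampled over the trajectory."""
            error = desired_force - measured_force
            # Feed-forward correction applied to the DMP in the next trial.
            return coupling + gain * error

    Run in a loop (execute the DMP with the current coupling term added to its transformation system, record the forces, update), the intent is that the force error shrinks across trials while the underlying DMP keeps its favorable properties.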

    Learning relational models with human interaction for planning in robotics

    Get PDF
    Automated planning has proven to be useful for solving problems where an agent has to maximize a reward function by executing actions. As planners have been improved to solve more expressive and difficult problems, there is an increasing interest in using planning to improve efficiency in robotic tasks. However, planners rely on a domain model, which has to be either handcrafted or learned. Although learning domain models can be very costly, recent approaches provide generalization capabilities and integrate human feedback to reduce the amount of experience required to learn. In this thesis we propose new methods that allow an agent with no previous knowledge to solve certain problems more efficiently by using task planning. First, we show how to apply probabilistic planning to improve robot performance in manipulation tasks (such as cleaning dirt or clearing the tableware on a table). Planners obtain sequences of actions that get the best result in the long term, beating reactive strategies. Second, we introduce new reinforcement learning algorithms where the agent can actively request demonstrations from a teacher to learn new actions and speed up the learning process. In particular, we propose an algorithm that allows the user to set the minimum quality to be achieved, where a better quality also implies that a larger number of demonstrations will be requested. Moreover, the learned model is analyzed to extract the unlearned or problematic parts of the model. This information allows the agent to provide guidance to the teacher when a demonstration is requested, and to avoid irrecoverable errors. Finally, a new domain model learner is introduced that, in addition to relational probabilistic action models, can also learn exogenous effects. This learner can be integrated with existing planners and reinforcement learning algorithms to solve a wide range of problems. In summary, we improve the use of learning and task planning to solve unknown tasks. The improvements allow an agent to obtain a larger benefit from planners, learn faster, balance the number of action executions and teacher demonstrations, avoid irrecoverable errors, interact with a teacher to solve difficult problems, and adapt to the behavior of other agents by learning their dynamics. All the proposed methods were compared with state-of-the-art approaches, and were also demonstrated in different scenarios, including challenging robotic tasks.
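
    The active demonstration requests described above can be sketched as a count-based confidence test, in the style of R-MAX-like "known state" checks: when the learned model is unreliable for the current state, ask the teacher; otherwise, trust the planner. The threshold and the teacher interface below are invented for illustration and are not the thesis's algorithm.

        from collections import Counter

        class ActiveDemoLearner:
            def __init__(self, known_threshold=5):
                self.visits = Counter()   # states must be hashable in this sketch
                self.known_threshold = known_threshold

            def act(self, state, policy, ask_teacher):
                if self.visits[state] < self.known_threshold:
                    # Model not yet trusted here: request a demonstration, which
                    # teaches new actions and helps avoid irrecoverable errors.
                    action = ask_teacher(state)
                else:
                    action = policy(state)  # planner/policy from the learned model
                self.visits[state] += 1
                return action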