102 research outputs found

    Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks

    Full text link
    Reinforcement learning has shown great promise in robotics thanks to its ability to develop efficient robotic control procedures through self-training. In particular, reinforcement learning has been successfully applied to solving the reaching task with robotic arms. In this paper, we define a robust, reproducible and systematic experimental procedure to compare the performance of various model-free algorithms at solving this task. The policies are trained in simulation and are then transferred to a physical robotic manipulator. It is shown that augmenting the reward signal with the Hindsight Experience Replay exploration technique increases the average return of off-policy agents by a factor of 7 to 9 when the target position is initialised randomly at the beginning of each episode.
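
    The Hindsight Experience Replay technique referenced above relabels transitions from failed episodes with goals that were actually achieved, so sparse-reward tasks still produce useful learning signal. A minimal sketch of the "future" relabeling strategy, assuming goal-conditioned transitions and a sparse distance-based reward; the function names, tuple layout and threshold are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def reward_fn(achieved_goal, goal, threshold=0.05):
    # Sparse reward: 0 when the achieved goal lies within `threshold`
    # of the desired goal, else -1 (threshold value is an assumption).
    return 0.0 if np.linalg.norm(achieved_goal - goal) < threshold else -1.0

def her_relabel(episode, replay_buffer, k=4):
    """Store each transition once with its original goal, then k more times
    with a goal achieved later in the same episode ('future' strategy)."""
    T = len(episode)
    for t, (obs, action, goal, next_obs, next_achieved) in enumerate(episode):
        replay_buffer.append((obs, action, reward_fn(next_achieved, goal), goal, next_obs))
        for _ in range(k):
            # Substitute an achieved goal from some step >= t as the new goal.
            future_goal = episode[np.random.randint(t, T)][4]
            replay_buffer.append((obs, action, reward_fn(next_achieved, future_goal),
                                  future_goal, next_obs))
```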

    Deep Reinforcement Learning for Robotic Tasks: Manipulation and Sensor Odometry

    Get PDF
    Research in robotics has frequently focused on artificial intelligence (AI). Numerous studies have been carried out to increase the effectiveness of the learning process for the robot: to be more effective, robots must be able to learn well in a shorter amount of time and with fewer resources. Reinforcement learning (RL) has been established as an efficient aid to a robot's learning. In this dissertation, we propose and optimize RL algorithms to ensure that our robots learn well. Research into driverless or self-driving automobiles has exploded in the last few years, and a precise estimation of the vehicle's motion is crucial for higher levels of autonomous driving functionality. Recent research has been done on the development of sensors to improve the localization accuracy of these vehicles. Recent sensor odometry research suggests that Lidar Monocular Visual Odometry (LIMO) can be beneficial for determining odometry. However, the LIMO algorithm has a considerable number of errors when compared to ground truth, which motivates us to investigate ways to make it far more accurate; we use a Genetic Algorithm (GA) to improve LIMO's performance. Robotic manipulator research has also been popular and has room for development, which piqued our interest. As a result, we researched robotic manipulators and applied the GA to Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER), yielding GA+DDPG+HER. Finally, we continued researching DDPG and created an algorithm named AACHER, which uses HER and many independent instances of actors and critics from DDPG to increase a robot's learning effectiveness. AACHER is evaluated in both custom and existing robot environments.
    In the first part of our research, we discuss the LIMO algorithm, an odometry estimation technique that employs a camera and a Lidar for visual localization by tracking features from their measurements. LIMO can estimate sensor motion via Bundle Adjustment based on reliable keyframes, and it employs weights of the vegetative landmarks and semantic labeling to reject outliers. Like many other odometry estimation methods, LIMO has many hyperparameters that must be manually tuned in response to dynamic changes in the environment to reduce translational errors. The GA has been proven useful in determining near-optimal values of learning hyperparameters. In our study, we propose applying the GA to maximize the performance of LIMO's localization and motion estimates by optimizing its hyperparameters. We test our approach using the well-known KITTI dataset and demonstrate how the GA helps LIMO lower translation errors in various datasets.
    Our second contribution involves the use of RL. Robots using RL can select actions based on a reward function, but the choice of values for the learning algorithm's hyperparameters can have a big impact on the entire learning process. We used the GA to find the hyperparameters for DDPG and HER, and we propose the algorithm GA+DDPG+HER to optimize learning hyperparameters, applying it to the robotic manipulation tasks FetchReach, FetchSlide, FetchPush, FetchPick&Place, and DoorOpening. With only a few modifications, GA+DDPG+HER was also used in the AuboReach environment. Compared to the original algorithm (DDPG+HER), our experiments show that our approach (GA+DDPG+HER) yields noticeably better results and is substantially faster.
    In the final part of our dissertation, we were motivated to use and improve DDPG, a Deep Reinforcement Learning (DRL) technique that has shown promising results on many simulated continuous control problems. DDPG has two parts, actor learning and critic learning, and because these contribute substantially to the robot's overall learning, the performance of DDPG is relatively sensitive and unstable. Our dissertation proposes a multi-actor-critic DDPG for reliable actor-critic learning, which is further combined with HER and called AACHER. The average value of numerous actors/critics replaces the single actor/critic of the traditional DDPG approach, improving robustness when one actor or critic performs poorly, and the numerous independent actors and critics can each learn from the environment. In all the actor/critic number combinations that were evaluated, AACHER performs better than DDPG+HER.
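
    A minimal sketch of the multi-actor averaging idea behind AACHER, using small PyTorch networks; the class name, layer sizes and ensemble size are illustrative assumptions, not the dissertation's implementation:

```python
import torch
import torch.nn as nn

class AveragedActor(nn.Module):
    """Ensemble of independent DDPG-style actors whose action outputs are
    averaged, so one poorly performing actor is damped by the others."""
    def __init__(self, obs_dim, act_dim, n_actors=3):
        super().__init__()
        self.actors = nn.ModuleList([
            nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                          nn.Linear(256, act_dim), nn.Tanh())
            for _ in range(n_actors)
        ])

    def forward(self, obs):
        # Average the action proposals of all independent actors.
        return torch.stack([actor(obs) for actor in self.actors]).mean(dim=0)
```

    The same averaging can be applied to an ensemble of critics; each actor and critic keeps its own optimizer and learns independently, and only the averaged outputs are used for acting and for target computation.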

    Machine Learning Meets Advanced Robotic Manipulation

    Full text link
    Automated industries lead to high-quality production, lower manufacturing costs, and better utilization of human resources. Robotic manipulator arms play a major role in the automation process. However, for complex manipulation tasks, hard-coding efficient and safe trajectories is challenging and time-consuming. Machine learning methods have the potential to learn such controllers based on expert demonstrations. Despite promising advances, better approaches must be developed to improve the safety, reliability, and efficiency of ML methods in both the training and deployment phases. This survey reviews cutting-edge technologies and recent trends in ML methods applied to real-world manipulation tasks. After reviewing the related background on ML, the rest of the paper is devoted to ML applications in different domains such as industry, healthcare, agriculture, space, military, and search and rescue. The paper closes with important research directions for future work.

    Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

    Full text link
    Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents. In this paper we introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic, discrete abstraction of the environment's state for planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We formulate an objective and a corresponding algorithm which lead to unsupervised learning of a diverse set of skills through intrinsic motivation, given a known state abstraction. The skills are jointly learned with the symbolic forward model, which captures the effect of skill execution in the state abstraction. After training, we can leverage the skills as symbolic actions using the forward model for long-horizon planning and subsequently execute the plan using the learned continuous-action control skills. The proposed algorithm learns skills and forward models that can be used to solve complex tasks which require both continuous control and long-horizon planning capabilities with a high success rate. It compares favorably with other flat and hierarchical reinforcement learning baseline agents and is successfully demonstrated with a real robot.
    Comment: Project website (including video) is available at https://seads.is.tue.mpg.de/. (v2) Accepted for publication at the 6th Conference on Robot Learning (CoRL) 2022, Auckland, New Zealand. (v3) Added details on checkpointing (S.8.1), with references on p.7, p.8, and p.21 to clarify the number of env. steps of reported results.
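
    A minimal sketch of how a learned symbolic forward model can drive the long-horizon planning described above: a breadth-first search over abstract states, where each skill's predicted effect defines a graph edge. The `forward_model(state, skill)` interface and hashable-state assumption are illustrative, not SEADS's actual implementation:

```python
from collections import deque

def plan(start, goal, skills, forward_model):
    """Breadth-first search in the symbolic abstraction. `forward_model`
    predicts the successor abstract state of executing a skill; the
    returned skill sequence is then executed with the learned
    continuous-control policies."""
    frontier, parents = deque([start]), {start: None}
    while frontier:
        state = frontier.popleft()
        if state == goal:
            skill_sequence = []
            while parents[state] is not None:       # walk back to the start
                state, skill = parents[state]
                skill_sequence.append(skill)
            return list(reversed(skill_sequence))
        for skill in skills:
            nxt = forward_model(state, skill)
            if nxt not in parents:                   # first visit only
                parents[nxt] = (state, skill)
                frontier.append(nxt)
    return None  # goal unreachable in the abstraction
```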

    A survey and tutorial on deep reinforcement learning algorithms for robotic manipulation

    Get PDF
    Robotic manipulation challenges, such as grasping and object manipulation, have been tackled successfully with the help of deep reinforcement learning systems. In this review, we give an overview of the recent advances in deep reinforcement learning algorithms for robotic manipulation tasks. We begin by outlining the fundamental ideas of reinforcement learning and the parts of a reinforcement learning system. We then cover the many deep reinforcement learning algorithms that have been suggested for robotic manipulation tasks, such as value-based methods, policy-based methods, and actor–critic approaches. We also examine the numerous issues that have arisen when applying these algorithms to robotics tasks, as well as the various solutions that have been put forth to deal with these issues. Finally, we highlight several unsolved research issues and discuss possible future directions for the field.
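
    As a point of reference for the taxonomy above, a minimal PyTorch sketch of the actor–critic idea (illustrative only, not from the survey): the critic's temporal-difference error serves as an advantage estimate for the actor's policy-gradient update, combining the value-based and policy-based views:

```python
import torch

def actor_critic_update(critic, log_prob, reward, obs, next_obs,
                        opt_actor, opt_critic, gamma=0.99):
    """One-step actor-critic update on a single transition."""
    td_target = reward + gamma * critic(next_obs).detach()
    td_error = td_target - critic(obs)
    # Critic: regress the value estimate towards the TD target.
    opt_critic.zero_grad()
    (td_error ** 2).mean().backward()
    opt_critic.step()
    # Actor: raise the log-probability of actions with positive TD error.
    opt_actor.zero_grad()
    (-log_prob * td_error.detach()).mean().backward()
    opt_actor.step()
```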

    Deep Reinforcement Learning for Robotic Manipulation Tasks

    Get PDF
    The recent advances in Artificial Intelligence (AI) present new opportunities for robotics on many fronts. Deep Reinforcement Learning (DRL) is a sub-field of AI which results from the combination of Deep Learning (DL) and Reinforcement Learning (RL). It categorizes machine learning algorithms which learn directly from experience and offers a comprehensive framework for studying the interplay among learning, representation and decision-making. It has already been successfully used to solve tasks in many domains. Most notably, DRL agents learned to play Atari 2600 video games directly from pixels and achieved human-comparable performance in 49 of those games. Additionally, recent efforts using DRL in conjunction with other techniques produced agents capable of playing the board game of Go at a professional level, which had long been viewed as an intractable problem due to its enormous search space. In the context of robotics, DRL is often applied to planning, navigation, optimal control and other problems. Here, the powerful function approximation and representation learning properties of Deep Neural Networks enable RL to scale up to problems with high-dimensional state and action spaces. Additionally, inherent properties of DRL make transfer learning useful when moving from simulation to the real world.
    This dissertation investigates the applicability and effectiveness of DRL for learning successful policies in the domain of robot manipulator tasks. Initially, a set of three classic RL problems were solved using RL and DRL algorithms in order to explore their practical implementation and arrive at a class of algorithms appropriate for these robotic tasks. Afterwards, a task is defined in simulation in which an agent controls a 6 DoF manipulator to reach a target with its end effector. This task is used to evaluate the effects on performance of different state representations, hyperparameters and state-of-the-art DRL algorithms, resulting in agents with high success rates. The emphasis is then placed on the speed and time restrictions of the end effector's positioning. To this end, different reward systems were tested for an agent learning a modified version of the previous reaching task with faster joint speeds; in this setting, a number of improvements were verified relative to the original reward system. Finally, an application of the best reaching agent obtained from the previous experiments is demonstrated in a simplified ball-catching scenario.
    Master's dissertation in Computer and Telematics Engineering
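
    A minimal sketch of the kind of reward systems compared in the reaching experiments described above, assuming access to end-effector and target positions; the shaping terms and coefficients are illustrative assumptions, not the dissertation's actual rewards:

```python
import numpy as np

def sparse_reward(ee_pos, target_pos, threshold=0.05):
    # Pays out only when the end effector is within `threshold` of the target.
    return 1.0 if np.linalg.norm(ee_pos - target_pos) < threshold else 0.0

def shaped_reward(ee_pos, target_pos, joint_velocities, vel_penalty=0.01):
    # Dense distance term plus a penalty discouraging excessive joint speeds,
    # relevant once the task adds speed and time restrictions.
    distance = np.linalg.norm(ee_pos - target_pos)
    return -distance - vel_penalty * np.square(joint_velocities).sum()
```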

    Autonomous Soft Tissue Retraction Using Demonstration-Guided Reinforcement Learning

    Full text link
    In the context of surgery, robots can provide substantial assistance by performing small, repetitive tasks such as suturing, needle exchange, and tissue retraction, thereby enabling surgeons to concentrate on more complex aspects of the procedure. However, existing surgical task learning mainly pertains to rigid body interactions, whereas the advancement towards more sophisticated surgical robots necessitates the manipulation of soft bodies. Previous work focused on tissue phantoms for soft tissue task learning, which can be expensive and can be an entry barrier to research. Simulation environments present a safe and efficient way to learn surgical tasks before their application to actual tissue. In this study, we create a Robot Operating System (ROS)-compatible physics simulation environment with support for both rigid and soft body interactions within surgical tasks. Furthermore, we investigate the soft tissue interactions facilitated by the patient-side manipulator of the da Vinci surgical robot. Leveraging the pybullet physics engine, we simulate kinematics and establish anchor points to guide the robotic arm when manipulating soft tissue. We investigate the performance of demonstration-guided reinforcement learning (RL) algorithms in comparison to traditional RL algorithms. Our in silico trials demonstrate a proof of concept for autonomous surgical soft tissue retraction, and the results corroborate the feasibility of learning soft body manipulation through the application of reinforcement learning agents. This work lays the foundation for future research into the development and refinement of surgical robots capable of managing both rigid and soft tissue interactions. Code is available at https://github.com/amritpal-001/tissue_retract.
    Comment: 10 pages, 5 figures, MICCAI 2023 conference (AECAI workshop).
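
    A minimal sketch of establishing a soft-body anchor point in pybullet along the lines described above; the tissue mesh filename and node index are placeholders, and the rigid cube merely stands in for the patient-side gripper:

```python
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)
p.resetSimulation(p.RESET_USE_DEFORMABLE_WORLD)  # enable the FEM soft-body backend
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

# Placeholder assets: a deformable tissue mesh (hypothetical file) and a
# rigid body standing in for the manipulator's gripper.
tissue = p.loadSoftBody("tissue.vtk", basePosition=[0, 0, 0.1], mass=0.1)
gripper = p.loadURDF("cube_small.urdf", basePosition=[0, 0, 0.3])

# Pin soft-body node 0 to the rigid body so moving the arm retracts the tissue.
p.createSoftBodyAnchor(tissue, 0, gripper, -1)

for _ in range(240):  # step for one second at the default 240 Hz
    p.stepSimulation()
```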

    Real-Time Hybrid Visual Servoing of a Redundant Manipulator via Deep Reinforcement Learning

    Get PDF
    Fixtureless assembly may be necessary in some manufacturing tasks and environments due to various constraints, but it poses challenges for automation because of non-deterministic characteristics not favoured by traditional approaches to industrial automation. Visual servoing methods of robotic control could be effective for sensitive manipulation tasks where the desired end-effector pose can be ascertained via visual cues. Visual data is complex and computationally expensive to process, but deep reinforcement learning has shown promise for robotic control in vision-based manipulation tasks. However, these methods are rarely used in industry due to the resources and expertise required to develop application-specific systems and prohibitive training costs. Training reinforcement learning models in simulated environments offers a number of benefits for the development of robust robotic control algorithms: it reduces training time and costs, and it provides repeatable benchmarks against which algorithms can be tested, developed and eventually deployed on real robotic control environments.
    In this work, we present a new simulated reinforcement learning environment for developing accurate robotic manipulation control systems in fixtureless environments. Our environment incorporates a contemporary collaborative industrial robot, the KUKA LBR iiwa, with the goal of positioning its end effector in a generic fixtureless environment based on a visual cue. Observational inputs comprise the robotic joint positions and velocities as well as two cameras, whose positioning reflects hybrid visual servoing: one camera is attached to the robotic end-effector and the other observes the workspace. We propose a state-of-the-art deep reinforcement learning approach to solving the task environment and make preliminary assessments of the efficacy of this approach to hybrid visual servoing for the defined problem environment. We also conduct a series of experiments exploring the hyperparameter space of the proposed reinforcement learning method. Although our initial results could not prove the efficacy of a deep reinforcement learning approach to solving the task environment, we remain confident that such an approach could be feasible for this industrial manufacturing challenge, and that our contributions in terms of the novel software provide a good basis for exploring reinforcement learning approaches to hybrid visual servoing in accurate manufacturing contexts.
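
    A minimal sketch of the hybrid observation space described above, using the Gymnasium API; the key names, image resolution and 7-DoF joint limits are illustrative assumptions for a KUKA LBR iiwa, not the environment's actual specification:

```python
import numpy as np
from gymnasium import spaces

# Hybrid visual servoing observations: proprioception plus an eye-in-hand
# camera (on the end-effector) and an eye-to-hand camera (on the workspace).
observation_space = spaces.Dict({
    "joint_positions":  spaces.Box(-np.pi, np.pi, shape=(7,), dtype=np.float32),
    "joint_velocities": spaces.Box(-np.inf, np.inf, shape=(7,), dtype=np.float32),
    "eye_in_hand_rgb":  spaces.Box(0, 255, shape=(84, 84, 3), dtype=np.uint8),
    "workspace_rgb":    spaces.Box(0, 255, shape=(84, 84, 3), dtype=np.uint8),
})
```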