Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks
Reinforcement learning has shown great promise in robotics thanks to its
ability to develop efficient robotic control procedures through self-training.
In particular, reinforcement learning has been successfully applied to solving
the reaching task with robotic arms. In this paper, we define a robust,
reproducible and systematic experimental procedure to compare the performance
of various model-free algorithms at solving this task. The policies are trained
in simulation and are then transferred to a physical robotic manipulator. It is
shown that augmenting the reward signal with the Hindsight Experience Replay
exploration technique increases the average return of off-policy agents by a
factor of 7 to 9 when the target position is initialised randomly at the
beginning of each episode.
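As a rough illustration of the Hindsight Experience Replay idea referenced above, the sketch below relabels the transitions of a finished episode with goals that were actually achieved later in that episode (the common "future" strategy). The dict-based transition format and the `reward_fn` interface are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def her_relabel(episode, reward_fn, k=4, rng=np.random.default_rng()):
    """Hindsight Experience Replay sketch: augment an episode with extra
    transitions whose goals are replaced by achieved goals from later
    timesteps ("future" strategy). `episode` is a list of dicts with keys
    'obs', 'action', 'achieved_goal', 'goal', 'next_obs', 'reward'."""
    relabeled = []
    T = len(episode)
    for t, tr in enumerate(episode):
        relabeled.append(tr)  # keep the original transition
        # sample up to k future timesteps and pretend their achieved goals were desired
        future_idx = rng.integers(t, T, size=min(k, T - t))
        for idx in future_idx:
            new_goal = episode[idx]['achieved_goal']
            new_tr = dict(tr)
            new_tr['goal'] = new_goal
            # recompute the (typically sparse) reward against the substituted goal
            new_tr['reward'] = reward_fn(tr['achieved_goal'], new_goal)
            relabeled.append(new_tr)
    return relabeled
```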
Deep Reinforcement Learning for Robotic Tasks: Manipulation and Sensor Odometry
Research in robotics has frequently focused on artificial intelligence (AI). Numerous studies have been carried out to make a robot's learning process more effective: robots must be able to learn well in a shorter amount of time and with fewer resources. It has been established that reinforcement learning (RL) is efficient for aiding a robot's learning. In this dissertation, we propose and optimize RL algorithms to ensure that our robots learn well.

Research into driverless or self-driving automobiles has exploded in the last few years. A precise estimate of the vehicle's motion is crucial for higher levels of autonomous driving functionality, and recent research has focused on sensors that improve the localization accuracy of these vehicles. Recent sensor odometry research suggests that Lidar Monocular Visual Odometry (LIMO) can be beneficial for determining odometry. However, the LIMO algorithm produces considerable errors when compared to ground truth, which motivates us to investigate ways to make it far more accurate. In this dissertation, we use a Genetic Algorithm (GA) to improve LIMO's performance.

Robotic manipulator research has also been popular and has room for development, which piqued our interest. We therefore studied robotic manipulators and applied a GA to Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER), yielding GA+DDPG+HER. Finally, we continued researching DDPG and created an algorithm named AACHER. AACHER uses HER and many independent instances of the actors and critics from DDPG to increase a robot's learning effectiveness. AACHER is evaluated in both custom and existing robot environments.

In the first part of our research, we discuss the LIMO algorithm, an odometry estimation technique that employs a camera and a Lidar for visual localization by tracking features from their measurements. LIMO can estimate sensor motion via Bundle Adjustment based on reliable keyframes, and it employs weights of the vegetative landmarks and semantic labeling to reject outliers. Like many other odometry estimation methods, LIMO has many hyperparameters that must be manually tuned in response to dynamic changes in the environment to reduce translational errors. The GA has been proven useful in determining near-optimal values of learning hyperparameters. In our study, we propose applying the GA to maximize the performance of LIMO's localization and motion estimates by optimizing its hyperparameters. We test our approach on the well-known KITTI dataset and demonstrate how the GA helps LIMO lower translation errors in various datasets.

Our second contribution involves the use of RL. Robots using RL can select actions based on a reward function, but the choice of values for the learning algorithm's hyperparameters can have a large impact on the entire learning process. We used a GA to find the hyperparameters for DDPG and HER. We proposed the algorithm GA+DDPG+HER to optimize learning hyperparameters and applied it to the robotic manipulation tasks FetchReach, FetchSlide, FetchPush, FetchPick&Place, and DoorOpening. With only a few modifications, our proposed GA+DDPG+HER was also used in the AuboReach environment. Compared to the original algorithm (DDPG+HER), our experiments show that our approach (GA+DDPG+HER) yields noticeably better results and is substantially faster.

In the final part of our dissertation, we were motivated to use and improve DDPG. DDPG, a popular Deep Reinforcement Learning (DRL) technique, has shown promising results on many simulated continuous control problems. DDPG has two parts: actor learning and critic learning. Because actor and critic learning contribute substantially to the robot's total learning, the performance of DDPG is relatively sensitive and unstable. Our dissertation proposes a multi-actor-critic DDPG for reliable actor-critic learning as an improved DDPG to further enhance its performance and stability. This multi-actor-critic DDPG is further combined with HER and called AACHER. The average value of numerous actors/critics replaces the single actor/critic of the traditional DDPG approach, improving robustness when one actor/critic performs poorly. Numerous independent actors and critics can also learn from the environment in general. In all the actor/critic number combinations evaluated, AACHER performs better than DDPG+HER.
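A minimal sketch of the multi-actor/critic averaging described for AACHER is given below, assuming user-supplied actor and critic constructors. It only shows how several independent DDPG actors and critics could be queried and their outputs averaged; it is not the dissertation's full training loop or its GA-based hyperparameter search.

```python
import torch
import torch.nn as nn

class AveragedActorCritic(nn.Module):
    """Sketch of the multi-actor/critic averaging idea: several independent
    DDPG actors and critics are queried and their outputs averaged, so a
    single poorly performing network has less influence on the action or the
    Q-value estimate. Network constructors and counts are illustrative."""

    def __init__(self, make_actor, make_critic, n_actors=3, n_critics=3):
        super().__init__()
        self.actors = nn.ModuleList([make_actor() for _ in range(n_actors)])
        self.critics = nn.ModuleList([make_critic() for _ in range(n_critics)])

    def act(self, obs):
        # average the deterministic actions proposed by all actors
        actions = torch.stack([actor(obs) for actor in self.actors])
        return actions.mean(dim=0)

    def value(self, obs, action):
        # average the Q-value estimates of all critics
        qs = torch.stack([critic(obs, action) for critic in self.critics])
        return qs.mean(dim=0)
```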
Machine Learning Meets Advanced Robotic Manipulation
Automated industries lead to high-quality production, lower manufacturing
costs, and better utilization of human resources. Robotic manipulator arms play a
major role in the automation process. However, for complex manipulation tasks,
hard-coding efficient and safe trajectories is challenging and time-consuming.
Machine learning methods have the potential to learn such controllers based on
expert demonstrations. Despite promising advances, better approaches must be
developed to improve safety, reliability, and efficiency of ML methods in both
training and deployment phases. This survey aims to review cutting-edge
technologies and recent trends in ML methods applied to real-world manipulation
tasks. After reviewing the related background on ML, the rest of the paper is
devoted to ML applications in different domains such as industry, healthcare,
agriculture, space, military, and search and rescue. The paper closes with
important research directions for future work.
Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning
Problems which require both long-horizon planning and continuous control
capabilities pose significant challenges to existing reinforcement learning
agents. In this paper we introduce a novel hierarchical reinforcement learning
agent which links temporally extended skills for continuous control with a
forward model in a symbolic discrete abstraction of the environment's state for
planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We
formulate an objective and corresponding algorithm which leads to unsupervised
learning of a diverse set of skills through intrinsic motivation given a known
state abstraction. The skills are jointly learned with the symbolic forward
model which captures the effect of skill execution in the state abstraction.
After training, we can leverage the skills as symbolic actions using the
forward model for long-horizon planning and subsequently execute the plan using
the learned continuous-action control skills. The proposed algorithm learns
skills and forward models that can be used to solve complex tasks which require
both continuous control and long-horizon planning capabilities with a high
success rate. It compares favorably with other flat and hierarchical
reinforcement learning baseline agents and is successfully demonstrated with a
real robot. Comment: Project website (including video) is available at
https://seads.is.tue.mpg.de/. (v2) Accepted for publication at the 6th
Conference on Robot Learning (CoRL) 2022, Auckland, New Zealand. (v3) Added
details on checkpointing (S.8.1), with references on p.7, p.8, and p.21 to
clarify the number of environment steps of the reported results.
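The planning step described above (using the learned symbolic forward model to chain skills, then executing them with the low-level policies) can be pictured with the breadth-first search sketch below. Here `forward_model`, `goal_test`, and the skill identifiers are assumed interfaces, not the SEADS code; symbolic states are assumed to be hashable (e.g. tuples).

```python
from collections import deque

def plan_with_skills(initial_state, goal_test, skills, forward_model, max_depth=20):
    """Breadth-first search over discrete symbolic states, using a learned
    forward model to predict the effect of each temporally extended skill.
    Returns the skill sequence to execute with the low-level policies."""
    frontier = deque([(initial_state, [])])
    visited = {initial_state}
    while frontier:
        state, plan = frontier.popleft()
        if goal_test(state):
            return plan  # sequence of skills reaching the symbolic goal
        if len(plan) >= max_depth:
            continue
        for skill in skills:
            next_state = forward_model(state, skill)  # predicted symbolic effect
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, plan + [skill]))
    return None  # no plan found within the depth limit
```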
A survey and tutorial on deep reinforcement learning algorithms for robotic manipulation
Robotic manipulation challenges, such as grasping and object manipulation, have been tackled successfully with the help of deep reinforcement learning systems. In this review, we give an overview of recent advances in deep reinforcement learning algorithms for robotic manipulation tasks. We begin by outlining the fundamental ideas of reinforcement learning and the parts of a reinforcement learning system. The many deep reinforcement learning algorithms that have been suggested for robotic manipulation tasks, such as value-based methods, policy-based methods, and actor–critic approaches, are then covered. We also examine the numerous issues that have arisen when applying these algorithms to robotics tasks, as well as the various solutions that have been put forth to deal with these issues. Finally, we highlight several unsolved research issues and discuss possible future directions for the field.
Deep Reinforcement Learning for Robotic Manipulation Tasks (Aprendizagem profunda por reforço para tarefas de manipulação robótica)
The recent advances in Artificial Intelligence (AI) present new opportunities
for robotics on many fronts. Deep Reinforcement Learning (DRL)
is a sub-field of AI which results from the combination of Deep Learning
(DL) and Reinforcement Learning (RL). It categorizes machine learning algorithms
which learn directly from experience and offers a comprehensive
framework for studying the interplay among learning, representation and
decision-making. It has already been successfully used to solve tasks in
many domains. Most notably, DRL agents learned to play Atari 2600 video
games directly from pixels and achieved human-comparable performance in
49 of those games. Additionally, recent efforts using DRL in conjunction
with other techniques produced agents capable of playing the board game
of Go at a professional level, which has long been viewed as an intractable
problem due to its enormous search space. In the context of robotics, DRL
is often applied to planning, navigation, optimal control and others. Here,
the powerful function approximation and representation learning properties
of Deep Neural Networks enable RL to scale up to problems with high-dimensional
state and action spaces. Additionally, inherent properties of
DRL make transfer learning useful when moving from simulation to the real
world. This dissertation aims to investigate the applicability and effectiveness
of DRL to learn successful policies on the domain of robot manipulator
tasks. Initially, a set of three classic RL problems were solved using RL and
DRL algorithms in order to explore their practical implementation and arrive
at a class of algorithms appropriate for these robotic tasks. Afterwards, a task
in simulation is defined such that an agent is set to control a 6 DoF manipulator
to reach a target with its end effector. This is used to evaluate the
effects on performance of different state representations, hyperparameters
and state-of-the-art DRL algorithms, resulting in agents with high success
rates. The emphasis is then placed on the speed and time restrictions of the
end effector's positioning. To this end, different reward systems were tested
for an agent learning a modified version of the previous reaching task with
faster joint speeds. In this setting, a number of improvements were verified
in relation to the original reward system. Finally, an application of the best
reaching agent obtained from the previous experiments is demonstrated on
a simplified ball catching scenario.

Mestrado em Engenharia de Computadores e Telemática (Master's in Computer and Telematics Engineering)
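To make the reward-system experiments mentioned above concrete, the following is a hypothetical shaped reward for a reaching task that trades off end-effector distance to the target against joint speed and episode length. The terms and coefficients are illustrative assumptions, not the dissertation's actual reward systems.

```python
import numpy as np

def reaching_reward(ee_pos, target_pos, joint_vel,
                    success_radius=0.05, vel_penalty=0.01, time_penalty=0.01):
    """Illustrative shaped reward for a reaching task: reward closeness of the
    end effector to the target while discouraging high joint speeds and long
    episodes. All coefficients are placeholder values."""
    dist = np.linalg.norm(np.asarray(ee_pos) - np.asarray(target_pos))
    reward = -dist                                        # dense distance term
    reward -= vel_penalty * np.square(joint_vel).sum()    # penalize fast joints
    reward -= time_penalty                                # small cost per step
    if dist < success_radius:
        reward += 1.0                                      # bonus on reaching the target
    return reward
```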
Autonomous Soft Tissue Retraction Using Demonstration-Guided Reinforcement Learning
In the context of surgery, robots can provide substantial assistance by
performing small, repetitive tasks such as suturing, needle exchange, and
tissue retraction, thereby enabling surgeons to concentrate on more complex
aspects of the procedure. However, existing surgical task learning mainly
pertains to rigid body interactions, whereas the advancement towards more
sophisticated surgical robots necessitates the manipulation of soft bodies.
Previous work focused on tissue phantoms for soft tissue task learning, which
can be expensive and can be an entry barrier to research. Simulation
environments present a safe and efficient way to learn surgical tasks before
their application to actual tissue. In this study, we create a Robot Operating
System (ROS)-compatible physics simulation environment with support for both
rigid and soft body interactions within surgical tasks. Furthermore, we
investigate the soft tissue interactions facilitated by the patient-side
manipulator of the da Vinci surgical robot. Leveraging the PyBullet physics
engine, we simulate kinematics and establish anchor points to guide the robotic
arm when manipulating soft tissue. Using demonstration-guided reinforcement
learning (RL) algorithms, we investigate their performance in comparison to
traditional reinforcement learning algorithms. Our in silico trials demonstrate
a proof-of-concept for autonomous surgical soft tissue retraction. The results
corroborate the feasibility of learning soft body manipulation through the
application of reinforcement learning agents. This work lays the foundation for
future research into the development and refinement of surgical robots capable
of managing both rigid and soft tissue interactions. Code is available at
https://github.com/amritpal-001/tissue_retract. Comment: 10 pages, 5 figures,
MICCAI 2023 conference (AECAI workshop).
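The anchor-point mechanism mentioned above can be sketched with PyBullet's deformable-body API as below. The mesh file ("tissue.vtk"), the anchored node index, the material parameters, and the small cube standing in for the patient-side manipulator are placeholders, not the paper's actual setup.

```python
import pybullet as p
import pybullet_data

# Minimal sketch: anchor a rigid body to a soft-body mesh so that moving the
# rigid body (e.g. under an RL policy) deforms the simulated tissue.
p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.resetSimulation(p.RESET_USE_DEFORMABLE_WORLD)
p.setGravity(0, 0, -9.81)

tissue = p.loadSoftBody("tissue.vtk",            # placeholder soft-body mesh
                        basePosition=[0, 0, 0],
                        mass=0.1,
                        useNeoHookean=1,
                        NeoHookeanMu=60,
                        NeoHookeanLambda=200,
                        NeoHookeanDamping=0.01,
                        collisionMargin=0.001)
gripper = p.loadURDF("cube_small.urdf", basePosition=[0, 0, 0.1])

# Attach one mesh vertex (node 0) to the rigid body; the rigid body then acts
# as an anchor point that guides the soft tissue as it moves.
p.createSoftBodyAnchor(tissue, 0, gripper, -1)

for _ in range(240):
    p.stepSimulation()
```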
Real-Time Hybrid Visual Servoing of a Redundant Manipulator via Deep Reinforcement Learning
Fixtureless assembly may be necessary in some manufacturing tasks and environments due to various constraints but poses challenges for automation due to non-deterministic characteristics not favoured by traditional approaches to industrial automation. Visual servoing methods of robotic control could be effective for sensitive manipulation tasks where the desired end-effector pose can be ascertained via visual cues. Visual data is complex and computationally expensive to process, but deep reinforcement learning has shown promise for robotic control in vision-based manipulation tasks. However, these methods are rarely used in industry due to the resources and expertise required to develop application-specific systems and prohibitive training costs. Training reinforcement learning models in simulated environments offers a number of benefits for the development of robust robotic control algorithms by reducing training time and costs, and by providing repeatable benchmarks on which algorithms can be tested, developed and eventually deployed in real robotic control environments. In this work, we present a new simulated reinforcement learning environment for developing accurate robotic manipulation control systems in fixtureless environments. Our environment incorporates a contemporary collaborative industrial robot, the KUKA LBR iiwa, with the goal of positioning its end effector in a generic fixtureless environment based on a visual cue. Observational inputs comprise the robotic joint positions and velocities, as well as two cameras whose positioning reflects hybrid visual servoing: one camera is attached to the robotic end effector and the other observes the workspace. We propose a state-of-the-art deep reinforcement learning approach to solving the task environment and make preliminary assessments of the efficacy of this approach to hybrid visual servoing methods for the defined problem environment. We also conduct a series of experiments exploring the hyperparameter space in the proposed reinforcement learning method. Although we could not prove the efficacy of a deep reinforcement learning approach to solving the task environment with our initial results, we remain confident that such an approach could be feasible for solving this industrial manufacturing challenge, and that our contributions in this work, in terms of the novel software, provide a good basis for the exploration of reinforcement learning approaches to hybrid visual servoing in accurate manufacturing contexts.
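As a rough sketch of the observational inputs described above, a Gym-style observation space combining joint states with eye-in-hand and workspace cameras might look like the following. The 7-DoF joint count matches the KUKA LBR iiwa, but the image resolution, value bounds, and key names are assumptions rather than the environment's actual specification.

```python
import numpy as np
from gym import spaces

# Illustrative hybrid visual servoing observation space: proprioceptive joint
# states plus one camera on the end effector and one observing the workspace.
N_JOINTS = 7
IMG_SHAPE = (84, 84, 3)

observation_space = spaces.Dict({
    "joint_positions":  spaces.Box(low=-np.pi, high=np.pi, shape=(N_JOINTS,), dtype=np.float32),
    "joint_velocities": spaces.Box(low=-10.0, high=10.0, shape=(N_JOINTS,), dtype=np.float32),
    "eye_in_hand_rgb":  spaces.Box(low=0, high=255, shape=IMG_SHAPE, dtype=np.uint8),
    "workspace_rgb":    spaces.Box(low=0, high=255, shape=IMG_SHAPE, dtype=np.uint8),
})
```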