415 research outputs found

    Learning visual docking for non-holonomic autonomous vehicles

    Get PDF
    This paper presents a new method of learning visual docking skills for non-holonomic vehicles by direct interaction with the environment. The method is based on a reinforcement learning algorithm that speeds up Q-learning by applying memory-based sweeping and enforcing the "adjoining property", a filtering mechanism that only allows transitions between states separated by no more than a fixed distance. The method overcomes some limitations of reinforcement learning techniques when they are employed in applications with continuous non-linear systems, such as car-like vehicles. In particular, a good approximation to the optimal behaviour is obtained with a small look-up table. The algorithm is tested within an image-based visual servoing framework on a docking task. The training time was less than 1 hour on the real vehicle. Experiments show the satisfactory performance of the algorithm.
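    The abstract gives only a high-level description of the learning scheme. As a rough Python sketch of how tabular Q-learning, memory-based (prioritized) sweeping, and an adjoining-distance filter could fit together, the snippet below is illustrative only: the state representation, gains, sweep budget, and distance test are assumptions, not the paper's formulation.

```python
import heapq
from collections import defaultdict

# Illustrative tabular Q-learning with memory-based (prioritized) sweeping and an
# "adjoining" filter that discards transitions between states farther apart than a
# fixed distance. States are assumed to be small tuples of numbers; all constants
# and names are placeholders, not values from the paper.

ALPHA, GAMMA = 0.5, 0.95
ADJOIN_DIST = 1.0                     # assumed maximum distance between adjoining states

Q = defaultdict(float)                # small look-up table: Q[(state, action)]
model = {}                            # remembered transitions: (s, a) -> (r, s_next)
predecessors = defaultdict(set)       # reverse transition index for sweeping
queue = []                            # priority queue of states to sweep (negated priority)

def adjoining(s, s_next):
    """Accept only transitions between states closer than ADJOIN_DIST."""
    return sum((a - b) ** 2 for a, b in zip(s, s_next)) ** 0.5 <= ADJOIN_DIST

def best_q(s, actions):
    return max(Q[(s, a)] for a in actions)

def update(s, a, r, s_next, actions, n_sweeps=10):
    if not adjoining(s, s_next):
        return                                        # filtered out by the adjoining property
    model[(s, a)] = (r, s_next)
    predecessors[s_next].add((s, a))
    td = r + GAMMA * best_q(s_next, actions) - Q[(s, a)]
    Q[(s, a)] += ALPHA * td
    heapq.heappush(queue, (-abs(td), s))
    # memory-based sweeping: propagate the change backwards through stored transitions
    for _ in range(n_sweeps):
        if not queue:
            break
        _, state = heapq.heappop(queue)
        for (ps, pa) in predecessors[state]:
            pr, _ = model[(ps, pa)]
            ptd = pr + GAMMA * best_q(state, actions) - Q[(ps, pa)]
            Q[(ps, pa)] += ALPHA * ptd
            if abs(ptd) > 1e-3:
                heapq.heappush(queue, (-abs(ptd), ps))
```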

    ์ถฉ๋Œ ํ•™์Šต์„ ํ†ตํ•œ ์ง€์—ญ ๊ฒฝ๋กœ ๊ณ„ํš ๋ฐฉ๋ฒ•

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2019. 2. ์ด๋ฒ”ํฌ.๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๊ฐ•ํ™” ํ•™์Šต ๊ธฐ๋ฐ˜์˜ ์ถฉ๋Œ ํšŒํ”ผ ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ถฉ๋Œ ํšŒํ”ผ๋ž€ ๋กœ๋ด‡์ด ๋‹ค๋ฅธ ๋กœ๋ด‡ ๋˜๋Š” ์žฅ์• ๋ฌผ๊ณผ ์ถฉ๋Œ ์—†์ด ๋ชฉํ‘œ ์ง€์ ์— ๋„๋‹ฌํ•˜๋Š” ๊ฒƒ์„ ๋ชฉ์ ์œผ๋กœ ํ•œ๋‹ค. ์ด ๋ฌธ์ œ๋Š” ๋‹จ์ผ ๋กœ๋ด‡ ์ถฉ๋Œ ํšŒํ”ผ์™€ ๋‹ค๊ฐœ์ฒด ๋กœ๋ด‡ ์ถฉ๋Œ ํšŒํ”ผ, ์ด๋ ‡๊ฒŒ ๋‘ ๊ฐ€์ง€๋กœ ๋‚˜๋ˆŒ ์ˆ˜ ์žˆ๋‹ค. ๋‹จ์ผ ๋กœ๋ด‡ ์ถฉ๋Œ ํšŒํ”ผ ๋ฌธ์ œ๋Š” ํ•˜๋‚˜์˜ ์ค‘์‹ฌ ๋กœ๋ด‡๊ณผ ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์›€์ง์ด๋Š” ์žฅ์• ๋ฌผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ์ค‘์‹ฌ ๋กœ๋ด‡์€ ๋žœ๋คํ•˜๊ฒŒ ์›€์ง์ด๋Š” ์žฅ์• ๋ฌผ์„ ํ”ผํ•ด ๋ชฉํ‘œ ์ง€์ ์— ๋„๋‹ฌํ•˜๋Š” ๊ฒƒ์„ ๋ชฉ์ ์œผ๋กœ ํ•œ๋‹ค. ๋‹ค๊ฐœ์ฒด ๋กœ๋ด‡ ์ถฉ๋Œ ํšŒํ”ผ ๋ฌธ์ œ๋Š” ์—ฌ๋Ÿฌ ๋Œ€์˜ ์ค‘์‹ฌ ๋กœ๋ด‡์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ์ด ๋ฌธ์ œ์—๋„ ์—ญ์‹œ ์žฅ์• ๋ฌผ์„ ํฌํ•จ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค. ์ค‘์‹ฌ ๋กœ๋ด‡๋“ค์€ ์„œ๋กœ ์ถฉ๋Œ์„ ํšŒํ”ผํ•˜๋ฉด์„œ ๊ฐ์ž์˜ ๋ชฉํ‘œ ์ง€์ ์— ๋„๋‹ฌํ•˜๋Š” ๊ฒƒ์„ ๋ชฉ์ ์œผ๋กœ ํ•œ๋‹ค. ๋งŒ์•ฝ ํ™˜๊ฒฝ์— ์˜ˆ์ƒ์น˜ ๋ชปํ•œ ์žฅ์• ๋ฌผ์ด ๋“ฑ์žฅํ•˜๋”๋ผ๋„, ๋กœ๋ด‡๋“ค์€ ๊ทธ๊ฒƒ๋“ค์„ ํ”ผํ•ด์•ผ ํ•œ๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ถฉ๋Œ ํšŒํ”ผ๋ฅผ ์œ„ํ•œ ์ถฉ๋Œ ํ•™์Šต ๋ฐฉ๋ฒ• (CALC) ์„ ์ œ์•ˆํ•œ๋‹ค. CALC๋Š” ๊ฐ•ํ™” ํ•™์Šต ๊ฐœ๋…์„ ์ด์šฉํ•ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•œ๋‹ค. ์ œ์•ˆํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ํ•™์Šต ๊ทธ๋ฆฌ๊ณ  ๊ณ„ํš ์ด๋ ‡๊ฒŒ ๋‘ ๊ฐ€์ง€ ํ™˜๊ฒฝ์œผ๋กœ ๊ตฌ์„ฑ ๋œ๋‹ค. ํ•™์Šต ํ™˜๊ฒฝ์€ ํ•˜๋‚˜์˜ ์ค‘์‹ฌ ๋กœ๋ด‡๊ณผ ํ•˜๋‚˜์˜ ์žฅ์• ๋ฌผ ๊ทธ๋ฆฌ๊ณ  ํ•™์Šต ์˜์—ญ์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ํ•™์Šต ํ™˜๊ฒฝ์—์„œ ์ค‘์‹ฌ ๋กœ๋ด‡์€ ์žฅ์• ๋ฌผ๊ณผ ์ถฉ๋Œํ•˜๋Š” ๋ฒ•์„ ํ•™์Šตํ•˜๊ณ  ๊ทธ์— ๋Œ€ํ•œ ์ •์ฑ…์„ ๋„์ถœํ•ด ๋‚ธ๋‹ค. ์ฆ‰, ์ค‘์‹ฌ ๋กœ๋ด‡์ด ์žฅ์• ๋ฌผ๊ณผ ์ถฉ๋Œํ•˜๊ฒŒ ๋˜๋ฉด ๊ทธ๊ฒƒ์€ ์–‘์˜ ๋ณด์ƒ์„ ๋ฐ›๋Š”๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๋งŒ์•ฝ ์ค‘์‹ฌ ๋กœ๋ด‡์ด ์žฅ์• ๋ฌผ๊ณผ ์ถฉ๋Œ ํ•˜์ง€ ์•Š๊ณ  ํ•™์Šต ์˜์—ญ์„ ๋น ์ ธ๋‚˜๊ฐ€๋ฉด, ๊ทธ๊ฒƒ์€ ์Œ์˜ ๋ณด์ƒ์„ ๋ฐ›๋Š”๋‹ค. ๊ณ„ํš ํ™˜๊ฒฝ์€ ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์žฅ์• ๋ฌผ ๋˜๋Š” ๋กœ๋ด‡๋“ค๊ณผ ํ•˜๋‚˜์˜ ๋ชฉํ‘œ ์ง€์ ์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค. ํ•™์Šต ํ™˜๊ฒฝ์—์„œ ํ•™์Šตํ•œ ์ •์ฑ…์„ ํ†ตํ•ด ์ค‘์‹ฌ ๋กœ๋ด‡์€ ์—ฌ๋Ÿฌ ๋Œ€์˜ ์žฅ์• ๋ฌผ ๋˜๋Š” ๋กœ๋ด‡๋“ค๊ณผ์˜ ์ถฉ๋Œ์„ ํ”ผํ•  ์ˆ˜ ์žˆ๋‹ค. ๋ณธ ๋ฐฉ๋ฒ•์€ ์ถฉ๋Œ์„ ํ•™์Šต ํ–ˆ๊ธฐ ๋•Œ๋ฌธ์—, ์ถฉ๋Œ์„ ํšŒํ”ผํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋„์ถœ๋œ ์ •์ฑ…์„ ๋’ค์ง‘์–ด์•ผ ํ•œ๋‹ค. ํ•˜์ง€๋งŒ, ๋ชฉํ‘œ ์ง€์ ๊ณผ๋Š” ์ผ์ข…์˜ `์ถฉ๋Œ'์„ ํ•ด์•ผํ•˜๊ธฐ ๋•Œ๋ฌธ์—, ๋ชฉํ‘œ ์ง€์ ์— ๋Œ€ํ•ด์„œ๋Š” ๋„์ถœ๋œ ์ •์ฑ…์„ ๊ทธ๋Œ€๋กœ ์ ์šฉํ•ด์•ผ ํ•œ๋‹ค. ์ด ๋‘ ๊ฐ€์ง€ ์ข…๋ฅ˜์˜ ์ •์ฑ…๋“ค์„ ์œตํ•ฉํ•˜๊ฒŒ ๋˜๋ฉด, ์ค‘์‹ฌ ๋กœ๋ด‡์€ ์žฅ์• ๋ฌผ ๋˜๋Š” ๋กœ๋ด‡๋“ค๊ณผ์˜ ์ถฉ๋Œ์„ ํšŒํ”ผํ•˜๋ฉด์„œ ๋™์‹œ์— ๋ชฉํ‘œ ์ง€์ ์— ๋„๋‹ฌํ•  ์ˆ˜ ์žˆ๋‹ค. ํ•™์Šต ํ™˜๊ฒฝ์—์„œ ๋กœ๋ด‡์€ ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡์„ ๊ฐ€์ •ํ•œ๋‹ค. ํ•™์Šต๋œ ์ •์ฑ…์ด ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋”๋ผ๋„, ์ œ์•ˆํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡๊ณผ ๋น„ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡ ๋ชจ๋‘์— ์ ์šฉ์ด ๊ฐ€๋Šฅํ•˜๋‹ค. CALC๋Š” ๋‹ค์Œ์˜ ์„ธ ๊ฐ€์ง€ ๋ฌธ์ œ์— ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค. 1) ํ™€๋กœ๋…ธ๋ฏน ๋‹จ์ผ ๋กœ๋ด‡์˜ ์ถฉ๋Œ ํšŒํ”ผ. 2) ๋น„ํ™€๋กœ๋…ธ๋ฏน ๋‹จ์ผ ๋กœ๋ด‡์˜ ์ถฉ๋Œ ํšŒํ”ผ. 3) ๋น„ํ™€๋กœ๋…ธ๋ฏน ๋‹ค๊ฐœ์ฒด ๋กœ๋ด‡์˜ ์ถฉ๋Œ ํšŒํ”ผ. ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์€ ์‹œ๋ฎฌ๋ ˆ์ด์…˜๊ณผ ์‹ค์ œ ๋กœ๋ด‡ ํ™˜๊ฒฝ์—์„œ ์‹คํ—˜ ๋˜์—ˆ๋‹ค. ์‹œ๋ฎฌ๋ ˆ์ด์…˜์€ ๋กœ๋ด‡ ์šด์˜์ฒด์ œ (ROS) ๊ธฐ๋ฐ˜์˜ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ์ธ ๊ฐ€์ œ๋ณด์™€ ๊ฒŒ์ž„ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์˜ ํ•œ ์ข…๋ฅ˜์ธ PyGame์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค. ์‹œ๋ฎฌ๋ ˆ์ด์…˜์—์„œ๋Š” ํ™€๋กœ๋…ธ๋ฏน๊ณผ ๋น„ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡์„ ๋ชจ๋‘ ์‚ฌ์šฉํ•˜์—ฌ ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๋‹ค. ์‹ค์ œ ๋กœ๋ด‡ ํ™˜๊ฒฝ ์‹คํ—˜์—์„œ๋Š” ๋น„ํ™€๋กœ๋…ธ๋ฏน ๋กœ๋ด‡์˜ ํ•œ ์ข…๋ฅ˜์ธ e-puck ๋กœ๋ด‡์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค. 
๋˜ํ•œ, ์‹œ๋ฎฌ๋ ˆ์ด์…˜์—์„œ ํ•™์Šต๋œ ์ •์ฑ…์€ ์‹ค์ œ ๋กœ๋ด‡ ํ™˜๊ฒฝ ์‹คํ—˜์—์„œ ์žฌํ•™์Šต ๋˜๋Š” ๋ณ„๋„์˜ ์ˆ˜์ •๊ณผ์ • ์—†์ด ๋ฐ”๋กœ ์ ์šฉ์ด ๊ฐ€๋Šฅํ•˜์˜€๋‹ค. ์ด๋Ÿฌํ•œ ์‹คํ—˜๋“ค์˜ ๊ฒฐ๊ณผ๋ฅผ ํ†ตํ•ด ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์€ Reciprocal Velocity Obstacle (RVO) ๋˜๋Š” Optimal Reciprocal Collision Avoidance (ORCA)์™€ ๊ฐ™์€ ๊ธฐ์กด์˜ ๋ฐฉ๋ฒ•๋“ค๊ณผ ๋น„๊ตํ•˜์˜€์„ ๋•Œ ํ–ฅ์ƒ๋œ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ๊ฒŒ๋‹ค๊ฐ€, ํ•™์Šต์˜ ํšจ์œจ์„ฑ ๋˜ํ•œ ๊ธฐ์กด์˜ ํ•™์Šต ๊ธฐ๋ฐ˜์˜ ๋ฐฉ๋ฒ•๋“ค์— ๋น„ํ•ด ๋†’์€ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์˜€๋‹ค.This thesis proposes a reinforcement learning based collision avoidance method. The problem can be defined as an ability of a robot to reach its goal point without colliding with other robots and obstacles. There are two kinds of collision avoidance problem, single robot and multi-robot collision avoidance. Single robot collision avoidance problem contains multiple dynamic obstacles and one agent robot. The objective of the agent robot is to reach its goal point and avoid obstacles with random dynamics. Multi-robot collision avoidance problem contains multiple agent robots. It is also possible to include unknown dynamic obstacles to the problem. The agents should reach their own goal points without colliding with each other. If the environment contains unknown obstacles, the agents should avoid them also. To solve the problems, Collision Avoidance by Learning Collision (CALC) is proposed. CALC adopts the concept of reinforcement learning. The method is divided into two environments, training and planning. The training environment consists of one agent, one obstacle, and a training range. In the training environment, the agent learns how to collide with the obstacle and generates a colliding policy. In other words, when the agent collides with the obstacle, it receives positive reward. On the other hand, when the agent escapes the training range without collision, it receives negative reward. The planning environment contains multiple obstacles or robots and a single goal point. With the trained policy, the agent can solve the collision avoidance problem in the planning environment regardless of its dimension. Since the method learned collision, the generated policy should be inverted in the planning environment to avoid obstacles or robots. However, the policy should be applied directly for the goal point so that the agent can `collide' with the goal. With the combination of both policies, the agent can avoid the obstacles or robots and reach to the goal point simultaneously. In the training algorithm, the robot is assumed to be a holonomic robot. Even though the trained policy is generated from the holonomic robot, the method can be applied to both holonomic and non-holonomic robots by holonomic to non-holonomic converting method. CALC is applied to three problems, single holonomic robot, single non-holonomic robot, and multiple non-holonomic robot collision avoidance. The proposed method is validated both in the robot simulation and real-world experiment. For simulation, Robot Operating System (ROS) based simulator called Gazebo and simple game library PyGame are used. The method is tested with both holonomic and non-holonomic robots in the simulation experiment. For real-world planning experiment, non-holonomic mobile robot named e-puck is used. The learned policy from the simulation can be directly applied to the real-world robot without any calibration or retraining. 
    The result shows that the proposed method outperforms existing methods such as Reciprocal Velocity Obstacle (RVO), PrEference Appraisal Reinforcement Learning (PEARL), and Optimal Reciprocal Collision Avoidance (ORCA). In addition, the proposed method is shown to be more efficient in terms of learning than existing learning-based methods.
    Contents:
    1. Introduction
       1.1 Motivations
       1.2 Contributions
       1.3 Organizations
    2. Related Work
       2.1 Reinforcement Learning
       2.2 Classical Navigation Methods
       2.3 Learning-Based Navigation Methods
    3. Learning Collision
       3.1 Introduction
       3.2 Learning Collision
           3.2.1 Markov Decision Process Setup
           3.2.2 Training Algorithm
           3.2.3 Experimental Results
    4. Single Robot Collision Avoidance
       4.1 Introduction
       4.2 Holonomic Robot Obstacle Avoidance
           4.2.1 Approach
           4.2.2 Experimental Results
       4.3 Non-Holonomic Robot Obstacle Avoidance
           4.3.1 Approach
           4.3.2 Experimental Results
    5. Multi-Robot Collision Avoidance
       5.1 Introduction
       5.2 Approach
       5.3 Experimental Results
           5.3.1 Simulated Experiment
           5.3.2 Real-World Experiment
           5.3.3 Holonomic to Non-Holonomic Conversion Experiment
    6. Conclusion
    Bibliography
    Abstract (in Korean)
    Acknowledgements
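    The planning step of CALC, as described above, reuses a single learned "colliding" policy: inverted for obstacles and applied directly to the goal. The following Python sketch illustrates that combination under strong simplifications; the placeholder q_collide function, the discrete action set, and the score summation are assumptions for illustration, not the thesis's actual trained policy.

```python
import numpy as np

# Sketch of combining an inverted "colliding" policy (for obstacles) with the same
# policy applied directly (for the goal). q_collide stands in for a learned
# action-value function over the relative position of a single entity.

_DIRS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1],
                  [1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
ACTIONS = _DIRS / np.linalg.norm(_DIRS, axis=1, keepdims=True)   # unit step directions

def q_collide(rel_pos):
    """Placeholder for the trained colliding policy: prefers moving toward the entity."""
    direction = rel_pos / (np.linalg.norm(rel_pos) + 1e-6)
    return ACTIONS @ direction            # one score per discrete action

def plan_action(robot_pos, goal_pos, obstacle_positions):
    scores = np.zeros(len(ACTIONS))
    # goal: apply the colliding policy directly ("collide" with the goal)
    scores += q_collide(goal_pos - robot_pos)
    # obstacles: invert the colliding policy so that it repels the robot
    for obs in obstacle_positions:
        scores -= q_collide(obs - robot_pos)
    return ACTIONS[int(np.argmax(scores))]

# Example: attracted to a goal on the right, repelled by an obstacle above.
step = plan_action(np.zeros(2), np.array([5.0, 0.0]), [np.array([0.0, 1.0])])
```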

    Modeling and Control Strategies for a Two-Wheel Balancing Mobile Robot

    Get PDF
    The problem of balancing and autonomously navigating a two-wheel mobile robot is an increasingly active area of research, due to its potential applications in last-mile delivery, pedestrian transportation, warehouse automation, parts supply, agriculture, surveillance, and monitoring. This thesis investigates the design and control of a two-wheel balancing mobile robot using three different control strategies: Proportional Integral Derivative (PID) control, Sliding Mode Control, and Deep Q-Learning. The mobile robot is described by a dynamic and kinematic model, and its motion is simulated in a custom MATLAB/Simulink environment. The first part of the thesis focuses on developing the dynamic and kinematic model of the mobile robot. The robot dynamics are derived using the classical Euler-Lagrange method, in which motion is described through the potential and kinetic energies of the bodies. Non-holonomic constraints are included in the model, via the method of Lagrange multipliers, to achieve the desired motion, such as preventing the robot from drifting. Navigation is developed using artificial potential field path planning to generate a map of velocity vectors that serve as set points for linear velocity and yaw rate. The second part of the thesis focuses on developing and evaluating three control strategies for the mobile robot: PID controllers, Hierarchical Sliding Mode Control, and Deep Q-Learning. The performance of the different control strategies is evaluated and compared on various metrics, such as stability, robustness to mass variations and disturbances, and tracking accuracy. The implementation and evaluation of these strategies are modeled and tested in a MATLAB/Simulink virtual environment.
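    As a small illustration of the navigation layer described above, the sketch below computes an artificial-potential-field velocity vector and converts it into linear-velocity and yaw-rate set points; the gains, influence radius, and saturation limits are assumed values, not those used in the thesis.

```python
import numpy as np

# Minimal artificial potential field (APF) sketch producing the linear-velocity and
# yaw-rate set points mentioned in the abstract. All constants are illustrative.

K_ATT, K_REP, RHO0 = 1.0, 0.5, 2.0     # attraction/repulsion gains, influence radius [m]

def apf_velocity(pos, goal, obstacles):
    """Desired planar velocity vector from attractive and repulsive potentials."""
    v = K_ATT * (goal - pos)                           # attractive term toward the goal
    for obs in obstacles:
        d = np.linalg.norm(pos - obs)
        if 1e-6 < d < RHO0:                            # repulsion only inside influence radius
            v += K_REP * (1.0 / d - 1.0 / RHO0) / d**2 * (pos - obs) / d
    return v

def to_setpoints(v, heading, v_max=0.5, wz_max=1.0):
    """Convert the velocity vector into linear-velocity and yaw-rate set points."""
    v_ref = np.clip(np.linalg.norm(v), 0.0, v_max)
    heading_err = np.arctan2(v[1], v[0]) - heading
    heading_err = np.arctan2(np.sin(heading_err), np.cos(heading_err))   # wrap to [-pi, pi]
    wz_ref = np.clip(2.0 * heading_err, -wz_max, wz_max)
    return v_ref, wz_ref

# Example: robot at the origin heading along +x, goal at (3, 2), obstacle at (1, 1).
v = apf_velocity(np.zeros(2), np.array([3.0, 2.0]), [np.array([1.0, 1.0])])
v_ref, wz_ref = to_setpoints(v, heading=0.0)
```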

    Deep Reinforcement Learning for Autonomous Navigation of Mobile Robots in Indoor Environments

    Get PDF
    Conventional autonomous navigation frameworks for mobile robots are highly modularized, with subsystems such as localization, perception, mapping, planning, and control. Although these provide easy interpretation, they depend heavily on a known map of the robot's surroundings to navigate a cluttered environment. Local planners such as the Dynamic Window Approach (DWA) require a map containing all surrounding obstacles in order to calculate an optimal collision-free trajectory to the goal. Planning and tracking a collision-free path without knowing the obstacle locations is a challenging task. Since the advent of deep learning techniques, deep reinforcement learning has proven to be a powerful learning framework for robotic tasks. It has demonstrated wide success in complex games such as Go and StarCraft, which have high-dimensional state and action spaces. However, it has rarely been used in real-world applications because of the Sim-2-Real challenges in transferring a trained RL policy to the real world. In this work, we propose a novel framework for autonomously navigating a mobile robot in a cluttered space, without known locations of the obstacles in its surroundings, using deep reinforcement learning techniques. The proposed method is a modular and scalable approach due to a strategic design of the training environment. It uses constrained space and randomization techniques to learn an effective reinforcement learning policy in less simulation training time. The state vector consists of the target location in the mobile robot coordinate frame and, for the obstacle avoidance task, an additional 36-dimensional lidar vector. We demonstrate the optimal discrete action policy on a TurtleBot in the real world. We also address some key challenges in robot pose estimation for autonomous driving tasks.
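    To make the described observation and action design concrete, the sketch below assembles a state vector from a downsampled laser scan and the goal expressed in the robot frame, and maps a discrete action index to velocity commands; the downsampling scheme, normalization, and action values are assumptions for illustration, not the authors' exact configuration.

```python
import numpy as np

# Illustrative state construction (36-beam lidar slice plus goal in the robot frame)
# and a small discrete action set of the kind the abstract describes.

def build_state(scan_ranges, robot_pose, goal_xy, max_range=3.5):
    """Concatenate a 36-dimensional lidar vector with the goal in the robot frame."""
    scan = np.clip(np.nan_to_num(scan_ranges, nan=max_range), 0.0, max_range)
    idx = np.linspace(0, len(scan) - 1, 36).astype(int)      # downsample to 36 beams
    lidar = scan[idx] / max_range                             # normalize to [0, 1]

    x, y, yaw = robot_pose
    dx, dy = goal_xy[0] - x, goal_xy[1] - y
    dist = np.hypot(dx, dy)
    bearing = np.arctan2(dy, dx) - yaw                        # goal bearing in robot frame
    bearing = np.arctan2(np.sin(bearing), np.cos(bearing))
    return np.concatenate([lidar, [dist, bearing]])           # 38-dimensional state

# Discrete action set: fixed forward speed with a handful of yaw rates (assumed values).
ACTIONS = [(0.15, w) for w in (-1.5, -0.75, 0.0, 0.75, 1.5)]  # (v [m/s], omega [rad/s])

def act(q_values):
    """Greedy discrete action; q_values would come from the trained policy network."""
    v, omega = ACTIONS[int(np.argmax(q_values))]
    return v, omega
```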