Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments

Author: Dasagi Vibhavari
Milford Michael
Rana Krishan
Sünderhauf Niko
Talbot Ben
Publication date: 11/03/2020

In this work we focus on improving the efficiency and generalisation of learned navigation strategies when they are transferred from their training environment to previously unseen ones. We present an extension of the residual reinforcement learning framework from the robotic manipulation literature and adapt it to the vast and unstructured environments that mobile robots can operate in. The concept is based on learning a residual control effect to add to a typical sub-optimal classical controller in order to close the performance gap, whilst guiding the exploration process during training for improved data efficiency. We exploit this tight coupling and propose a novel deployment strategy, switching Residual Reactive Navigation (sRRN), which yields efficient trajectories whilst probabilistically switching to a classical controller in cases of high policy uncertainty. Our approach achieves improved performance over end-to-end alternatives and can be incorporated as part of a complete navigation stack for cluttered indoor navigation tasks in the real world. The code and training environment for this project are made publicly available at https://sites.google.com/view/srrn/home.
Comment: Accepted as a conference paper at ICRA2020. Project site available at https://sites.google.com/view/srrn/hom
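The residual idea in the abstract above (a learned correction added to a classical controller, with a probabilistic fallback to that controller when the policy is uncertain) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the proportional controller, the linear switching rule, the `threshold` value, and all function names are assumptions made for the example.

```python
import random


def classical_controller(goal_heading: float) -> float:
    """Hypothetical classical prior: proportional steering toward the goal, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, 0.5 * goal_heading))


def residual_action(prior: float, residual: float) -> float:
    """Add the learned residual to the classical command and clip to the actuator range."""
    return max(-1.0, min(1.0, prior + residual))


def switching_action(prior: float, residual: float, uncertainty: float,
                     threshold: float = 0.3, rng=random) -> float:
    """Return the classical command with a probability that grows with policy uncertainty.

    The linear ramp and the threshold (which must be positive) are assumptions
    chosen for illustration; the abstract does not specify the switching rule.
    """
    p_fallback = min(1.0, max(0.0, uncertainty / threshold))
    if rng.random() < p_fallback:
        return prior  # high uncertainty: trust the classical controller
    return residual_action(prior, residual)  # otherwise use prior + residual
```

With zero uncertainty the sketch always returns the combined command, and once the uncertainty reaches the threshold it always returns the classical command.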

arXiv.org e-Print Archive

Crossref

Queensland University of Technology ePrints Archive

Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning

Author: Agrawal Pulkit
Darrell Trevor
Jabri Allan
Li Richard
Publication date: 23/12/2019

Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements. Many practical tasks require manipulation of multiple objects, and the complexity of such tasks increases with the number of objects. Learning from a curriculum of increasingly complex tasks appears to be a natural solution, but unfortunately, does not work for many scenarios. We hypothesize that the inability of the state-of-the-art algorithms to effectively utilize a task curriculum stems from the absence of inductive biases for transferring knowledge from simpler to complex tasks. We show that graph-based relational architectures overcome this limitation and enable learning of complex tasks when provided with a simple curriculum of tasks with increasing numbers of objects. We demonstrate the utility of our framework on a simulated block stacking task. Starting from scratch, our agent learns to stack six blocks into a tower. Despite using step-wise sparse rewards, our method is orders of magnitude more data-efficient and outperforms the existing state-of-the-art method that utilizes human demonstrations. Furthermore, the learned policy exhibits zero-shot generalization, successfully stacking blocks into taller towers and previously unseen configurations such as pyramids, without any further training.
Comment: 10 pages, 4 figures and 1 table in main article, 3 figures and 3 tables in appendix. Supplementary website and videos at https://richardrl.github.io/relational-rl
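One reason a graph-based relational model can follow a curriculum with a growing number of objects is that its parameters are shared across objects, so the same weights apply to two blocks or to six. The sketch below shows a generic message-passing step with that property. It is not the architecture from the paper: the mean aggregation, the two scalar weights `w_self` and `w_msg`, and the function name are assumptions for illustration.

```python
from typing import List

Vector = List[float]


def message_passing_step(nodes: List[Vector], w_self: float, w_msg: float) -> List[Vector]:
    """One round of message passing over a fully connected graph of objects.

    The same two weights are used for every object, so the number of
    parameters does not depend on how many objects are in the scene.
    """
    n = len(nodes)
    dim = len(nodes[0])
    updated = []
    for i, node in enumerate(nodes):
        # Average the features of all other objects (the incoming "messages").
        others = [nodes[j] for j in range(n) if j != i]
        if others:
            mean_msg = [sum(o[d] for o in others) / len(others) for d in range(dim)]
        else:
            mean_msg = [0.0] * dim
        # Combine own features with the aggregated message, then apply a ReLU.
        updated.append([max(0.0, w_self * node[d] + w_msg * mean_msg[d]) for d in range(dim)])
    return updated
```

Because the update treats objects symmetrically, reordering the input objects simply reorders the output, which is the kind of inductive bias the abstract refers to.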

arXiv.org e-Print Archive

Crossref

DSpace@MIT

The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation

Author: Kasaei S. Hamidreza
Melsen Jorik
Steenkist Christiaan
van Beers Floris
Voncina Klemen
Publication date: 16/08/2020

Service robots are appearing more and more in our daily life. The development of service robots combines multiple fields of research, from object perception to object manipulation. The state of the art continues to improve towards a proper coupling between object perception and manipulation. This coupling is necessary for service robots not only to perform various tasks in a reasonable amount of time but also to continually adapt to new environments and safely interact with non-expert human users. Nowadays, robots are able to recognize various objects and quickly plan a collision-free trajectory to grasp a target object in predefined settings. In most cases, however, they rely on large amounts of training data. The knowledge of such robots is therefore fixed after the training phase, and any changes in the environment require complicated, time-consuming, and expensive robot re-programming by human experts. These approaches are thus still too rigid for real-life applications in unstructured environments, where a significant portion of the environment is unknown and cannot be directly sensed or controlled. In such environments, no matter how extensive the training data used for batch learning, a robot will always face new objects. Therefore, apart from batch learning, the robot should be able to continually learn about new object categories and grasp affordances from very few training examples on-site. Moreover, apart from robot self-learning, non-expert users could interactively guide the process of experience acquisition by teaching new concepts, or by correcting insufficient or erroneous concepts. In this way, the robot will constantly learn how to help humans in everyday tasks by gaining more and more experiences without the need for re-programming.

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Human-Robot Cooperation Based on Semantic Scene Understanding

Author: 문지윤
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2020

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2020. 2. 이범희.
Human-robot cooperation is unavoidable in various applications ranging from manufacturing to field robotics, owing to its adaptability and high flexibility. In particular, complex task planning in large, unstructured, and uncertain environments can employ the complementary capabilities of humans and diverse robots. For a team to be effective, knowledge regarding team goals and the current situation needs to be shared effectively, as it affects decision making. In this respect, semantic scene understanding in natural language is one of the most fundamental components for information sharing between humans and heterogeneous robots, as robots can perceive the surrounding environment in a form that both humans and other robots can understand. Moreover, natural-language-based scene understanding can reduce network congestion and improve the reliability of acquired data. In field robotics especially, transmission of raw sensor data increases network bandwidth usage and decreases quality of service. We can resolve this problem by transmitting information in the form of natural language that encodes semantic representations of environments. In this dissertation, I introduce a human and heterogeneous robot cooperation scheme based on semantic scene understanding. I generate sentences and scene graphs, which are natural-language-grounded graphs over the detected objects and their relationships, from the graph map generated using a robot mapping algorithm. Subsequently, a framework that can utilize the results for cooperative mission planning of humans and robots is proposed. Experiments were performed to verify the effectiveness of the proposed methods. This dissertation comprises two parts: graph-based scene understanding and scene understanding based on the cooperation between human and heterogeneous robots. For the former, I introduce a novel natural language processing method using a semantic graph map.
Although semantic graph maps have been widely applied to study the perceptual aspects of the environment, such maps have not found extensive application in natural language processing tasks. Several studies have been conducted on the understanding of workspace images in the field of computer vision; in these studies, sentences were generated automatically, but multiple scenes have not yet been utilized for sentence generation. A graph-based convolutional neural network, which comprises spectral graph convolution and graph coarsening, and a recurrent neural network are employed to generate sentences with attention over graphs. The proposed method outperforms the conventional methods on a publicly available dataset for single scenes and can be utilized for sequential scenes. Recently, deep learning has demonstrated impressive developments in scene understanding using natural language. However, it has not been extensively applied to high-level processes such as causal reasoning, analogical reasoning, or planning. The symbolic approach, which calculates the sequence of appropriate actions by combining the available skills of agents, excels at reasoning and planning; however, it does not entirely consider semantic knowledge acquisition for human-robot information sharing. For scene understanding based on human-robot cooperation, an architecture is proposed that combines deep learning techniques and a symbolic planner so that humans and heterogeneous robots can achieve a shared goal based on semantic scene understanding. In this study, graph-based perception is used for scene understanding. A planning domain definition language (PDDL) planner and JENA-TDB are utilized for mission planning and data acquisition storage, respectively.
The effectiveness of the proposed method is verified in two situations: a mission failure, in which the dynamic environment changes, and object detection in a large and unseen environment.
Cooperation between humans and heterogeneous robots is inevitable in fields ranging from manufacturing to field robotics, because it offers high flexibility and adaptability. In particular, a team of humans and robots with different capabilities has the major advantage that its members complement one another, which makes complex missions feasible in large, unstructured spaces. To work as an efficient team, the members must be able to share information about the team's common goal and each member's current situation in real time, and to make decisions together. From this perspective, semantic scene understanding through natural language is the most essential component, because it perceives the environment in a form that both humans and different robots can understand. In addition, natural-language-based scene understanding avoids network congestion and thereby improves the reliability of the acquired information. In field robotics in particular, where transmitting large volumes of sensor data frequently increases network bandwidth usage and degrades communication quality of service (QoS), transmitting natural language as the semantic environment information can reduce bandwidth usage and improve QoS reliability. This dissertation introduces a human-robot cooperation method based on semantic understanding of the environment. First, using the graph map obtained from a robot mapping algorithm, it generates natural-language sentences and a graph that expresses the detected objects and the relationships between them in natural-language words. It then proposes a framework that uses the natural language processing results so that humans and diverse robots can cooperate to carry out missions. The dissertation consists of two main parts: graph-based semantic scene understanding, and cooperation between humans and heterogeneous robots through semantic scene understanding. The first part introduces a new natural language processing method that uses a semantic graph map. Semantic graph mapping has been studied extensively for robot perception, but natural language processing methods that use such maps have received little attention. Conversely, computer vision has produced much work on image-based scene understanding, but that work is limited in handling sequential scenes. We therefore generate sentences describing the graph using a recurrent neural network and a graph convolutional neural network composed of graph convolution based on spectral graph theory and graph coarsening layers. The proposed method outperforms existing methods on single scenes and also successfully generates natural-language sentences for sequential scenes. Deep learning has recently made rapid progress in natural-language-based scene perception, but it is difficult to apply to high-level processes such as causal reasoning, analogical reasoning, and mission planning. The symbolic approach, which computes a sequence of actions suited to each agent's capabilities, performs well in reasoning and mission planning but rarely addresses how semantic information is shared between humans and robots. The second part therefore proposes a framework that connects deep learning techniques with a symbolic planner, enabling cooperation between humans and heterogeneous robots through semantic understanding. For semantic understanding of the surroundings, we use the graph-based natural-language sentence generation proposed in the first part. A PDDL planner and JENA-TDB are used for mission planning and as the store for acquired information, respectively. The effectiveness of the proposed method is verified in simulation for two situations: a mission failure in a dynamic environment, and a search for objects in a large space.
1 Introduction
1.1 Background and Motivation
1.2 Literature Review
1.2.1 Natural Language-Based Human-Robot Cooperation
1.2.2 Artificial Intelligence Planning
1.3 The Problem Statement
1.4 Contributions
1.5 Dissertation Outline
2 Natural Language-Based Scene Graph Generation
2.1 Introduction
2.2 Related Work
2.3 Scene Graph Generation
2.3.1 Graph Construction
2.3.2 Graph Inference
2.4 Experiments
2.5 Summary
3 Language Description with 3D Semantic Graph
3.1 Introduction
3.2 Related Work
3.3 Natural Language Description
3.3.1 Preprocess
3.3.2 Graph Feature Extraction
3.3.3 Natural Language Description with Graph Features
3.4 Experiments
3.5 Summary
4 Natural Question with Semantic Graph
4.1 Introduction
4.2 Related Work
4.3 Natural Question Generation
4.3.1 Preprocess
4.3.2 Graph Feature Extraction
4.3.3 Natural Question with Graph Features
4.4 Experiments
4.5 Summary
5 PDDL Planning with Natural Language
5.1 Introduction
5.2 Related Work
5.3 PDDL Planning with Incomplete World Knowledge
5.3.1 Natural Language Process for PDDL Planning
5.3.2 PDDL Planning System
5.4 Experiments
5.5 Summary
6 PDDL Planning with Natural Language-Based Scene Understanding
6.1 Introduction
6.2 Related Work
6.3 A Framework for Heterogeneous Multi-Agent Cooperation
6.3.1 Natural Language-Based Cognition
6.3.2 Knowledge Engine
6.3.3 PDDL Planning Agent
6.4 Experiments
6.4.1 Experiment Setting
6.4.2 Scenario
6.4.3 Results
6.5 Summary
7 Conclusion
Docto

SNU Open Repository and Archive

Deep Reinforcement Learning with Consensus for Manipulators

Author: Liu Wenxing
Publication date: 01/08/2023

The University of Manchester - Institutional Repository

Deep learning based approaches for imitation learning.

Author: Hussein Ahmed
Publication date: 31/05/2018

Imitation learning refers to an agent's ability to mimic a desired behaviour by learning from observations. The field is rapidly gaining attention due to recent advances in computational and communication capabilities as well as rising demand for intelligent applications. The goal of imitation learning is to describe the desired behaviour by providing demonstrations rather than instructions. This enables agents to learn complex behaviours with general learning methods that require minimal task-specific information. However, imitation learning faces many challenges. The objective of this thesis is to advance the state of the art in imitation learning by adopting deep learning methods to address two major challenges of learning from demonstrations. The first challenge is representing the demonstrations in a manner that is adequate for learning. We propose novel Convolutional Neural Network (CNN) based methods to automatically extract feature representations from raw visual demonstrations and learn to replicate the demonstrated behaviour. This alleviates the need for task-specific feature extraction and provides a general learning process that is adequate for multiple problems. The second challenge is generalizing a policy to situations not seen in the training demonstrations. This is a common problem because demonstrations typically show the best way to perform a task and don't offer any information about recovering from suboptimal actions. Several methods are investigated to improve the agent's generalization ability based on its initial performance. Our contributions in this area are threefold. Firstly, we propose an active data aggregation method that queries the demonstrator in situations of low confidence. Secondly, we investigate combining learning from demonstrations and reinforcement learning. A deep reward shaping method is proposed that learns a potential reward function from demonstrations.
Finally, memory architectures in deep neural networks are investigated to provide context to the agent when taking actions. Using recurrent neural networks addresses the dependency between the state-action sequences taken by the agent. The experiments are conducted in simulated environments on 2D and 3D navigation tasks that are learned from raw visual data, as well as a 2D soccer simulator. The proposed methods are compared to state-of-the-art deep reinforcement learning methods. The results show that deep learning architectures can learn suitable representations from raw visual data and effectively map them to atomic actions. The proposed methods for addressing generalization show improvements over using supervised learning and reinforcement learning alone. The results are thoroughly analysed to identify the benefits of each approach and situations in which it is most suitable.
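The active data aggregation idea mentioned in the abstract above (query the demonstrator only when the agent's confidence is low) can be sketched as below. This is a minimal illustration and not the thesis's method: using the maximum softmax probability as the confidence measure, the `threshold` value, and the names `act_or_query`, `expert`, and `dataset` are all assumptions for the example.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of action scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def act_or_query(logits: Sequence[float], state: Any,
                 expert: Callable[[Any], int],
                 dataset: List[Tuple[Any, int]],
                 threshold: float = 0.6) -> int:
    """Act with the policy when it is confident; otherwise ask the demonstrator.

    Confidence is taken to be the largest action probability (an assumption).
    Queried (state, expert action) pairs are appended to the dataset so the
    policy can later be retrained on the aggregated data.
    """
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        action = expert(state)           # low confidence: query the demonstrator
        dataset.append((state, action))  # aggregate the new labelled example
        return action
    return probs.index(confidence)       # confident: take the policy's best action
```

In this sketch the demonstrator is consulted only in uncertain states, which keeps the number of queries low while still collecting labels where the policy needs them.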

Open Access Institutional Repository at Robert Gordon University

A Robust Visual Odometry System for Autonomous Surface Vehicles using Deep Learning

Author: Eduardo Jorge Pereira Gonçalves
Publication date: 10/10/2023

Repositório Aberto da Universidade do Porto

Learning Dynamic Priority Scheduling Policies with Graph Attention Networks

Author: Wang Zheyuan
Publication venue: Georgia Institute of Technology
Publication date: 10/01/2023

The aim of this thesis is to develop novel graph attention network-based models to automatically learn scheduling policies for effectively solving resource optimization problems, covering both deterministic and stochastic environments. The policy learning methods utilize imitation learning when expert demonstrations are accessible at low cost, and reinforcement learning otherwise, when reward engineering is feasible. By parameterizing the learner with graph attention networks, the framework is computationally efficient and results in scalable resource optimization schedulers that adapt to various problem structures. This thesis addresses the problem of multi-robot task allocation (MRTA) under temporospatial constraints. Initially, robots with deterministic and homogeneous task performance are considered with the development of the RoboGNN scheduler. Then, I develop ScheduleNet, a novel heterogeneous graph attention network model, to efficiently reason about coordinating teams of heterogeneous robots. Next, I address problems under the more challenging stochastic setting in two parts. Part 1) Scheduling with stochastic and dynamic task completion times. The MRTA problem is extended by introducing human coworkers with dynamic learning curves and stochastic task execution. HybridNet, a hybrid network structure, has been developed that utilizes a heterogeneous graph-based encoder and a recurrent schedule propagator to carry out fast schedule generation in multi-round settings. Part 2) Scheduling with stochastic and dynamic task arrival and completion times. With an application in failure-predictive plane maintenance, I develop a heterogeneous graph-based policy optimization (HetGPO) approach to enable learning robust scheduling policies in highly stochastic environments. Through extensive experiments, the proposed framework has been shown to outperform prior state-of-the-art algorithms in different applications.
My research contributes several key innovations regarding designing graph-based learning algorithms in operations research.
Ph.D
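For readers unfamiliar with graph attention, the core operation is a softmax over each node's neighbours of a learned compatibility score, followed by a weighted sum of neighbour features. The sketch below shows a single attention head in plain Python. It is a simplified illustration, not RoboGNN, ScheduleNet, HybridNet, or HetGPO: it assumes the features are already linearly projected, that every node lists at least one neighbour (for example itself), and the vectors `a_src` and `a_dst` stand in for learned parameters.

```python
import math
from typing import Dict, List


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_weights(features: List[List[float]], neighbors: Dict[int, List[int]],
                      a_src: List[float], a_dst: List[float]) -> Dict[int, Dict[int, float]]:
    """Softmax-normalised attention of each node i over its neighbours j.

    The score is LeakyReLU(a_src . h_i + a_dst . h_j), which equals applying a
    single attention vector to the concatenation [h_i || h_j].
    """
    weights: Dict[int, Dict[int, float]] = {}
    for i, nbrs in neighbors.items():
        scores = [leaky_relu(dot(a_src, features[i]) + dot(a_dst, features[j])) for j in nbrs]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return weights


def aggregate(features: List[List[float]],
              weights: Dict[int, Dict[int, float]]) -> Dict[int, List[float]]:
    """Attention-weighted sum of neighbour features for every node."""
    out: Dict[int, List[float]] = {}
    for i, w in weights.items():
        dim = len(features[i])
        out[i] = [sum(alpha * features[j][d] for j, alpha in w.items()) for d in range(dim)]
    return out
```

In a scheduling setting the nodes might represent tasks and robots and the edges their constraints, but that mapping is only a guess at how such a layer would be used here.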

Scholarly Materials And Research @ Georgia Tech