Landing Throttleable Hybrid Rockets with Hierarchical Reinforcement Learning in a Simulated Environment

Abstract

In this paper, I develop a hierarchical Markov Decision Process (MDP) structure for the task of vertical rocket landing. I begin by covering the background of the problem and formally defining its constraints. To reduce mistakes when formulating different MDPs, I define criteria for a standardized MDP definition format. I then decompose the problem into sub-problems of vertical landing, namely velocity control and vertical stability control. By exploiting MDP coupling and symmetry properties, I significantly reduce the size of the state space compared to a unified MDP formulation. This paper makes two major contributions: 1) a standardized MDP definition framework and 2) a hierarchical MDP structure that successfully lands the rocket within the goal bounds more than 95% of the time. I validate this approach by comparing its performance to a baseline Rapidly-exploring Random Tree (RRT) search, which underlines the advantages of rapid decision-making over online planning in the field of Artificial Intelligence (AI).

Similar works