Deep Reinforcement Learning with semi-expert distillation for autonomous UAV cinematography
Unmanned Aerial Vehicles (UAVs, or drones) have revolutionized modern media production. Being rapidly deployable “flying cameras”, they can easily capture aesthetically pleasing aerial footage of static or moving filming targets/subjects. Current approaches rely either on manual UAV/gimbal control by human experts or on a combination of complex computer vision algorithms and hardware configurations for automating the flying process. This paper explores an efficient Deep Reinforcement Learning (DRL) alternative, which implicitly merges the target detection and path planning steps into a single algorithm. To achieve this, a baseline DRL approach is augmented with a novel policy distillation component, which transfers knowledge from a suitable, semi-expert Model Predictive Control (MPC) controller into the DRL agent. Thus, the latter is able to autonomously execute a specific UAV cinematography task with purely visual input. Unlike the MPC controller, the proposed DRL agent does not need to know the 3D world position of the filming target during inference. Experiments conducted in a photorealistic simulator showcase superior performance and training speed compared to the baseline agent, while surpassing the MPC controller in terms of visual occlusion avoidance.
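The policy-distillation idea described above — augmenting the DRL training objective with an imitation term toward the semi-expert MPC controller's actions — can be sketched as a combined loss. This is a minimal illustration, not the paper's implementation; the function name, the mean-squared-error imitation term, and the `weight` trade-off hyperparameter are all assumptions.

```python
import numpy as np

def distilled_loss(agent_actions, mpc_actions, rl_loss, weight=0.5):
    """Combined training objective for the DRL agent (illustrative sketch).

    agent_actions: actions proposed by the DRL policy, shape (batch, action_dim)
    mpc_actions:   actions from the semi-expert MPC teacher, same shape
    rl_loss:       scalar reinforcement-learning loss (e.g. actor-critic term)
    weight:        assumed trade-off between pure RL and distillation
    """
    # MSE toward the teacher's actions acts as the distillation term
    imitation = np.mean((agent_actions - mpc_actions) ** 2)
    return rl_loss + weight * imitation
```

With `weight = 0` the agent trains purely from reward; larger values pull it toward the MPC teacher, which the abstract credits with faster training than the baseline agent.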
Learning High-Level Policies for Model Predictive Control
The combination of policy search and deep neural networks holds the promise of automating a variety of decision-making tasks. Model Predictive Control (MPC) provides robust solutions to robot control tasks by making use of a dynamical model of the system and solving an optimization problem online over a short planning horizon. In this work, we couple probabilistic decision-making approaches and the generalization capability of artificial neural networks with this powerful online optimization by learning a deep high-level policy for the MPC (High-MPC). Conditioned on the robot's local observations, the trained neural network policy adaptively selects high-level decision variables for the low-level MPC controller, which then generates optimal control commands for the robot. First, we formulate the search for high-level decision variables for MPC as a policy search problem, specifically a probabilistic inference problem, which can be solved in closed form. Second, we propose a self-supervised learning algorithm for learning a neural network high-level policy, which is useful for online hyperparameter adaptation in highly dynamic environments. We demonstrate the importance of incorporating online adaptation into autonomous robots by using the proposed method to solve a challenging control problem, where the task is to control a simulated quadrotor to fly through a swinging gate. We show that our approach can handle situations that are difficult for standard MPC …
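The two-level structure described in this abstract — a learned high-level policy that maps local observations to decision variables, feeding a low-level MPC that solves a short-horizon optimization — can be sketched as follows. This is a toy illustration under stated assumptions: the one-layer "policy", the single-integrator dynamics, and the gradient-descent solver are stand-ins, not the paper's High-MPC formulation.

```python
import numpy as np

def high_level_policy(obs, W, b):
    """Hypothetical trained network (a single linear layer for illustration):
    maps the robot's local observation to a scalar decision variable z for
    the MPC, e.g. a parameter shaping the reference through the gate."""
    return float(W @ obs + b)

def low_level_mpc(state, z, horizon=10):
    """Toy receding-horizon MPC: tracks a reference shaped by the high-level
    decision variable z, using 1-D single-integrator dynamics and plain
    gradient descent as a stand-in for a real online solver."""
    u = np.zeros(horizon)
    ref = np.linspace(state, z, horizon)          # reference shaped by z
    for _ in range(50):                           # crude quadratic-cost descent
        x = state + np.cumsum(u)                  # rollout of the dynamics
        err = x - ref
        grad = np.cumsum(err[::-1])[::-1] * 2 / horizon
        u -= 0.05 * grad
    return u[0]                                   # apply only the first control
```

At each control step the policy picks `z` from the current observation and the MPC turns it into a command, which mirrors the adaptive selection of high-level decision variables the abstract describes.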