In this paper, we investigate the operation of an aerial manipulator system,
namely an Unmanned Aerial Vehicle (UAV) equipped with a controllable arm with
two degrees of freedom to carry out actuation tasks on the fly. Our solution
uses Q-learning to control the trajectory of the tip of the arm, also called
the end-effector. More specifically, we develop a motion
planning model based on Time To Collision (TTC), which enables a quadrotor UAV
to navigate around obstacles while ensuring the manipulator's reachability.
Additionally, we use a model-based Q-learning approach to independently track
and control the desired trajectory of the manipulator's end-effector, given an
arbitrary baseline trajectory for the UAV platform. Such a combination enables
a variety of actuation tasks such as high-altitude welding, structural
monitoring and repair, battery replacement, gutter cleaning, skyscraper
cleaning, and power line maintenance in hard-to-reach and risky environments
while retaining compatibility with flight control firmware. Our reinforcement
learning (RL) based control mechanism yields a robust control strategy that
can handle uncertainties in the motion of the UAV.
Specifically, our method achieves 92% accuracy in terms of average displacement
error (i.e., the mean distance between the target and obtained trajectory
points) using Q-learning with 15,000 episodes.
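
As an illustration of the metric named above, the average displacement error can be sketched as the mean Euclidean distance between corresponding target and obtained end-effector points. The function name, the array layout (one row per trajectory point), and the use of NumPy are assumptions made for this sketch; how the reported accuracy percentage is derived from this error is not specified here.

```python
import numpy as np


def average_displacement_error(target, obtained):
    """Mean Euclidean distance between corresponding trajectory points.

    Both inputs are assumed to have shape (num_points, dim), with one
    end-effector position per row (hypothetical layout for this sketch).
    """
    target = np.asarray(target, dtype=float)
    obtained = np.asarray(obtained, dtype=float)
    if target.shape != obtained.shape:
        raise ValueError("trajectories must have the same shape")
    # Per-point distance along the coordinate axis, then average over points.
    return float(np.linalg.norm(target - obtained, axis=-1).mean())


# Two 2-D points: the first is off by 5 units, the second matches exactly.
ade = average_displacement_error([[0.0, 0.0], [1.0, 0.0]],
                                 [[3.0, 4.0], [1.0, 0.0]])
```

A lower value means the obtained end-effector trajectory stays closer to the target trajectory.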