Reinforcement Learning-based Access Schemes in Cognitive Radio Networks

Abstract

In this thesis, we propose MAC protocols based on three Reinforcement Learning (RL) approaches, namely Q-Learning, Deep Q-Network (DQN), and Deep Deterministic Policy Gradient (DDPG). We exploit primary user (PU) feedback, in the form of ARQ and CQI bits, to enhance the performance of the secondary user (SU) MAC protocols; this feedback exploitation can be applied on top of any sensing-based SU MAC protocol. Our model rests on two main pillars: an infinite-state Partially Observable Markov Decision Process (POMDP) that captures the system dynamics, and a queuing-theoretic model of the PU queue; the states encode whether a packet has been delivered from the PU's queue, together with the PU channel state. The proposed RL access schemes learn the best SU access probabilities without prior knowledge of the environment, exploring and exploiting discrete and continuous action spaces based on the most recently observed PU feedback. The proposed schemes outperform conventional methods under these more realistic assumptions, which is a major advantage of our MAC protocols.
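As a rough illustration of how such a scheme might be realized, the sketch below shows a tabular Q-learning SU agent whose state is the last observed PU feedback (an ARQ bit and a quantized CQI level) and whose discrete actions are candidate access probabilities. This is a minimal sketch under stated assumptions, not the thesis implementation: the environment interface (env.reset, env.step), the reward definition, and all parameter values are hypothetical.

# Illustrative sketch only: tabular Q-learning for a secondary user whose
# observation is the last PU feedback (arq_bit, cqi_level) and whose action
# is a discrete access probability. Environment and parameters are assumed.
import random
from collections import defaultdict

ACTIONS = [0.0, 0.25, 0.5, 0.75, 1.0]   # candidate SU access probabilities (discrete action space)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount factor, exploration rate

q_table = defaultdict(lambda: [0.0] * len(ACTIONS))

def choose_action(state):
    # Epsilon-greedy selection over the discrete access probabilities.
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    values = q_table[state]
    return max(range(len(ACTIONS)), key=lambda a: values[a])

def update(state, action, reward, next_state):
    # Standard one-step Q-learning update.
    best_next = max(q_table[next_state])
    td_target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (td_target - q_table[state][action])

def run_episode(env, steps=1000):
    # env is a hypothetical simulator: reset() returns the initial PU feedback
    # state, and step(p_access) returns (next_state, reward) after the SU
    # transmits with the chosen access probability.
    state = env.reset()
    for _ in range(steps):
        action = choose_action(state)
        next_state, reward = env.step(ACTIONS[action])
        update(state, action, reward, next_state)
        state = next_state

A DQN variant would replace the table with a neural network over the same feedback-based state, and DDPG would instead output the access probability directly as a continuous action.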