A Distributed Reinforcement Learning Approach To Maximize Resource Utilization


A new scheme to maximize resource utilization in a cellular network, while respecting constraints on the handover dropping probability, is proposed and analyzed. The constraints are set for each traffic class separately and must be respected by the network in a localized manner, independently of the area. The problem is formulated as a Markov Decision Process (MDP) and solved using the model-free, simulation-based Q-learning algorithm, which runs at each cell. The handover limit is integrated into the model by observing which new call arrivals, at a particular state of the system, are chiefly responsible for violating the handover dropping limit. Through trial and error, the algorithm statistically eliminates those new admissions that cause excessive dropping. Results obtained with the proposed Reinforcement Learning (RL) based approach are compared against a resource allocation that accounts for heterogeneous and unevenly distributed traffic over the geographical area under consideration. For the scenarios examined, the two approaches perform comparably, with an advantage for RL in blocking probability and utilization.
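The per-cell Q-learning admission control described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the state `(free_channels, traffic_class)`, the accept/reject action set, and the reward shaping (call revenue minus a penalty when the handover-dropping limit is exceeded) are all simplifying assumptions introduced here.

```python
import random

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate
ACTIONS = (0, 1)                    # 0 = reject new call, 1 = accept

# Tabular Q-values, keyed by ((free_channels, traffic_class), action).
Q = {}

def q_value(state, action):
    return Q.get((state, action), 0.0)

def choose_action(state):
    """Epsilon-greedy selection over accept/reject for an arriving new call."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_value(state, a))

def q_update(state, action, reward, next_state):
    """Standard one-step Q-learning update."""
    best_next = max(q_value(next_state, a) for a in ACTIONS)
    Q[(state, action)] = q_value(state, action) + ALPHA * (
        reward + GAMMA * best_next - q_value(state, action))

def reward_for(action, handover_limit_exceeded, penalty=10.0):
    """Illustrative reward shaping (an assumption, not from the paper):
    accepting a call earns unit revenue, while a violated handover-dropping
    limit incurs a large penalty."""
    revenue = 1.0 if action == 1 else 0.0
    return revenue - (penalty if handover_limit_exceeded else 0.0)
```

In states where accepting a given traffic class repeatedly coincides with handover-dropping violations, the penalty drives the Q-value of "accept" below that of "reject", so the greedy policy statistically eliminates those admissions, mirroring the trial-and-error behavior the abstract describes.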

This paper was published in CiteSeerX.
