Deep Reinforcement Learning for URLLC data management on top of scheduled eMBB traffic
With the advent of 5G and the research into beyond 5G (B5G) networks, a novel
and very relevant research issue is how to manage the coexistence of different
types of traffic, each with very stringent but completely different
requirements. In this paper we propose a deep reinforcement learning (DRL)
algorithm to slice the available physical layer resources between
ultra-reliable low-latency communications (URLLC) and enhanced Mobile BroadBand
(eMBB) traffic. Specifically, in our setting the time-frequency resource grid
is fully occupied by eMBB traffic, and we train the DRL agent with proximal
policy optimization (PPO), a state-of-the-art DRL algorithm, to dynamically
allocate the incoming URLLC traffic by puncturing eMBB codewords. Assuming that
each eMBB codeword can tolerate a certain limited amount of puncturing, beyond
which it is in outage, we show that the policy devised by the DRL agent never
violates the latency requirement of URLLC traffic and, at the same time,
manages to keep the number of eMBB codewords in outage at minimum levels, when
compared to other state-of-the-art schemes.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
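
The puncturing model described in the abstract can be illustrated with a minimal sketch. This is not the paper's PPO agent: it is a simple greedy baseline, and the class name `Codeword`, the function `allocate_urllc`, the codeword size, and the tolerance values are all hypothetical choices made for illustration. Each URLLC packet is placed immediately by puncturing one eMBB codeword, so latency is met by construction, and a codeword is counted as in outage once its punctured resources exceed its tolerance.

```python
from dataclasses import dataclass


@dataclass
class Codeword:
    """One eMBB codeword on the resource grid (all fields are assumed values)."""
    size: int           # resources occupied by the codeword
    tolerance: int      # punctured resources tolerated before outage
    punctured: int = 0  # resources taken over by URLLC so far

    @property
    def budget(self) -> int:
        # Remaining puncturing the codeword can absorb without outage.
        return self.tolerance - self.punctured

    @property
    def in_outage(self) -> bool:
        # Outage occurs only when puncturing goes beyond the tolerance.
        return self.punctured > self.tolerance


def allocate_urllc(codewords, n_packets):
    """Greedy baseline (not PPO): puncture the codeword with the most budget left.

    Every packet is served at once, so no URLLC packet waits.
    Returns the number of eMBB codewords in outage afterwards.
    """
    for _ in range(n_packets):
        target = max(codewords, key=lambda cw: cw.budget)
        target.punctured += 1
    return sum(cw.in_outage for cw in codewords)


codewords = [Codeword(size=12, tolerance=2) for _ in range(3)]
outages = allocate_urllc(codewords, n_packets=6)
print(outages)
```

A learned policy such as the one in the paper would replace the `max(...)` selection rule with an action chosen from the observed grid state; the outage count returned here is the kind of quantity such a policy would aim to keep low.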