Active Queue Management (AQM) is a mechanism employed to alleviate transient
congestion in the buffers of network devices such as routers and switches. Traditional
AQM algorithms use fixed thresholds, like target delay or queue occupancy, to
compute random packet drop probabilities. A very small target delay can
increase packet losses and reduce link utilization, whereas a large target delay
lowers the drop probability but may increase queueing delay. Because network
traffic is dynamic and its fluctuations can cause significant queue variations,
an AQM with a fixed threshold may not suit all applications. Consequently, we
explore the question: \textit{What is the
ideal threshold (target delay) for AQMs?} In this work, we introduce DESiRED
(Dynamic, Enhanced, and Smart iRED), a P4-based AQM that leverages precise
network feedback from In-band Network Telemetry (INT) to feed a Deep
Reinforcement Learning (DRL) model. This model dynamically adjusts the target
delay based on rewards that maximize application Quality of Service (QoS). We
evaluate DESiRED in a realistic P4-based test environment running an MPEG-DASH
service. Our findings demonstrate up to a 90x reduction in video stall and a
42x increase in high-resolution video playback quality when the target delay is
adjusted dynamically by DESiRED.

Comment: Preprint (Computer Networks under review)
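
The trade-off between a small and a large target delay can be illustrated with a minimal sketch of a threshold-based drop rule. This is a simplified RED-style linear ramp written for illustration only; it is not the DESiRED or iRED data-plane logic, and the function name `drop_probability` and the parameters `max_delay_ms` and `max_p` are assumptions introduced here.

```python
def drop_probability(queue_delay_ms: float,
                     target_delay_ms: float,
                     max_delay_ms: float,
                     max_p: float = 0.1) -> float:
    """Illustrative RED-style drop rule keyed on queueing delay.

    All names and default values are hypothetical; they are not taken
    from the paper.
    """
    if max_delay_ms <= target_delay_ms:
        raise ValueError("max_delay_ms must exceed target_delay_ms")
    # Below the target delay, no packet is dropped.
    if queue_delay_ms <= target_delay_ms:
        return 0.0
    # At or above the upper bound, every packet is dropped.
    if queue_delay_ms >= max_delay_ms:
        return 1.0
    # In between, the probability grows linearly from 0 to max_p.
    fraction = (queue_delay_ms - target_delay_ms) / (max_delay_ms - target_delay_ms)
    return max_p * fraction


if __name__ == "__main__":
    # Same observed queueing delay, two different target delays.
    for target in (5.0, 20.0):
        p = drop_probability(30.0, target, 50.0)
        print(f"target={target} ms -> drop probability {p:.4f}")
```

Under these assumptions, the same observed queueing delay yields a higher drop probability with the smaller target delay, which is the fixed-threshold trade-off that motivates adjusting the target delay dynamically.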