Revisiting the Classics: Online RL in the Programmable Dataplane

Abstract

Data-driven networking is becoming more capable and widely researched, driven in part by the efficacy of Deep Reinforcement Learning (DRL) algorithms. Yet the complexity of both DRL inference and learning forces these tasks to be pushed out of the dataplane and onto hosts, harming latency-sensitive applications. Online learning of such policies cannot occur in the dataplane, despite being a useful technique when problems evolve or are hard to model. We present OPaL—On Path Learning—the first work to bring online reinforcement learning to the dataplane. OPaL makes online learning possible on constrained SmartNIC hardware by returning to classical RL techniques—avoiding neural networks. Our design allows weak yet highly parallel SmartNIC NPUs to be competitive with commodity x86 hosts, despite having fewer features and slower cores. Compared to hosts, we achieve a 21× reduction in 99.99th-percentile tail inference times to 34 µs, and a 9.9× improvement in online throughput for real-world policy designs. In-NIC execution eliminates PCIe transfers, and our asynchronous compute model ensures minimal impact on traffic carried by a co-hosted P4 dataplane. OPaL's design scales with additional resources at compile time to improve both decision latency and throughput, and is quickly reconfigurable at runtime compared to reinstalling device firmware.
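To make concrete what "classical RL techniques—avoiding neural networks" can look like in this setting, the sketch below shows a generic tabular SARSA update in C. This is an illustration only: the abstract does not specify OPaL's exact algorithm, state encoding, or action set, so all names, table sizes, and constants here are hypothetical. The point is that constant-time table lookups and a single multiply-add per decision are the kind of operations that fit weak, highly parallel NPU cores.

```c
/* Hypothetical sketch of a classical (non-neural) online RL update.
 * Illustrative only: OPaL's actual algorithm, state/action encoding,
 * and arithmetic (e.g. fixed-point on NPUs) are not given in the abstract. */
#include <stdio.h>

#define N_STATES  64   /* e.g. a quantised congestion/queue signal      */
#define N_ACTIONS 4    /* e.g. a small set of candidate rate choices    */

static float Q[N_STATES][N_ACTIONS];   /* value table small enough for NIC memory */

/* One online SARSA step: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a)). */
static void sarsa_update(int s, int a, float r, int s2, int a2,
                         float alpha, float gamma)
{
    float td_error = r + gamma * Q[s2][a2] - Q[s][a];
    Q[s][a] += alpha * td_error;
}

/* Greedy action selection over the small, fixed action set. */
static int greedy_action(int s)
{
    int best = 0;
    for (int a = 1; a < N_ACTIONS; a++)
        if (Q[s][a] > Q[s][best])
            best = a;
    return best;
}

int main(void)
{
    /* Toy step: act in state 3, observe reward 1.0, land in state 5. */
    int s = 3, a = greedy_action(s);
    int s2 = 5, a2 = greedy_action(s2);
    sarsa_update(s, a, 1.0f, s2, a2, 0.1f, 0.9f);
    printf("Q[3][%d] = %f\n", a, Q[3][a]);
    return 0;
}
```

Because both inference (a table lookup plus an argmax over a few actions) and learning (one temporal-difference update) are branch-light and bounded, such updates can plausibly run per-decision on NIC cores without the neural-network inference that normally forces RL off the fast path.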
