Toward an Abstract Model of Programmable Data Plane Devices by Robin, Debobroto Das & Khan, Dr. Javed I.
Toward an Abstract Model of Programmable Data Plane
Devices
Debobroto Das Robin · Dr. Javed I. Khan
Abstract SDN divides the networking landscape into two parts: the control plane and the data plane. SDN expanded its footprint starting with an OpenFlow based, highly flexible control plane and a rigid data plane. Innovation and improvement in hardware design and development are bringing various new architectures for the data plane, and the data plane is becoming more programmable than ever before. A common abstract model of the data plane is required to develop complex applications over these heterogeneous data plane devices. It can also provide insight into performance optimization and benchmarking of programmable data plane devices. Moreover, to understand and utilize the data plane's programmability, a detailed structural analysis and an identifiable matrix for comparing different devices are required. In this work, an improved and structured abstract model of programmable data plane devices is presented and the features of its components are discussed in detail. Several commercially available programmable data plane devices are also compared based on those features.
Keywords Programmable data plane · Abstract
model · Hardware abstraction layer · P4 · Software
Defined Network (SDN)
1 Introduction
From the very beginning, the behavior of data plane devices has been rigid and governed by the TCP/IP protocol stack. Various efforts [21] to make this rigid environment more programmable did not gain momentum until the OpenFlow protocol [42] arrived. However, with its ever-growing protocol field set, OpenFlow cannot achieve the goal of a truly programmable network [5]. For true network programmability, the data plane hardware architecture should be decoupled from protocols and programmable in nature. Various technologies [25, 23, 31, 22, 24, 10, 41, 6, 12] have been developed to program the run time behavior of a data plane device 'on the fly'. These devices are becoming more computationally capable, and various complex application layer processing tasks are being pushed to the data plane [59, 19, 38]. This is enabling the emergence of a new paradigm, 'in-network computation' [54, 3]. Enabling such cross layer behavior and protocol independence in the data plane makes the OSI layer based definition of a switch (L2, L3, L4 switch etc.) obsolete. This raises the need for a structured discussion of two questions: how to define a programmable switch/data plane device, and how to define its programmability features? (Throughout the rest of the paper, the terms programmable switch and programmable data plane (PDP) device are used interchangeably.)
Debobroto Das Robin
Kent State University, Kent, Ohio, USA
E-mail: drobin@kent.edu
Dr. Javed I. Khan
Kent State University, Kent, Ohio, USA
E-mail: javed@kent.edu
A programmable switch needs a software stack for programming the data plane. An abstraction layer (Device and resource Abstraction Layer (DAL) [28]) is one of the key components of such a stack. It provides a uniform view of the hardware and a convenient way to develop complex constructs rather than dealing directly with actual hardware instructions. Prominent programmable data plane technologies have heterogeneous hardware internals and correspondingly different hardware abstraction layers (HAL) (Fig. 1). However, without a common abstraction layer, both data and control plane applications become tightly bound to the target architecture. For example, a program developed for RMT based architectures cannot be executed directly over a DPDK based smart-NIC. The lack of a common abstraction layer increases the cost and complexity of development, testing, performance benchmarking and formal verification of any novel network functionality built over it. Besides this, a well designed hardware abstraction layer (Logical Forwarding Plane [9]) is the centerpiece for successful use of network virtualization, network function virtualization and service chain composition [8]. Although a common abstraction layer over heterogeneous hardware architectures has various advantages, restricting to a single abstract model of hardware closes the door on future innovation in both hardware and abstraction layer design. PDP programming stacks should be decoupled from the hardware architecture and capable of accommodating multiple hardware architectures.
[Figure 1: each technology shown with its abstract model, all sitting on programmable switch/data plane hardware: OpenFlow (OpenFlow Logical Switch), DPDK (EAL), XDP/eBPF (In-Kernel VM), ODP (ODP API), VPP (Packet Processing graph), P4_14 (Abstract Forwarding Model), SAI (Switch API), Fboss (HwSwitch Abstract Model).]
Fig. 1: Abstract models used by the major SDN protocol (OpenFlow [25]) and programmable data plane technologies (DPDK [22], XDP/eBPF [31], ODP [23], VPP [24], Fboss [10], SAI [14], P4_14 [15])
arXiv:2008.08697v1 [cs.NI] 19 Aug 2020
The majority of data plane programming stacks are tightly coupled with their own abstraction. In contrast, P4 (the dominant data plane programming language) decoupled the hardware architecture definition from packet processing behavior in its latest version (P4_16 [16]). P4_16 provides separate language constructs to express an abstract hardware architecture and to develop programs based on that architecture. Taking advantage of this feature, most data plane programming stacks have already developed an interface with P4 [11, 62, 60, 51, 14, 56]. Besides this, several other architectures are also being proposed [17, 27, 12, 52, 30]. PSA [27] is one of the most mature among them. However, the majority of them lack programmability of two important components: the buffer and the scheduler. Moreover, these hardware architecture definitions, and how they process a packet, are described in informal language. This creates several issues. The most important among them are: I1) identifying a clear definition of the various components of a hardware architecture becomes hard; I2) the boundary between two components and how they connect with each other becomes unclear; I3) the exact details of how a packet is processed inside a component may differ from one hardware vendor to another; I4) developing a formal structure or framework for data plane programs becomes hard. As a consequence of I1) and I2), modular hardware/simulator/test-bed design becomes hard. As a consequence of I3), the application layer suffers. For example, if two hardware architectures do not agree on the packet processing states inside the components, taking a snapshot of a network or testing/validating a data plane program for multiple architectures becomes extremely hard. As a result of I4), understanding and comparing various PDP devices and their programmability features becomes difficult. Moreover, without a structural notion, formal analysis of data plane programs becomes hard. To tackle these challenges, a generalized yet flexible abstract model of the programmable data plane is necessary. It should be modular in nature, with a well defined structure for the components. Besides this, well defined interfaces among the components are necessary for independent development and for designing new packet processing architectures based on these components. Moreover, there should be a high level workflow of the components to ensure uniform behavior across different hardware architectures.
Considering the importance of an abstract model and the current technology landscape, we think it is important to start a discussion on a better model of the abstraction layer for programmable switches. In this work, we attempt to shed light on this topic. In doing so, we make a number of contributions. After discussing the background and related state of the art (in sec. 2), we provide a unified definition of a programmable switch (in sec. 3) which is independent of any protocol stack. Then we present the design of AVS, an example model of a Device and resource Abstraction Layer (DAL)/Hardware Abstraction Layer (HAL) for programmable data plane devices (in sec. 4). It is modular in design, and its components are defined with a uniform and structured functional interface. In discussing the details of the components (in sec. 5), first a generic structure for the components with well defined types of programmability features (Compile Time Programmability (CTP) and Run Time Configurability (RTC)) is laid out. Then the structure, workflow and programmability features of each component are analyzed in detail (in sec. 5.2 - 6.11). To present their workflow in an unambiguous and hardware independent manner, we follow the extended finite state machine (EFSM) approach. After discussing the components of AVS in detail, a novel approach to compare the programmability level of a selected set of PDP devices, based on the programmability features of AVS components, is presented (in sec. 7). We also present 4 important use cases of AVS (in sec. 8).
We do not claim novelty for the design of the underlying components. We are influenced by several existing works in this domain [6, 26, 43, 36, 39]. We believe our contribution is providing a broader picture with an improved and structured abstract model for programmable switches.
2 Background and Related Works
The OpenFlow [42] protocol is the most successful name in the SDN paradigm. It is described over an abstract model of a switch named the 'OpenFlow Logical Switch' [25]. OpenFlow decoupled the control plane (CP) and the data plane (DP). It provides limited programmability in the data plane by controlling DP behavior from the CP through dynamically configuring a broad set of protocol header fields. Soon academia and industry realized that, with its ever-growing protocol field set, OpenFlow cannot achieve the goal of a truly programmable network. The true potential of SDN can be leveraged only if the data plane is fully programmable.
After the initial wave of OpenFlow based switches, work on programmable data plane devices gained significant momentum [4, 33]. The first consolidated effort for data plane programmability can be attributed to RMT [6], which proposed an architecture for programmable data plane devices. Based on RMT's architecture, an 'abstract forwarding model' for data plane devices was presented in [5]. In that work, the authors also presented a data plane programming language named P4 [5] for writing programs for RMT based programmable switches. Following the success of RMT and P4, several commercial programmable data plane devices have emerged in the market [49, 46]. In the initial version (version 14) of P4 (P4_14) [15], data plane programs (DPP) were developed based on the 'abstract forwarding model' of [5]. P4_14 was strictly coupled with the proposed abstract forwarding model. Although P4_14 came with an abstract model and related programming constructs to program it, its tight coupling with the proposed hardware, along with limited programmability, made it unsuitable for the design of new data plane hardware architectures. This clearly shows the need for disaggregation between hardware architecture and programming language for programmable data plane devices.
Currently RMT is the dominant programmable data plane architecture, but it has a few limitations. Each pipeline stage of the RMT architecture can only access the memory allocated to it. As a result, cross component access of memory/data is not possible [12]. Moreover, as the components of the RMT pipeline are connected linearly, a packet cannot skip unnecessary stages in the pipeline. To overcome these limitations, the dRMT architecture was proposed in [12]. It improves the RMT architecture by disaggregating memory and compute resources inside the switch. In [52], the authors presented a conceptual model of the data plane for supporting parallel processing. It is heavily based on the RMT and dRMT architectures. However, these works do not provide a concrete hardware abstraction layer.
Efforts to make server based networking environments more programmable have also seen massive growth. DPDK [22] and eBPF/XDP [31] are the two major frameworks for userspace and in-kernel packet processing, respectively. They rely on two abstraction layers: EAL for DPDK and an in-kernel virtual machine for eBPF/XDP. Though they are able to work in conjunction with smart-NICs, they are not suitable for core switches. Another important category of programming stack is the API based abstraction layer [23, 41]. Instead of providing an abstraction layer, these stacks provide an API through which the data plane can be programmed. These APIs do not provide any guideline about how to implement a feature. All conforming hardware vendors provide their own implementation of the API. As a result, I3 (sec. 1) is not solved by API based solutions. Moreover, interfacing with a data plane programming language (e.g. P4) needs more than one step. In the first step, P4 code is transformed to abstract API calls, and then the API calls are transformed to hardware specific instructions. Thus this style of abstraction increases complexity.
Among all these data plane programming technologies, P4 and its tool-sets have emerged as the most dominant programming stack. Nearly all the major programmable data plane platforms have a P4 implementation or P4 interface [11, 62, 60, 51, 14, 56]. As P4 matures, several works [32, 20, 29, 64, 40, 50, 1, 51, 30] focusing on various high level aspects of the data plane are rolling out. However, as all of these works depend on P4, they are limited by the abstract forwarding model of P4_14 or by the few hardware models [17] currently supported by P4_16.
The initial version of P4 (P4_14) was strictly coupled with the proposed 'abstract forwarding model' and the RMT architecture. Later, however, the P4 community took a clean-slate approach and followed the proven path of the virtual machine world. In the current version (version 16) of P4 (P4_16) [16], language constructs for expressing data plane device architecture and hardware specific features are decoupled from the constructs for expressing packet processing logic. Currently, in a P4_16 based data plane program, the developer needs to define a model of the hardware (in the form of an include file) and the actual packet processing logic separately. This decoupling gives 4 crucial advantages: a) a new hardware architecture can be supported at any time; b) a common hardware architecture or abstraction layer can be created by all relevant stakeholders; c) top down design of programmable data plane hardware becomes possible; d) high level language constructs for packet processing can be developed independently of hardware design. As a result, P4 can keep the path open for hardware innovation while promoting interoperability through common hardware architectures. In addition, P4_16 enables writing data plane behavior in a flexible and portable manner based on abstraction layers.
The P4 community is actively working toward developing a common hardware architecture that can cover various programmable data plane devices, ranging from core switches to smart-NICs. PSA [27] is the most mature attempt from the P4 community toward that goal. It aims to list several common packet processing paths inside programmable switches and smart-NICs. These paths are composed of multiple programmable P4 blocks. It also lists several related stateful data structures and functions for use in the data plane. PSA can be identified as the first structured effort toward a standard abstraction layer of the data plane. However, the PSA specification describes the components and their workflow in a descriptive (informal) way. As a result, there is always a chance of confusion among various implementations of PSA, and it suffers from problems I1, I2, I3 (section 1). Besides, PSA has no option for programmability of the buffer [39] and scheduler [43, 57] in the architecture. In this work, we intend to overcome these limitations by proposing a better abstract model (AVS) for programmable data plane devices by extending the PSA architecture.
Our approach is aligned with the P4 community, as the proposed abstract model (AVS) is inspired by PSA and P4. It enhances the PSA architecture in two ways: first, by adding programmability to the buffer and scheduler components; second, by providing a common functional structure for each component and expressing the workflow of each component as a hardware architecture agnostic extended finite state machine. These enhancements can help in designing various innovative applications (sec. 8, [53]). Moreover, P4 doesn't impose any restriction on the choice of hardware architecture, and it has decoupled hardware dependent compiler back ends for various architectures. Hence, the use of an AVS like abstract model doesn't restrict the use of other hardware architectures, and the scope for innovation in hardware design remains wide open.
3 Programmable Switch
A packet's (PKT) life inside a switch starts when it arrives through an incoming (ingress) port as a sequence of bits and ends when it exits through an outgoing (egress) port.
$$PKT = \{Bit_1, Bit_2, Bit_3, \ldots, Bit_{Packet\ Length}\}$$
How a data plane device acts is defined by
– S1- Interpreting Packet: how to interpret the incoming set of bits (PKT) as different meaningful fields.
– S2- Packet Metadata: a data plane device keeps some hardware dependent information about a packet (metadata) for use in different stages of the packet's life cycle. Example: the arrival time of a packet, which is necessary for packet scheduling.
– S3- Packet Processing Work-flow: the set of operations executed based on different fields of the packet and various data structures.
– S4- Control Plane Configuration Parameters: the data plane lacks global knowledge about the network. The control plane needs to configure parameters so that S3 can adapt to dynamic network conditions.
– S5- Emitting Packet: the structure of the packet emitted by the device.
In legacy switches, S1-S3 & S5 are static. Though S4 is available, it only configures parameters for a fixed S3. Hence, in legacy switches, once S1-S5 are loaded, they can't be changed. These switches are optimized for a specific protocol. In a programmable switch, however, S1-S5 are not static and can be modified during the switch's lifetime. Formally, any combination of hardware and/or software is a programmable switch or programmable data plane device if it fulfills the following properties:
– P1: Not bound to any specific protocol
– P2: Can execute any program, which
– P2-1: contains logic and interface for S1-S5
– P2-2: is not coupled to hardware architecture
– P2-3: can be loaded and unloaded at runtime
Any logical abstraction that can provide a uniform view and functionality of a 'programmable switch' can be termed an abstract model of programmable switch.
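To make the definition more concrete, the following is a minimal Python sketch of a switch that satisfies P1 and P2 in spirit. All class, method and field names (`DataPlaneProgram`, `ProgrammableSwitch`, `offset`, the one-byte toy protocol) are hypothetical illustrations and are not taken from the paper or from any real data plane API. The program object bundles S1-S5, and the switch itself contains no protocol knowledge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Fields = Dict[str, int]
Config = Dict[str, int]


@dataclass
class DataPlaneProgram:
    """Hypothetical container for S1-S5; not an interface defined by the paper."""
    parse: Callable[[bytes], Fields]             # S1: interpret bits as fields
    metadata_names: Tuple[str, ...]              # S2: metadata kept per packet
    process: Callable[[Fields, Config], Fields]  # S3: packet processing workflow
    emit: Callable[[Fields], bytes]              # S5: structure of emitted packet
    config: Config = field(default_factory=dict) # S4: control plane parameters


class ProgrammableSwitch:
    """Protocol-agnostic shell (P1) that runs whatever program is loaded (P2)."""

    def __init__(self) -> None:
        self._dpp: Optional[DataPlaneProgram] = None

    def load(self, dpp: DataPlaneProgram) -> None:    # P2-3: load at runtime
        self._dpp = dpp

    def unload(self) -> None:                         # P2-3: unload at runtime
        self._dpp = None

    def configure(self, key: str, value: int) -> None:
        if self._dpp is None:
            raise RuntimeError("no data plane program loaded")
        self._dpp.config[key] = value

    def handle(self, pkt: bytes, ingress_port: int) -> Optional[bytes]:
        if self._dpp is None:
            return None  # nothing loaded, so the packet is not processed
        fields = self._dpp.parse(pkt)
        if "ingress_port" in self._dpp.metadata_names:
            fields["ingress_port"] = ingress_port
        fields = self._dpp.process(fields, self._dpp.config)
        return self._dpp.emit(fields)


# Toy one-byte "protocol", invented purely for this example.
toy_dpp = DataPlaneProgram(
    parse=lambda pkt: {"tag": pkt[0]},
    metadata_names=("ingress_port",),
    process=lambda f, c: {**f, "tag": (f["tag"] + c.get("offset", 0)) % 256},
    emit=lambda f: bytes([f["tag"]]),
)
```

Loading `toy_dpp`, changing `offset` through `configure`, and then unloading the program exercises S4 and P2-3 without touching the switch class itself.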
4 The Abstract Model (AVS)
The processing path of a packet inside a programmable switch can be modeled as an abstract pipeline of serially connected programmable components. Our proposed pipeline based abstract model of a programmable switch ('Abstract Virtual Switch (AVS)') is shown along with the other components of the SDN stack in Fig. 2. The components (C) of AVS are a) ingress port ($Port_{In}$) b) ingress parser ($PR_{In}$) c) ingress buffer engine ($BE_{In}$) d) ingress match action unit ($MAU_{In}$) e) ingress deparser ($DPR_{In}$) f) buffer and replication engine (BRE) g) egress parser ($PR_E$) h) egress match action unit ($MAU_E$) i) egress deparser ($DPR_E$) j) scheduler (S) k) egress port ($Port_E$).
[Figure 2: layered diagram of the SDN stack. Control plane: centralized or distributed SDN controller. Data plane program (DPP): header definition ($H_{Definition}$) + packet metadata ($PKT_{Metadata}$), ingress and egress parse graphs ($G^{P}_{In}$, $G^{P}_{E}$), ingress and egress match-action graphs ($G^{MAU}_{In}$, $G^{MAU}_{E}$), ingress and egress match action tables ($MAT_{In}$, $MAT_{E}$), ingress and egress deparse graphs ($G^{D}_{In}$, $G^{D}_{E}$), buffer config. tables (BCT), buffer param. tables (BPT), manycast group table (MGT), scheduling algorithm parameter ($S_{Param}$). Abstraction layer (AVS): ingress ports 1..N, an ingress stage containing $PR_{In}$, the ingress buffer engine at position 1 ($BE^{1}_{In}$) and position 2 ($BE^{2}_{In}$), $MAU_{In}$ and $DPR_{In}$; the BRE with one buffer per port (port 1..N); an egress stage containing $PR_E$, $MAU_E$ and $DPR_E$; the scheduler (S) with its SDS and insert & remove interface; egress ports 1..N. Hardware layer: programmable data plane devices.]
Fig. 2: An Abstract Model for Programmable Switch
Symbol: Meaning
– Simple State: a state representing a stable situation of the workflow.
– Complex State: a state which refers to an entire state machine. This sub state machine is shown as a separate state machine in later parts.
– Final State: a special kind of state (not a pseudo state) representing that "a workflow has been completed".
– Transition (arrow from State 1 to State 2): labeled as transition = [ trigger ] [ guard ] [ behavior-expression ] [ output ], where
  – Trigger: an event causing a transition from one state to another.
  – Guard: a Boolean expression which enables the trigger when evaluated to TRUE.
  – Behavior-expression: an expression specifying what happens when the transition occurs.
  – Output: any kind of data structure or value that is passed to the next state as the result of the trigger action.
  If any of these 4 is not present for a transition, it is marked as "-".
Fig. 3: EFSM Notations
Based on egress port selection for a packet, the AVS pipeline is logically divided into 2 stages: the ingress stage, before the egress port is selected, and the egress stage, after the egress port is selected. After entering the egress stage, a packet's egress port cannot be changed. This is particularly important when packets are replicated (clone, broadcast, multicast etc.) in the egress stage. In such cases, packets may require further match action processing in the egress stage. For example, traffic rate control at each outgoing port requires action at the egress stage. If egress processing is not required in a hardware implementation, vendors may skip that part.
AVS may represent a single non-virtualized programmable switch, a slice of a virtualized programmable switch, or just a software switch. It is the compiler's duty to map the components of AVS to actual hardware resources. AVS provides a uniform view of the data plane over heterogeneous programmable switches. How the data plane behaves is defined by the data plane program (DPP) (Fig. 2). The management plane handles (un)loading of the DPP, while the control plane controls the runtime behavior of AVS by configuring parameters.
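The rule that a packet's egress port is selected in the ingress stage and cannot change afterwards can be illustrated with a small Python sketch. The class `PacketState` and its method names are hypothetical and exist only for this example. The sketch models nothing but the stage boundary, under the assumption that the boundary is crossed exactly once per packet.

```python
class PacketState:
    """Hypothetical per-packet record; only the egress port rule is modeled."""

    def __init__(self) -> None:
        self._egress_port = None
        self._in_egress_stage = False

    @property
    def stage(self) -> str:
        return "egress" if self._in_egress_stage else "ingress"

    @property
    def egress_port(self):
        return self._egress_port

    @egress_port.setter
    def egress_port(self, port: int) -> None:
        # The ingress stage may (re)select the port; the egress stage may not.
        if self._in_egress_stage:
            raise RuntimeError("egress port is fixed once the egress stage starts")
        self._egress_port = port

    def enter_egress_stage(self) -> None:
        # The stage boundary is defined by egress port selection.
        if self._egress_port is None:
            raise RuntimeError("an egress port must be selected in the ingress stage")
        self._in_egress_stage = True
```

A component that runs in the egress stage can still read `egress_port`, for example for per-port rate control, but any attempt to assign it raises an error.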
5 AVS Components
5.1 Generic Structure Of The Components
Each component (C) of AVS represents a programmable unit that needs to be programmed from outside before packet processing starts. These programs are supplied as the DPP. The CP, in turn, controls each component's behavior by configuring parameters through the southbound interface. The degree of programmability of AVS and its components depends on the following 2 kinds of features.
– Compile Time Programmability (CTP) Features: the set of instructions a programmable component can execute (comparable to a CPU instruction set). How C will behave at run time is defined through these features.
– Run Time Configurability (RTC) Features: the capability of adjusting the run-time behavior of CTP features by configuring parameters. The control plane uses these to manage the behavior of a component C.
Irrespective of hardware implementation, CTP and RTC features can be exposed to the upper layer as a uniform API. The compiler translates these API calls to actual hardware instructions. A DPP is a program expressed through CTP features and contains the runtime processing logic of a component C. The control plane application controls the behavior of that processing logic at run time through RTC features. The DPP also contains data structures for facilitating control plane communication with C through RTC features.
Formally, a component (C) can be represented as a component function $f_c$,
$$f_c : I \to O, \qquad f_c(X, ProcLogic, Conf_{param}) = Y \tag{1}$$
Here,
– I is the domain of $f_c$; it represents the set of all possible values that C accepts.
– O is the co-domain (range) of $f_c$; it represents the set of all possible values that C can return as output.
– ProcLogic is the processing logic to be executed by the component. It is expressed using CTP features, RTC features and control flow.
– X is the input to the component, $X \in I$.
– Y is the output of the component, $Y \in O$. If C modifies X and returns the result as a modified version of X, the notation $Y = X^*$ is used. E.g. the ingress match action unit (sec. 6.4) takes a PHV as input and returns the result as a modified PHV.
– $Conf_{param}$ is the set of parameters the CP can configure to control the behavior of the component at run time. This is a subset of CTP features.
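The component function of Eq. (1) can be read as an ordinary higher-order function. The sketch below is one possible Python rendering, assuming the input and output are key-value records; the names `component`, `forward_logic`, `port_table` and `drop_port` are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]      # stands in for an element of I or O
ConfParam = Dict[str, Any]   # parameters the control plane may change at run time
ProcLogic = Callable[[Record, ConfParam], Record]


def component(x: Record, proc_logic: ProcLogic, conf_param: ConfParam) -> Record:
    """f_c(X, ProcLogic, Conf_param) = Y, returning Y as a modified copy of X."""
    return proc_logic(dict(x), conf_param)  # copy, so the caller's X is untouched


def forward_logic(rec: Record, conf: ConfParam) -> Record:
    """Hypothetical match-action style logic: pick an egress port by destination."""
    rec["egress_port"] = conf["port_table"].get(rec["dst"], conf["drop_port"])
    return rec


# Run time configuration: the logic stays fixed, only the parameters change.
conf = {"port_table": {"aa": 1}, "drop_port": 511}
x = {"dst": "aa"}
y_before = component(x, forward_logic, conf)
conf["port_table"]["aa"] = 2          # control plane reconfigures a parameter
y_after = component(x, forward_logic, conf)
```

Here `forward_logic` plays the role of ProcLogic fixed when the program is loaded, while `conf` plays the role of $Conf_{param}$, which can be changed between packets without reloading the logic.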
Table 1: Complex states of Fig. 4 and corresponding sub-state-machines
State               Sub state machine   Section
S2, S5, S10, S11    Fig. 8              Section 6.3
S3, S6, S12         Fig. 9              Section 6.3
S4, S13             Fig. 7              Section 6.2, 6.7
S7, S14             Fig. 10             Section 6.4, 6.8
S8, S15             Fig. 11             Section 6.5, 6.9
In the next few subsections, each component of AVS is discussed in detail, including how it works and its programmability features. To express the workflow of the components, the extended finite state machine (EFSM) approach is used. In Fig. 4, an EFSM is presented to define how a packet (PKT) goes through the different components of AVS. Separate sub state machines are used when necessary. Table 1 lists the complex states of this state machine (Fig. 4) and refers to the corresponding sub-state-machine, with the section number where the component is discussed. The notations used in the EFSMs are described in Fig. 3. The parser, match action unit and deparser exist in both the 'ingress' and 'egress' stages. Their internal structure and parameter structures are the same for both stages, but parameter names are different. To differentiate between ingress and egress stage components and parameters, the subscripts In and E are used, respectively.
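As an illustration of the EFSM notation, in which every transition carries a trigger, a guard, a behavior-expression and an output, the following Python sketch encodes transitions as records and fires the first enabled one. The class names and the `fire` method are hypothetical. The two example transitions mirror the scheduler entry of Fig. 4 (validity TRUE leads to S16, validity FALSE leads to S19); taking S15 as their source state is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Context = Dict[str, Any]


@dataclass
class Transition:
    source: str
    target: str
    trigger: str
    guard: Optional[Callable[[Context], bool]] = None     # None plays the role of "-"
    behavior: Optional[Callable[[Context], None]] = None  # None plays the role of "-"
    output: Optional[Callable[[Context], Any]] = None     # None plays the role of "-"


class EFSM:
    def __init__(self, initial: str, transitions: List[Transition]) -> None:
        self.state = initial
        self.transitions = transitions

    def fire(self, trigger: str, ctx: Context) -> Any:
        """Take the first transition whose source, trigger and guard all match."""
        for t in self.transitions:
            if t.source != self.state or t.trigger != trigger:
                continue
            if t.guard is not None and not t.guard(ctx):
                continue
            if t.behavior is not None:
                t.behavior(ctx)
            self.state = t.target
            return t.output(ctx) if t.output is not None else None
        raise ValueError(f"no enabled transition for {trigger!r} in state {self.state}")


def scheduler_entry() -> EFSM:
    """Two transitions modeled on the 'Send PHV to S' step (source state assumed)."""
    return EFSM("S15", [
        Transition("S15", "S16", "Send PHV to S",
                   guard=lambda c: c["phv"]["valid"],
                   behavior=lambda c: c["sds"].append(c["phv"])),
        Transition("S15", "S19", "Send PHV to S",
                   guard=lambda c: not c["phv"]["valid"],
                   behavior=lambda c: c.update(dropped=True)),
    ])
```

Because guards are plain predicates over a context, the same trigger can lead to different target states, which is what distinguishes an EFSM from a plain finite state machine.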
5.2 Related Terminology
Before diving into the details of each component, a few relevant terminologies and notations are discussed in this section.
5.2.1 Bit Space (BS)
Upon receipt, a packet (PKT) is a sequence of ones and zeros.
$$PKT = \{Bit_1, Bit_2, Bit_3, \ldots, Bit_{Packet\ Length}\}$$
Formally, a packet (PKT) is a point in the space $BS = \{0, 1\}^{Maximum\ Packet\ Length}$.
5.2.2 Header Definition (HDefinition)
To perform meaningful operations on a PKT, it needs to be interpreted as fields of different protocols. The header definition provides the structure of these fields. The definition of the i'th header field is
$$H_f^i = (\text{unique ID/name},\ \text{starting position in packet},\ \text{length})$$
Here, $1 \le i \le p$ and p = total number of fields in $H_{Definition}$. The starting position of $H_f^i$ provides the relative order of the field in the packet. All $H_f^i$ together form the packet header,
$$PKT_{header} = \bigcup_{i=1}^{p} H_f^i$$
The rest of the packet is considered as payload.
[Figure 4: EFSM of a packet's life cycle through AVS. States: S1: PKT Received; S2: PHV Placed in $BE^{1}_{In}$; S3: PHV Removed from $BE^{1}_{In}$; S4: PHV Parsed in $PR_{In}$; S5: PHV Placed in $BE^{2}_{In}$; S6: PHV Removed from $BE^{2}_{In}$; S7: Ingress MA Completed; S8: PHV Deparsed in Ingress; S9: PHV Dropped after Ingress Stage; S10: Copy of PHV Placed on Port Buffer $B_p$; S11: PHV Placed on Port Buffer $B_p$; S12: PHV Removed from $B_p$; S13: PHV Parsed in $PR_E$; S14: Egress MA Completed; S15: PHV Deparsed in Egress; S16: PHV Inserted in SDS; S17: PHV Removed from SDS; S18: PHV Transmitted; S19: PHV Dropped after Egress Stage. Transitions are labeled in the notation of Fig. 3. Guards recoverable from the figure: when the PHV is sent to the BRE, PHV.Egress_Port == Drop Port (PHV dropped), PHV.Egress_Port == Manycast Port (a copy of the PHV is created for each port of the manycast group and placed on each port's buffer), PHV.Egress_Port == Unicast Port (PHV placed on $B_p$); when the PHV is sent to S, PHV validity == TRUE (the Insert(PHV) interface of SDS is called) or FALSE (PHV dropped).]
Fig. 4: Packet's Life Cycle EFSM
[Figure 5: byte layout of a sample packet with the fields Destination MAC, Source MAC, VLAN Tag, Total Length and Proto Type, followed by the Payload.]
Fig. 5: A sample packet format
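The header definition of Sec. 5.2.2, a list of (name, starting position, length) triples, can be applied to a received packet mechanically. The Python sketch below assumes byte-granular positions and uses only the first three fields of the sample header definition (Dest MAC, Source MAC, VLAN Tag); the identifiers `HeaderField`, `H_DEFINITION` and `split_packet` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class HeaderField(NamedTuple):
    name: str     # unique ID/name
    start: int    # starting position in packet (bytes, an assumption of this sketch)
    length: int   # length (bytes, an assumption of this sketch)


# Three fields taken from the sample header definition; the rest are omitted.
H_DEFINITION: List[HeaderField] = [
    HeaderField("dest_mac", 0, 6),
    HeaderField("source_mac", 6, 6),
    HeaderField("vlan_tag", 12, 2),
]


def split_packet(pkt: bytes, h_def: List[HeaderField]) -> Tuple[Dict[str, bytes], bytes]:
    """Return (PKT_header as name -> raw bytes, payload)."""
    ordered = sorted(h_def, key=lambda f: f.start)   # start gives the relative order
    header = {f.name: pkt[f.start:f.start + f.length] for f in ordered}
    header_end = max(f.start + f.length for f in ordered)
    return header, pkt[header_end:]                  # the rest is payload


sample = bytes(range(20))
header, payload = split_packet(sample, H_DEFINITION)
```

Everything after the last defined field is treated as payload, matching the statement that the rest of the packet is considered as payload.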
5.2.3 Packet Metadata (PKTMetadata)
Depending on the actual hardware implementation, switches maintain some metadata ($PKT_{Metadata}$) about a packet (ingress or egress port, time of arrival, unicast or multicast type etc.). The i'th metadata field is
$$M_f^i = (\text{unique ID/name},\ \text{data type/length})$$
Here, $1 \le i \le q$ and q = total number of fields in $PKT_{Metadata}$. All $M_f^i$ together form the packet metadata,
$$PKT_{Metadata} = \bigcup_{i=1}^{q} M_f^i$$
A sample packet format is shown in Fig. 5; the corresponding header definition and packet metadata are shown in Fig. 6 (a).
5.2.4 Packet Header Vector (PHV) and Space (PHVS)
The header definition and metadata define a p + q dimensional space named the Packet Header Vector Space (PHVS). Each point in this space is termed a packet header vector (PHV). A PHV can be considered a container for all the attributes of a packet in key-value format, where a key represents a header field ($H_f$), a metadata field ($M_f$) or any other field derived in the pipeline, and a value represents the corresponding data.
$$PHV = PKT_{header} \cup PKT_{Metadata}$$
(a) Header Definition
< Dest MAC, 0, 6 Bytes >
< Source MAC, 6, 6 Bytes >
< VLAN Tag, 12, 2 Bytes >
< Total Length, 14, 3 >
< Proto Type (IPv4/IPv6), 13, 1 >
< Payload ... >
Packet Metadata
< Ingress Port, 2 Bytes >
< Egress Port, 2 Bytes >
< Arrival Time, 8 Bytes >
< Unicast/Manycast Type, 1 Byte >
< Manycast Group ID, 1 Byte >
< Scheduling Order, 2 Bytes >
(b) Parse graph with nodes Dest MAC, Source MAC, VLAN Tag, Total Length and Proto Type, and with IPv4 and IPv6 as the next nodes of Proto Type.
Proto Type node (expanded): Starting Position: 15, Length: 8 bit, Parse Table:
Value   Next Node
4       IPv4
6       IPv6
Fig. 6: (a) Header definition and metadata for the sample packet of Fig. 5 (b) Parse graph for the sample packet of Fig. 5 (only the "Proto Type" node is shown in expanded format)
Each PHV represents a point, and a flow represents a region, in PHVS. At a given point in AVS, not all the fields in the definition may be present in a PHV. They may be filled in by different components of the pipeline at different stages of the packet's life cycle.
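A PHV as described in Sec. 5.2.4 behaves like a key-value record whose keys are fixed by the header definition and metadata, while its values arrive over time. The short Python sketch below assumes a small, hypothetical subset of field names and uses `None` to mark a field that no component has filled yet; `new_phv` and `missing_fields` are illustrative helpers, not part of AVS.

```python
from typing import Any, Dict, List, Optional

# Hypothetical subset of header and metadata field names.
HEADER_FIELDS = ("dest_mac", "source_mac", "vlan_tag")
METADATA_FIELDS = ("ingress_port", "egress_port", "arrival_time")

PHV = Dict[str, Optional[Any]]


def new_phv() -> PHV:
    """Create a PHV with one key per header/metadata field, all still unset."""
    return {name: None for name in HEADER_FIELDS + METADATA_FIELDS}


def missing_fields(phv: PHV) -> List[str]:
    """Fields that no pipeline component has filled in yet."""
    return [name for name, value in phv.items() if value is None]


phv = new_phv()
phv["ingress_port"] = 1          # e.g. set when the packet is received
phv["arrival_time"] = 1000
phv["dest_mac"] = bytes(6)       # e.g. set later by a parser
```

At this point `missing_fields(phv)` still reports `source_mac`, `vlan_tag` and `egress_port`, reflecting that a PHV may be only partially populated at a given position in the pipeline.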
5.2.5 Ordered PHV Set (PHV set, <)
Let $PHV_{set}$ be a set of PHVs. An ‘Ordered PHV Set’ is an ordered pair $(PHV_{set}, \le)$ of the set $PHV_{set}$ and a binary relation $\le$ contained in $PHV_{set} \times PHV_{set}$, such that
– Reflexive: $\forall PHV \in PHV_{set} : PHV \le PHV$
– Transitive: $\forall PHV_i, PHV_j, PHV_k \in PHV_{set} : [(PHV_i \le PHV_j) \land (PHV_j \le PHV_k)] \Rightarrow (PHV_i \le PHV_k)$
– Anti-symmetry: $\forall PHV_i, PHV_j \in PHV_{set} : [(PHV_i \le PHV_j) \land (PHV_j \le PHV_i)] \Rightarrow (PHV_i = PHV_j)$
Here the relation is defined on one or more fields of the PHV such that, for a field $x \in PHV$ and 2 elements $PHV_m, PHV_n \in PHV_{set}$, $(PHV_m.x.value \le PHV_n.x.value) \Rightarrow (PHV_m \le PHV_n)$. If $PHV_m.x.value = PHV_n.x.value$, another field $y \in PHV$ can be used to break the tie. As packets received from each port have distinct arrival times, and each port of a switch has a distinct number, a total order on a set of PHVs is always possible.
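As an illustration of the tie-breaking rule above, the following minimal Python sketch orders a small set of PHVs lexicographically on a tuple of fields. The dictionary representation of a PHV and the field names arrival_time and ingress_port are assumptions made for this example only; they are not part of the model.

```python
def order_key(phv, fields):
    """Build a comparison key from the chosen PHV fields, in priority order."""
    return tuple(phv[name] for name in fields)


def sort_phv_set(phv_set, fields=("arrival_time", "ingress_port")):
    """Return the PHVs sorted by the first field, breaking ties with the next."""
    return sorted(phv_set, key=lambda phv: order_key(phv, fields))


# Hypothetical PHVs: two of them share the same arrival time.
phvs = [
    {"arrival_time": 20, "ingress_port": 2},
    {"arrival_time": 10, "ingress_port": 3},
    {"arrival_time": 10, "ingress_port": 1},
]
ordered = sort_phv_set(phvs)
```

Because no two PHVs in this example agree on both fields, the resulting order is total; with fewer distinguishing fields the sort would only reflect a partial order.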
[Figure 7: EFSM of the programmable parser, with node $N_i = (H^i_f, L(H^i_f), T(H^i_f))$. States: Parsing Started, $N_i$ Loaded, $H^i_f$ Extracted, Next Node $N_{i+1}$ Selected, $H^i_f$ Extraction Failed, Parse Completed. A field is extracted and stored in the PHV when Current_Position + $L(H^i_f)$ $\le$ total length of PHV.data_buffer; otherwise extraction fails and the PHV is marked invalid. The next node is the child of $N_i$ in $T(H^i_f)$ matching $H^i_f$; when $N_{i+1}$ == Null the PHV is marked valid and parsing completes.]
Fig. 7: Programmable Parser EFSM
The header definition and packet metadata ($PKT_{Metadata}$) together form the PHVS. Metadata is fixed for an architecture, so the PHVS depends mainly on the header definition. The domain and range of nearly all AVS components is the PHVS; hence it is a crucial part of a DPP.
6 Analysis of Components
6.1 Ingress Port (PortIn)
In AVS, the sole task of an ingress port ($Port_{In}$) is to receive a set of bits (PKT) and store it as a PHV. How a hardware-level frame is received and the format of that frame are out of the scope of our discussion. After reception, a new PHV is created and PKT is stored in the data_buffer variable of the PHV; data_buffer is a storage for an array of bits. In this phase, the necessary metadata (ingress port, arrival time etc.) are also stored in the PHV. The PHV is then passed to the next component in the pipeline. In AVS, ingress ports do not have any kind of programmability in terms of CTP and RTC features.
$f_{Port_{In}} : BS \to PHVS$
$f_{Port_{In}}(PKT, Null, Null) = PHV$
6.2 Ingress Parser (PRIn)
$PR_{In}$ parses the array of bits stored in PHV.data_buffer into different header fields. The parsing logic is provided through a parse graph ($G^p_{In}$). Each node $N_i \in G^p_{In}$ contains: a) header field information: the field $H^i_f \in PHV$ to be parsed, its starting position in PKT and its length $L(H^i_f)$, and b) a parse table [26] $T(H^i_f)$, which lists the possible values of $H^i_f$ and the corresponding next node. The work-flow of a programmable parser is shown in Fig. 7 as an EFSM. After parsing, the PHV is passed to the next component in the pipeline.
$f_{PR_{In}} : BS \to PHVS$
$f_{PR_{In}}(PHV.data\_buffer,\ G^p_{In},\ Null) = PHV^*$
CTP features: a) maximum length limit: the maximum length of PKT that can be parsed by $PR_{In}$; this limit is important for bounded runtime. b) supported data types: the data types (including the standard primitive data types: int, float, char etc.) that the parser supports. c) granularity of field parsing: whether the parser circuit can parse an individual bit or a variable number of bits into a field.
RTC features: a) modifiability of $G^p_{In}$: whether the parse graph ($G^p_{In}$) and the contents of its nodes are modifiable at runtime by the CP.
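The parse-graph walk described in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the behaviour of any particular parser: the PHV is a plain dict, offsets and lengths are in bytes, the ParseNode class, the default_next attribute and the GRAPH layout are all hypothetical, and a value missing from the parse table simply ends parsing.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParseNode:
    """One node N_i of the parse graph (names and units are illustrative)."""
    field_name: str                     # header field H_f^i to extract
    start: int                          # starting position in PKT, in bytes
    length: int                         # L(H_f^i), in bytes
    parse_table: Dict[int, str] = field(default_factory=dict)  # T(H_f^i)
    default_next: Optional[str] = None  # next node when no table entry matches


def parse(data_buffer: bytes, graph: Dict[str, ParseNode], root: str) -> dict:
    """Walk the parse graph and fill a PHV (a plain dict in this sketch)."""
    phv = {"data_buffer": data_buffer, "valid": False}
    node_name: Optional[str] = root
    while node_name is not None:
        node = graph[node_name]
        end = node.start + node.length
        if end > len(data_buffer):
            return phv                  # extraction failed: PHV stays invalid
        value = int.from_bytes(data_buffer[node.start:end], "big")
        phv[node.field_name] = value    # store the extracted value in the PHV
        node_name = node.parse_table.get(value, node.default_next)
    phv["valid"] = True                 # no next node: parsing completed
    return phv


# Hypothetical layout: the offsets below are chosen for the example only.
GRAPH = {
    "dest_mac": ParseNode("dest_mac", 0, 6, default_next="src_mac"),
    "src_mac": ParseNode("src_mac", 6, 6, default_next="proto_type"),
    "proto_type": ParseNode("proto_type", 12, 1, {4: "ipv4", 6: "ipv6"}),
    "ipv4": ParseNode("ipv4_first_byte", 13, 1),
    "ipv6": ParseNode("ipv6_first_byte", 13, 1),
}

packet = bytes(6) + bytes([1] * 6) + bytes([4, 0x45])
phv = parse(packet, GRAPH, "dest_mac")
truncated = parse(bytes(10), GRAPH, "dest_mac")
```

Keeping the graph as data rather than code mirrors the RTC feature above: a control plane could, in principle, swap entries of GRAPH at runtime without touching parse().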
6.3 Ingress Buffer Engine (BEIn)
The general role of a buffer is to temporarily hold packets. Although a programmable buffer can be expressed under match-action semantics, its significance in various important applications [39, 37] warrants a separate component. The ingress buffer engine ($BE_{In}$) consists of a set of buffers ($B_{set}$). Assuming n individual buffers ($B_i$) in the engine, $B_{set} = \bigcup_{i=1}^{n} B_i$. Their sizes may be fixed or programmable. The CP can control these sizes by configuring the ‘Buffer Parameter Table (BPT)’ (Table 3). There are 2 possible positions of $BE_{In}$ in AVS; either one or both can be used.
The first one ($BE^1_{In}$) is just after $Port_{In}$. $BE^1_{In}$ holds PHVs received from ports. Through the ‘Buffer Configuration Table (BCT)’, the CP controls which buffer stores the PHVs from each ingress port.
The second one ($BE^2_{In}$) is after the ingress parser. This is a more general and useful implementation, which can store a PHV in a buffer based on PHV fields. Here, instead of only the ingress port, the CP configures through the ‘Buffer Configuration Table (BCT)’ (Table 2) which header/metadata field values send a PHV to which buffer. A priority for matching PHVs can also be assigned through the BCT.
Table 2: Buffer Configuration Table (BCT) for sample packet of Fig. 5
PHV field Name | PHV field Value | Buffer ID | Priority
VLAN Tag | 0x 00 15 25 | 3 | 1
VLAN Tag | 00 45 25 | 6 | 0
· · · | · · · | · · · | · · ·
Table 3: Buffer Parameter Table (BPT) for sample packet of Fig. 5
Buffer ID | Size | RX | TX
1 | 2048 | True | False
5 | 3072 | False | True
· · · | · · · | · · · | · · ·
The buffer engine’s work-flow can be expressed as 2 threads: a) Receiver Thread (Fig. 8), which inserts PHVs into buffers, and b) Sender Thread (Fig. 9), which removes PHVs from buffers and sends them to the next component in the pipeline. The behavior of these 2 threads is controlled by 2 separate state variables for each buffer ($B_i \in B_{set}$), which are configurable from the control plane via the ‘Buffer Parameter Table (BPT)’: a) RX mode: when RX of a $B_i$ is true, the receiver thread receives PHVs (from a port or the ingress parser) and stores them; when it is false, the PHVs are dropped. b) TX mode: when TX of $B_i$ is false, the buffer holds its flow (pause state); when TX is true, it releases packets/PHVs to the next component in the pipeline (resume state).
The receiver thread of the buffer engine can be expressed as
$f_{BE_{receiver}} : PHVS \to B_{set}$
$f_{BE_{receiver}}(PHV, Null, \{BPT, BCT\}) = B_{set}$ (with the PHV inserted in $B_{set}$)
and the sender thread as (assuming a round-robin order for selecting the buffer from which the next PHV is peeked)
$f_{BE_{sender}} : B_{set} \to PHVS$
$f_{BE_{sender}}(Null, Null, Null) = PHV$
CTP features: a) buffer size & number controlling: whether the number of buffers and their sizes are programmable. b) BCT creation: whether the BCT is fixed or can be declared (size and definition) at compile time. c) global access of buffer property: whether other components of AVS can access buffer properties.
RTC features: a) BPT modifiability: whether the CP can manipulate buffer sizes through the BPT. b) BCT modifiability: whether the CP can add, modify or delete entries in the BCT.
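The receiver and sender threads can be illustrated with a small single-threaded Python sketch. Everything concrete here is an assumption for illustration: the BufferEngine class, the dict/tuple encodings of the BPT and BCT, measuring buffer size in number of PHVs, and the choice to drop the arriving PHV when a buffer is full (only one of the two policies shown in Fig. 8). The BCT priority column is carried but not used.

```python
from collections import deque


class BufferEngine:
    """Sketch of an ingress buffer engine driven by a BPT and a BCT."""

    def __init__(self, bpt, bct):
        # bpt: buffer_id -> {"size": int, "rx": bool, "tx": bool}
        # bct: list of (field_name, field_value, buffer_id, priority)
        self.bpt = bpt
        self.bct = bct
        self.buffers = {bid: deque() for bid in bpt}
        self._order = list(bpt)   # round-robin order over buffer ids
        self._next = 0

    def receive(self, phv):
        """Receiver thread step: return True if the PHV was stored."""
        if not phv.get("valid", False):
            return False                      # invalid PHV is dropped
        for name, value, bid, _priority in self.bct:
            if phv.get(name) == value:
                break                         # first matching BCT entry wins
        else:
            return False                      # no BCT entry matched
        params = self.bpt.get(bid)
        if params is None or not params["rx"]:
            return False                      # unknown buffer or RX disabled
        if len(self.buffers[bid]) >= params["size"]:
            return False                      # buffer full: drop current PHV
        self.buffers[bid].append(phv)
        return True

    def send(self):
        """Sender thread step: oldest PHV of the next resumed buffer, or None."""
        for _ in range(len(self._order)):
            bid = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self.bpt[bid]["tx"] and self.buffers[bid]:
                return self.buffers[bid].popleft()
        return None


bpt = {3: {"size": 1, "rx": True, "tx": False},
       6: {"size": 2, "rx": True, "tx": True}}
bct = [("vlan_tag", 0x15, 3, 1), ("vlan_tag", 0x45, 6, 0)]
engine = BufferEngine(bpt, bct)
a = {"valid": True, "vlan_tag": 0x15, "id": "a"}
b = {"valid": True, "vlan_tag": 0x15, "id": "b"}
c = {"valid": True, "vlan_tag": 0x45, "id": "c"}
results = [engine.receive(a), engine.receive(b), engine.receive(c)]
first = engine.send()          # buffer 3 is paused, so buffer 6 is served
bpt[3]["tx"] = True            # control plane resumes buffer 3
second = engine.send()
third = engine.send()
```

Flipping the TX flag in the BPT is what pauses or resumes a buffer here, which is the mechanism an application such as pausing packets at an intermediate switch would rely on.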
[Figure 8: EFSM of the buffer receiver thread. States: PHV Arrived to $BE_{In}$, PHV Validated, $B_i$ Selected, $B_i$ Ready, Space Available in $B_i$, $B_i$ is Full, PHV Placed in $B_i$, PHV Dropped Before Buffered. A PHV is dropped when it is invalid, when the matched buffer is Null or has RX_mode == FALSE, or when $B_i$ is full and dropping an old/low-priority PHV is not allowed. Otherwise it is stored, after an old/low-priority PHV is dropped if $B_i$ is full and that drop is allowed.]
Fig. 8: Buffer Receiver Thread EFSM
[Figure 9: EFSM of the buffer sender thread. States: $BE_{In}$ Ready to Peek PHV from $B_i$, $B_i$ Resumed, $B_i$ Paused, PHV Removed from $B_i$. $B_i$ is selected from $B_{set}$ in round-robin order. When its TX mode is TRUE, the oldest PHV is removed from $B_i$ and output; when it is FALSE, the buffer stays paused until the CP or the DPP changes its TX mode.]
Fig. 9: Buffer Sender Thread EFSM
6.4 Ingress Match Action Unit (MAU In)
This component can be considered the computational unit of a data plane device: what a CPU is to a server, the match-action unit is to a programmable data plane device. Increasingly, more complex and general processing units (FPGA, CPU, GPU) are being proposed as the main computational unit for PDP devices, but match-action based hardware still dominates. In this work, we express generic packet header based computations in match-action semantics. Hardware implementations may use a CPU, GPU or RMT circuit to implement the actual logic, but the concept is general for the component.
Table 4: A sample MAT for ‘Proto Type’ field in sample packet of Figure 5
Match Type | Values to be Matched | Action[s]
Exact | 4 | Increase IPv4 counter
Exact | 6 | Drop Packet
· · · | · · · | · · ·
In this component, the values of PHV fields, or any other data derived from them, are matched with either a) control plane configured data or b) data collected by the data plane itself. Based on the matching result, different actions are executed. Control plane configured data are kept in a table-like data structure. The control plane can store, modify or delete data in these tables at runtime through the southbound API. Data collected (or derived) by the data plane can be of 3 types: a) stateful information about a flow (meter, register, counter etc.), b) stateless metadata about a packet and c) any constant value supplied at compile time in the DPP.
The programmability and performance of a switch mostly depend on the set of actions it allows the programmer to use. Actions can be of different types: stateless actions only access and modify fields of the current PHV, while stateful actions access and update previously stored data about a flow and use them to update the PHV.
Different data plane programming languages may represent match-action semantics in different syntax. But fundamentally, a match-action block requires 4 pieces of information: a) the name/id of the PHV field to be matched/compared, b) the control or data plane supplied value against which the PHV field value is matched or compared, c) the matching/comparison method (exact, ternary, <, >, != etc.) and d) one or more actions (an action block) to execute based on the comparison result.
Without loss of generality, we assume here that the computational logic for packet processing can be represented as a graph. In the current literature, the generic data structure for storing processing logic information is termed the Match-Action Table (MAT) [6]. The processing logic of a match-action unit can be represented as a match-action graph ($G^{MAU}_{In}$), where each node ($N_i$) represents the match-action logic for a specific field in the PHV and the edges represent control flow. Let $p\_f \in PHV$ be a protocol field, $p\_f.value$ its value in the PHV, and $MAT_{p\_f}$ the data structure storing its match-action information.
$MAT_{p\_f} = \{\text{match type} \times \text{values to be matched with } p\_f \times \text{action[s]}\}$. The corresponding $N_i$ is a tuple $(p\_f, MAT_{p\_f})$. The result of match-action processing for each field can be stored in stateful memory in the case of stateful operations, or it can simply update some fields in the PHV (e.g. updating the destination port according to the routing table). An example MAT for the ‘Proto Type’ field of the sample packet (Figure 5) is presented in Table 4.
Let $MAT\_SET_{In}$ be the set of MATs for all the nodes ($N_i$) in $G^{MAU}_{In}$. Formally,
$f_{MAU_{In}} : PHVS \to PHVS$
$f_{MAU_{In}}(PHV,\ G^{MAU}_{In},\ MAT\_SET_{In}) = PHV^*$
The work-flow of the ingress match action unit ($MAU_{In}$) is presented in Fig. 10.
CTP features: a) data type support in MAT: the data types (int, float, bit pattern, string etc.) allowed for lookup in a MAT. b) matching type support in MAT: the types of lookup (longest prefix match, ternary, exact, range etc.) that are supported. c) MAT size & number controlling: whether MATs have a fixed size or their size can be declared at compile time. d) availability of stateful action: whether it is possible to maintain state information about flows. e) availability of stateful data structure: which data structures are available for keeping state information about flows (register, counter etc.), and whether custom data structures can be created. f) state sharing among different components: whether flow states are shareable among different components of AVS. g) MAT modifiability from DPP: whether the MATs are modifiable from the data plane program; this is necessary for advanced and faster decision making in the data plane [61]. h) available action types: the types of actions that can be performed on the fields of a PHV.
RTC features: a) MAT controlling: whether the CP can add/remove/update match-action table entries. b) stateful data access from CP: whether the CP can read/write the stateful data of a flow from/to the DP.
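The match-action semantics of this section, using the two entries of Table 4, can be sketched as follows. This is a minimal Python sketch under assumptions: the match-action graph is reduced to a linear chain of (field, MAT) nodes, only exact and ternary matching are shown, the PHV and the stateful memory are plain dicts, and the function and key names (run_mau, ipv4_counter, drop) are hypothetical.

```python
def matches(match_type, pattern, value):
    """Compare a PHV field value against one MAT entry."""
    if match_type == "exact":
        return value == pattern
    if match_type == "ternary":
        want, mask = pattern
        return (value & mask) == (want & mask)
    raise ValueError(f"unsupported match type: {match_type}")


def increase_ipv4_counter(phv, state):
    """Stateful action: updates data kept across packets."""
    state["ipv4_counter"] = state.get("ipv4_counter", 0) + 1


def drop_packet(phv, state):
    """Stateless action: only touches the current PHV."""
    phv["drop"] = True


def run_mau(phv, graph, state):
    """Visit each node (field_name, MAT) and run the first matching entry."""
    for field_name, mat in graph:
        if field_name not in phv:
            continue                      # field not present in this PHV
        for match_type, pattern, actions in mat:
            if matches(match_type, pattern, phv[field_name]):
                for action in actions:
                    action(phv, state)
                break
    return phv


# MAT for 'Proto Type', mirroring Table 4.
MAT_PROTO_TYPE = [
    ("exact", 4, [increase_ipv4_counter]),
    ("exact", 6, [drop_packet]),
]
GRAPH_MAU = [("proto_type", MAT_PROTO_TYPE)]

state = {}
v4 = run_mau({"proto_type": 4}, GRAPH_MAU, state)
v6 = run_mau({"proto_type": 6}, GRAPH_MAU, state)
```

Since MAT_PROTO_TYPE is ordinary data, adding or removing an entry at runtime corresponds to the "MAT controlling" RTC feature, while the state dict stands in for registers or counters.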
6.5 Ingress Deparser(DPRIn)
This component serializes a PHV back into packet bits. Not all the fields of a PHV are necessarily emitted to the next component: some may be dropped and some may be included more than once. Moreover, not all the header fields declared in the PHV are valid for every packet. The deparser definition specifies which of the valid header fields are included in the outgoing packet. Besides PHV fields, constant data can also be included in the packet. The deparser definition is transformed into a graph ($G^d_{In}$) at the compilation stage. In this graph each node represents a field of the PHV or arbitrary data to be emitted. $DPR_{In}$ checks the validity of each node and concatenates the field data into data_buffer (a per-packet buffer for storing bits). This also inherently defines each field’s relative position in the packet to be emitted. After all the nodes of $G^d_{In}$ are deparsed, the payload is concatenated to data_buffer. The work-flow of an ingress deparser unit is represented as an EFSM in Figure 11.
[Figure 10: EFSM of the ingress match-action unit. States: $N_i$ Loaded, action_block Loaded, $ACT_j$ Processing Started, Flow is Stateless, f’s State Identified, $ACT_j$ Processing Completed, $N_i$ Processing Completed, MA Processing Completed. For each node, PHV.p_f.value is searched in $N_i.MAT_{p\_f}$ and the matching action block is loaded; each action $ACT_j$ is executed as stateless processing (modifying PHV fields) or stateful processing (storing new states and/or modifying old states); processing ends when $N_{i+1}$ == Null.]
Fig. 10: Ingress Match Action EFSM
$f_{DPR_{In}} : PHVS \to PHVS$
$f_{DPR_{In}}(PHV,\ G^d_{In},\ Null) = PHV^*$
CTP features: a) maximum length limit: the maximum length of a packet that can be created by the deparser unit. b) data type support: the data types that can be emitted by the deparser.
RTC features: a) $G^d_{In}$ modifiability: whether the CP can modify $G^d_{In}$ at runtime.
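A deparser in this sense can be sketched as a walk over an ordered list of nodes. This minimal Python sketch makes several assumptions: the deparse graph is flattened to a list, a field counts as valid when it is present in the PHV with a value other than None, lengths are in bytes, and the node encodings ("field", name, length) and ("const", data) are hypothetical.

```python
def deparse(phv, nodes, payload=b""):
    """Concatenate valid fields and constants, then append the payload."""
    data_buffer = bytearray()
    for node in nodes:
        if node[0] == "const":
            data_buffer += node[1]            # arbitrary constant data
            continue
        _kind, name, length = node
        value = phv.get(name)
        if value is None:
            continue                          # invalid field: skip this node
        data_buffer += value.to_bytes(length, "big")
    data_buffer += payload                    # payload follows the headers
    return bytes(data_buffer)


# Hypothetical deparser definition: VLAN tag is declared but not valid here.
NODES = [
    ("field", "dest_mac", 6),
    ("field", "vlan_tag", 2),
    ("const", b"\xaa"),
    ("field", "src_mac", 6),
]
phv_out = {"dest_mac": 1, "src_mac": 2, "vlan_tag": None}
wire = deparse(phv_out, NODES, payload=b"\x01\x02")
```

The order of NODES alone fixes the relative position of each field in the emitted bytes, which is the property the text attributes to the deparse graph.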
[Figure 11: EFSM of the ingress deparser. States: Deparsing Started, $N_i$ Loaded, $N_i$ Deparsed, $N_i$ Skipped, $N_{i+1}$ Selected, Header Deparsed. When $N_i$ is valid, its value is concatenated to PHV.data_buffer; otherwise $N_i$ is skipped. The next node of $G^d_{In}$ is then selected, and when $N_{i+1}$ is the end node of $G^d_{In}$ the PHV is marked valid and the header is deparsed.]
Fig. 11: Ingress Deparser EFSM
6.6 Buffer and Replication Engine (BRE)
BRE is the bridge between the ingress and egress stages. It contains a buffer for each egress port. These buffers can be simple FIFO buffers or fully programmable buffers (sec. 6.3). However, the header/metadata based buffer assignment of a programmable buffer ($BE^2_{In}$, section 6.3) is not needed here, because we assume BRE buffers are reserved as a per-port resource. Moreover, adding an extra programmable circuit increases the packet processing delay.
$BRE.B_{set} = \bigcup_{i=1}^{n} B_i$, where n = total number of egress ports
For a PHV, egress port selection is done in the ingress match action unit. After the ingress deparser, the PHV may be destined to either a single egress port (unicast) or more than one egress port (manycast: multicast, broadcast).
– For a unicast port, the PHV is simply moved to the egress port’s buffer with all of its metadata.
– For manycast ports, the control plane needs to configure the group membership of the ports through the ‘Manycast Group Table (MGT)’. BRE makes the necessary number of copies of the PHV, one for each member of the group. Those PHVs are placed in the buffers of the relevant egress ports.
Insertion of a PHV into a port buffer ($B_p$) is handled by the receiver thread of the buffer. The sender thread of the buffer removes a PHV from the port buffer ($B_p$) and either drops it or passes it to the egress stage. The work-flow of these 2 threads is the same as that of the ingress buffer engine discussed in section 6.3.
Formally,
$f_{BRE} : PHVS \to PHVS^n$
$f_{BRE}(PHV, Null, MGT) = n$ copies of PHV
Here, n = total number of egress ports in a manycast group. For unicast packets n = 1.
CTP features: a) all of the CTP features of $BE_{In}$ (except BCT creation).
RTC features: a) all of the RTC features of $BE_{In}$ (except BCT configuration). b) manycast group membership table configuration: whether the CP can modify manycast groups and their members at runtime.
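The replication step can be sketched in Python as a lookup in the MGT followed by one copy per member port. The representation is assumed for illustration only: the MGT is a dict from group id to a list of egress ports, and the metadata keys manycast_group and egress_port are hypothetical names.

```python
import copy


def replicate(phv, mgt):
    """Return a list of (egress_port, phv) pairs, one per destination port."""
    group = phv.get("manycast_group")
    if group is None:
        return [(phv["egress_port"], phv)]    # unicast: n = 1, no copy needed
    copies = []
    for port in mgt.get(group, []):
        clone = copy.deepcopy(phv)            # each member gets its own PHV
        clone["egress_port"] = port
        copies.append((port, clone))
    return copies


MGT = {7: [1, 2, 5]}                          # group 7 has three member ports
unicast = replicate({"egress_port": 4, "manycast_group": None}, MGT)
manycast = replicate({"egress_port": None, "manycast_group": 7}, MGT)
```

Each returned pair would then be handed to the receiver thread of the corresponding port buffer.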
6.7 Egress Parser (PRE)
After PHVs are removed from BRE, they are sent to the egress parser. Structurally it is the same as the ingress parser. The only difference is that, after egress parsing of the packet header finishes, PHVs are sent to the egress match-action unit instead of a buffer.
$f_{PR_E} : BS \to PHVS$
$f_{PR_E}(PHV.data\_buffer,\ G^p_E,\ Null) = PHV^*$
6.8 Egress Match Action Unit (MAUE)
This is the same as the ingress match action unit. The only restriction is that the egress port cannot be changed in this stage. Formally,
$f_{MAU_E} : PHVS \to PHVS$
$f_{MAU_E}(PHV,\ G^{MAU}_E,\ MAT\_SET_E) = PHV^*$
6.9 Egress Deparser (DPRE)
This is the same as the ingress deparser. The only difference is that, after egress deparsing, a PHV is sent to the scheduler instead of the buffer and replication engine.
$f_{DPR_E} : PHVS \to PHVS$
$f_{DPR_E}(PHV,\ G^d_E,\ Null) = PHV^*$
6.10 Scheduler(S)
After egress processing is complete, a PHV is passed to the Scheduler (S) for transmission to the next hop. Scheduling defines the order and time [57] of a PHV’s transmission. To achieve these goals, PHVs need to be stored in a specified order (according to the scheduling algorithm) inside an appropriate scheduler data structure (SDS) (e.g. a queue, a collection of queues, a tree etc.). Different hardware implementations can use different data structures. From an abstract point of view, SDS maintains an ordered PHV set (Section 5.2.5). From SDS, PHVs are selected according to the scheduling algorithm for transmission at the appropriate time. As there is no universal scheduling algorithm [58] to match all kinds of
application goals, there is no common implementation of the scheduler unit either.
To create a common and general abstract model, 2 abstract interfaces are assumed in conjunction with SDS. Different scheduling algorithms implement these 2 interfaces differently, based on the data structures provided by the hardware. The CP can adjust the behavior of S by configuring scheduling algorithm parameters ($S_{param}$), for example a table of weights for the weighted fair queuing algorithm. The 2 abstract interfaces are the following:
– Insert(PHV): An implementation of this interface calculates the relative order of a PHV inside SDS and places it in the corresponding position. Let SchedulingOrder be the metadata field that contains the PHV’s order in SDS. Now assume $PHV\_SET'$ is a set of PHVs stored in SDS and ordered by SchedulingOrder (in reality this order may depend on more than one field).
$PHV\_SET' = \{PHV_1, PHV_2, PHV_3, \ldots, PHV_n\}$
Finding the scheduling order of a new PHV ($PHV_{new}$) requires computing $PHV_{new}.SchedulingOrder$ from the following types of data:
  – values in $PHV_{new}$’s header fields ($H_f$) and/or metadata ($M_f$)
  – stateful information about the flow, computed and stored by the switch
  – CP configured parameters, such as the weights for weighted fair queuing scheduling
The insert interface provides a total order (Section 5.2.5) on $(PHV\_SET, \le)$ based on SchedulingOrder, where
$PHV\_SET = PHV\_SET' \cup \{PHV_{new}\}$
$(PHV\_SET, \le) = \{PHV_1, PHV_2, \ldots, PHV_p, PHV_{new}, PHV_q, \ldots, PHV_n\}$
Therefore, the insert interface can be functionally represented as
$f_{S_{insert}} : PHVS \to$ Ordered PHV Set $(SDS, \le)$
$f_{S_{insert}}(PHV,\ \text{Insert Interface Implementation},\ S_{param}) = PHV\_SET$
$PHV_{new}.SchedulingOrder$ can be computed either in the match-action stage, with the result carried to the scheduler unit through the PHV, or in the scheduler unit alone. After that, the scheduler inserts the packet at the appropriate location in the data structure. In case of a storage shortage, either a low priority packet or the packet under consideration itself can be dropped, depending on the scheduling algorithm.
– Remove(): An implementation of this interface picks the next PHV to be transmitted from SDS and decides when the packet will actually be transmitted through $Port_E$.
$f_{S_{remove}} : PHV\_SET \to PHVS$
$f_{S_{remove}}(Null,\ \text{Remove Interface Implementation},\ S_{param}) = PHV$
CTP features: a) custom insert and remove interface: whether it is possible to provide implementations of these 2 interfaces for a custom scheduling algorithm. b) custom SDS creation: whether a custom SDS can be created for a complex scheduling algorithm. c) cross component access: whether other components in the pipeline can access SDS (e.g. clear a queue based on a certain event detected at the match action unit) or its properties (occupancy rate, priority etc.).
RTC features: a) insert and remove interface parameter modifiability: whether the control plane can configure parameters that control the behavior of the insert and remove functions. b) SDS property access: whether the control plane can access SDS properties. c) $S_{param}$ controlling: whether the CP can control scheduling algorithm parameters at runtime.
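The two scheduler interfaces can be sketched in Python with a binary heap as the SDS. This is a sketch under stated assumptions rather than a full weighted fair queuing implementation: SchedulingOrder is computed as a simplified per-flow finish value (previous finish value of the flow plus packet length divided by the flow's weight), the weights dict plays the role of $S_{param}$, a full SDS drops the arriving PHV, and the class, method and key names are hypothetical.

```python
import heapq
import itertools


class Scheduler:
    """Sketch of Insert/Remove over a heap-based scheduler data structure."""

    def __init__(self, weights, capacity=1024):
        self.weights = weights          # S_param: flow id -> weight
        self.capacity = capacity        # maximum number of stored PHVs
        self._heap = []                 # the SDS, ordered by scheduling order
        self._seq = itertools.count()   # arrival counter used to break ties
        self._finish = {}               # stateful per-flow finish value

    def insert(self, phv):
        """Compute the scheduling order of the PHV and place it in the SDS."""
        if len(self._heap) >= self.capacity:
            return False                # storage shortage: drop this PHV
        flow = phv["flow_id"]
        weight = self.weights.get(flow, 1)
        order = self._finish.get(flow, 0.0) + phv["length"] / weight
        self._finish[flow] = order
        phv["scheduling_order"] = order
        heapq.heappush(self._heap, (order, next(self._seq), phv))
        return True

    def remove(self):
        """Pick the next PHV to transmit, or None when the SDS is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = Scheduler(weights={"a": 2, "b": 1})
sched.insert({"flow_id": "a", "length": 100})   # order 50.0
sched.insert({"flow_id": "b", "length": 100})   # order 100.0
sched.insert({"flow_id": "a", "length": 100})   # order 100.0, arrives later
served = [sched.remove()["flow_id"] for _ in range(3)]
```

The (order, sequence number) pair gives a total order even when two PHVs share the same SchedulingOrder, in line with the tie-breaking argument of Section 5.2.5.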
6.11 Egress Port (PortE)
After a PHV is selected for emission, the egress port ($Port_E$) transmits the content of PHV.data_buffer. How a hardware-level frame is created and transmitted is out of the scope of our discussion. Like ingress ports, egress ports do not have any kind of programmability in terms of CTP and RTC features.
$f_{Port_E} : PHVS \to BS$
$f_{Port_E}(PHV, Null, Null) = PKT$
7 Programmability Level Comparison
PDP devices enable complex computation on packets in the data plane. But limited memory, a limited set of actions and the requirement of maintaining line rate make core switches unsuitable for complex algorithms. On the other hand, smart-NIC based packet processing in the slow path (CPU based processing) provides an opportunity for more complex computation, but achieving high-speed line rate in such environments remains a big issue. To understand what kinds of algorithms can be implemented in the data plane with the programmable switches available in the current market, we have selected the following representative platforms for comparison. FlexPipe [18] is Intel’s OpenFlow supported programmable switching platform. It is selected as one of the early generation programmable switches of recent times. Tofino [49] is the most prominent and commercially successful programmable switching chip based on the RMT architecture. It is used by various switch vendors [48]. Its various features are protected under a Barefoot non-disclosure agreement. For comparison, we have chosen the Arista 7170 [47] platform, which is based on the Tofino chip and supports P4_16. Netronome Agilio CX [46] (with the 4000/6000 family NFP) is selected as a smart-NIC based programmable switching platform. It can be programmed using micro-C and P4.
Table 5: Few important algorithms and availability of relevant programmability features
Application | Required Programmability Features | Relevant AVS Component | FlexPipe | Arista 7170 | Agilio CX
New protocol; Telemetry | On the fly configuration of packet parsing and de-parsing logic | Parser and deparser | No | Yes | Yes
5G connectionless communication; Mobility management | Pausing packets at intermediate switches of a path | Ingress buffer engine (programmable buffer) | No | Partial | Yes
Congestion control; Traffic engineering | Programmable packet scheduling | Scheduler | No | No | No
Video streaming | Manycast packet transfer | Buffer and replication engine | Partial | Partial | Yes
Traffic monitoring; Flow tracking | Flow identification; Flow statistics | Match-action unit with stateful data structure | Partial | Yes | Yes
In Table 5, we list a few important types of algorithms, the crucial tasks for which they need support from programmable switches (not achievable with traditional switches) and whether these tasks are supported by the selected set of devices. From Table 5, it is clear that not all the devices in the market are equally programmable. To better understand their programmability level, we have scored the selected platforms in Table 6, taking AVS components as the base. Besides the selected programmable data plane platforms, we have also included PSA [27]. PSA is the most mature switch architecture from the P4 community and it is based on P4_16. AVS is also inspired by PSA. Until now, no hardware follows the full specification of PSA, but any architecture can be simulated at the software level. Bmv2 [13] is the reference implementation of P4, which can simulate various switch architectures including PSA, although PSA is still not fully supported in bmv2. For comparison purposes, we assume that PSA supports P4_16 and runs on the bmv2 simulation platform. Though PSA is still not realized, we believe a comparison of existing hardware architectures with PSA (on bmv2) gives a clear picture of the current state of the art. We have also included T-switch, which represents the most common category of traditional non-programmable L2/L3 switches, for a better understanding of the scores.
Table 6 gives a 2-dimensional comparison among the selected platforms. Firstly, each component of AVS provides a specific type of programmability in the data plane. Comparing platforms on whether an equivalent AVS component exists shows whether equivalent functionality can be achieved on a platform. Secondly, eqn. 1 gives a functional structure for each component. Scoring the platforms on the parameters (I, O, ProcLogic and $Conf_{param}$) of this equation gives a fine-grained view of each platform’s programmability level. The detailed comparison behind the scoring can be found in our full technical report [53].
The scoring system is as follows. For I, O and $Conf_{param}$, which represent abstract data types:
– NA: Programmability is not applicable here. Example: for all of the selected platforms, the input (I) to the ingress parser is a set of bits (PHV.data_buffer). Programmability is not needed for the input to an ingress parser.
– 0: The component is non-programmable (e.g. T-switch).
– 1: Selectable from a set of predefined data types. Example: FlexPipe [18] hardware can only parse a selected set of L2-L4 header fields.
– 2: New abstract data types (i.e. struct, class etc.) can be created. Example: in P4 supported devices, a packet header definition can be created from struct, enum etc.
For ProcLogic:
– NA: Programmability is not applicable here. Example: PSA has no provision for an ingress buffer engine.
– 0: Non-programmable component. Example: a traditional switch does not support any programmability for match-action units.
– 1: Only selectable from a set of pre-implemented logic/algorithms. Example: in FlexPipe, the logic for match-action units is not programmable, only selectable from a predefined set of OpenFlow based actions.
– 2: New algorithms (ProcLogic) can be implemented, but stateful actions (any action that can fetch and store a flow state) are not supported. Example: the parser of P4 supported switches can be programmed to parse headers, but no stateful memory is available there.
– 3: New algorithms (ProcLogic) can be implemented with stateful action support. Example: P4 supported switches can perform an action and store the results in a counter or register.
Table 6: Eq. 1 Based Programmability Comparison Matrix
E1 = I, E2 = O, E3 = Confparam, E4 = ProcLogic
Component | Product | E1 | E2 | E3 | E4
Parser | FlexPipe | NA | 1 | NA | 1
Parser | Arista 7170 | NA | 2 | NA | 2
Parser | Agilio CX | NA | 2 | NA | 2
Parser | PSA | NA | 2 | NA | 2
Parser | T-switch | NA | 0 | 0 | 0
Ingress Buffer Engine | FlexPipe | 1 | 1 | 1 | 0
Ingress Buffer Engine | Arista 7170 | 2 | 2 | 1 | 0
Ingress Buffer Engine | Agilio CX | 2 | 2 | 2 | 2
Ingress Buffer Engine | PSA | NA | 2 | NA | NA
Ingress Buffer Engine | T-switch | 0 | 0 | 0 | 0
Ingress Match Action Unit | FlexPipe | 1 | 1 | 1 | 1
Ingress Match Action Unit | Arista 7170 | 2 | 2 | 2 | 3
Ingress Match Action Unit | Agilio CX | 2 | 2 | 2 | 3
Ingress Match Action Unit | PSA | 2 | 2 | 2 | 3
Ingress Match Action Unit | T-switch | 0 | 0 | 0 | 0
Deparser | FlexPipe | 1 | 2 | NA | 1
Deparser | Arista 7170 | 2 | 2 | NA | 2
Deparser | Agilio CX | 2 | 2 | NA | 2
Deparser | PSA | 2 | 3 | NA | 2
Deparser | T-switch | 0 | 3 | NA | 0
Buffer & Replication Engine (BRE) | FlexPipe | 1 | 1 | 1 | 0
Buffer & Replication Engine (BRE) | Arista 7170 | 2 | 2 | 1 | 0
Buffer & Replication Engine (BRE) | Agilio CX | 2 | 2 | 2 | 2
Buffer & Replication Engine (BRE) | PSA | 2 | 2 | 1 | 0
Buffer & Replication Engine (BRE) | T-switch | 0 | 0 | 0 | 0
Scheduler | FlexPipe | 1 | 1 | 1 | 1
Scheduler | Arista 7170 | 2 | 2 | 1 | 1
Scheduler | Agilio CX | 2 | 2 | 2 | 3
Scheduler | PSA | 2 | 2 | 1 | 1
Scheduler | T-switch | 0 | 0 | 0 | 0
8 Motivating Use Cases
Abstraction plays a very important role in SDN [28]. The abstractions used in SDN are hierarchical in nature and are used in different layers [8]. The hardware abstraction layer is placed over the programmable hardware layer and provides a uniform view to the other layers (‘Device and resource Abstraction Layer (DAL)’ [28]). As an instance of a hardware abstraction layer, AVS hides low-level hardware complexity and can provide a uniform view of the hardware layer to both control plane and data plane applications. Hence its use cases can be found in every aspect of a truly programmable data plane device. We mention a few important use cases here.
8.1 Modular Architecture Design & Development
Modularity is a goal of good architecture design. But the majority of the programmable data plane architectures in the literature are expressed in informal language. This makes modular design harder for 3 major reasons: a) lack of a clear description of the role of a component, b) lack of a clear boundary between 2 components and c) lack of clarity about how the components connect with each other in the pipeline. These shortcomings create a bottleneck for the independent design and optimization of components. They also make it harder to design new pipelines by reusing those components. A modular abstraction layer can solve these issues.
Consider a PDP architecture where the boundary between the Ingress Buffer Engine ($BE_{In}$) and the Ingress Match Action Unit ($MAU_{In}$) is not clear. As both components have match-action semantics, $BE_{In}$ can be designed as a part of $MAU_{In}$: after header matching, storing a packet in and removing it from a buffer can be designed as one of the actions of $MAU_{In}$. Now, consider a smart-NIC based packet processing architecture, where packets are moved to userspace and processed on a CPU and/or GPU. Also assume $MAU_{In}$ is implemented on the CPU and GPU. GPUs perform better in parallel and batch processing, whereas buffering is a per-packet processing task that is not well suited to a GPU. Implementing the buffer engine as part of the match-action unit on a GPU needs costly data transfers between memory and the GPU, or, in smart-NIC based environments, packets need to be moved from the smart-NIC buffer to the GPU. This data movement often cancels out the
GPU performance gain. Hence it is better to implement
the buffer engine as a separate component. If buffer
engine functionality is merged with MAU_In, optimal
hardware performance cannot be achieved in this case.
In this example, if an AVS-like abstraction layer is
used, data plane application developers can write logic
without thinking about the underlying CPU and/or GPU
based implementation. The mapping of AVS components to
actual hardware is done by the compiler. Thus the tasks
of BE_In can be executed on the CPU only, while MAU_In
can be mapped to the CPU and/or GPU depending on the
performance goal. Moreover, new hardware (e.g., a
specialized chip or FPGA) can be designed and optimized
independently for each of the components.
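The placement decision in this example can be pictured with a minimal Python sketch. The component names are taken from the example (written here as BE_In and MAU_In); the `map_components` helper, the `batch_friendly` flag and the placement rule are illustrative assumptions, not the behavior of an actual AVS compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    batch_friendly: bool  # assumption: True if the work parallelizes over packet batches


def map_components(components, gpu_available):
    """Return a {component name: target} placement (hypothetical policy)."""
    placement = {}
    for c in components:
        # Per-packet components such as the buffer engine stay on the CPU;
        # batch-friendly components go to the GPU when one is present.
        if c.batch_friendly and gpu_available:
            placement[c.name] = "GPU"
        else:
            placement[c.name] = "CPU"
    return placement


pipeline = [
    Component("BE_In", batch_friendly=False),
    Component("MAU_In", batch_friendly=True),
]

print(map_components(pipeline, gpu_available=True))   # {'BE_In': 'CPU', 'MAU_In': 'GPU'}
print(map_components(pipeline, gpu_available=False))  # {'BE_In': 'CPU', 'MAU_In': 'CPU'}
```

Because the buffer engine is a separate component, the same application logic yields a valid placement with or without a GPU; only the mapping changes.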
8.2 Virtualization & New Pipeline Design
Equation 1 provides a structured framework for each
component. It clearly defines the input, the output, the
packet processing logic and the parameters (Conf_param)
through which the control plane configures the run-time
behavior of a component. Components communicate only
by passing parameters, which removes control
dependencies among them. This gives several advantages:
a) AVS gives a hardware-independent and portable
representation of a data plane device, which can be
used to slice the hardware layer; b) DPPs can be
developed against the machine model provided by AVS;
c) AVS can be used to create a virtual switch (a VM-like
entity) for the data plane; d) switch-hypervisor-like
entities can execute the same DPP on, or migrate it to,
a different hardware architecture.
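As a rough illustration of this structure, the following Python sketch models a component as a function of its input, its processing logic and its configuration parameters, and chains components purely by passing the packet along. The dictionary-based packet, the `Component` class and the two toy logic functions are assumptions made for illustration; they are not the definition given by Equation 1.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Packet = Dict[str, object]  # assumption: a packet is a dict of header/metadata fields


@dataclass
class Component:
    name: str
    proc_logic: Callable[[Packet, dict], Packet]    # packet processing logic
    conf_param: dict = field(default_factory=dict)  # run-time configuration from the control plane

    def __call__(self, pkt: Packet) -> Packet:
        # Input and output are both packets; components share nothing else.
        return self.proc_logic(dict(pkt), self.conf_param)


def run_pipeline(components: List[Component], pkt: Packet) -> Packet:
    for component in components:
        pkt = component(pkt)
    return pkt


def parse(pkt, conf):
    # Toy parser: the raw packet is "<address>,<ttl>".
    pkt["ttl"] = int(pkt["raw"].split(",")[1])
    return pkt


def forward(pkt, conf):
    pkt["ttl"] -= 1
    pkt["egress_port"] = conf["default_port"]
    return pkt


parser = Component("PR_In", parse)
mau = Component("MAU_In", forward, {"default_port": 3})

out = run_pipeline([parser, mau], {"raw": "10.0.0.1,64"})
print(out["ttl"], out["egress_port"])  # 63 3

# A control-plane update changes behavior without touching the logic.
mau.conf_param["default_port"] = 7
```

Since each component sees only its input packet and its own configuration, a component can be replaced or moved to different hardware without changing its neighbors.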
Consider a smart-NIC based scenario where the datapath
is moved to userspace using DPDK. Also assume the data
plane program (DPP) is assigned both to process IPv4/6
packets and to run an algorithm that accelerates ML
workloads through in-network aggregation [55, 63].
Aggregation is essentially mathematical processing; it
is computationally expensive but can be executed in
parallel. Let IPv4/6 packet processing be expressed as
MAT_IP and aggregation-based packet processing logic be
expressed as MAT_AGG. To make packet aggregation
faster, MAT_AGG is assigned to the GPU, and the IPv4/6
packet processing (MAT_IP) tasks are executed on the
CPU. Also assume the CPU can process 1 IPv4/6 packet
in 1 cycle, whereas the GPU can aggregate 10 packets in
a single cycle and produce the result of the aggregation
as one packet. As the CPU and GPU run at different
speeds, synchronizing and scheduling packet processing
over them is very important. Without a clear definition
and structure of MAT_IP and MAT_AGG, synchronizing
packet processing over the CPU and GPU is not possible.
Moreover, depending on the capability of the underlying
server, the number of CPU cores may differ and GPUs may
not be available at certain times. To handle these kinds
of scenarios, a VM-like entity for the data plane is
mandatory. AVS can work as a virtual switch for the
data plane. Moreover, the modular representation
provided by AVS can also enable switch hypervisors to
assign and synchronize the execution of a DPP over
various types of hardware.
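The cycle counts in this scenario can be worked through with a small Python sketch. The rates (1 packet per CPU cycle, 10 packets per GPU cycle) come from the example; the `schedule` function, the assumption that the CPU and GPU run concurrently, and the fallback of running aggregation on the CPU at one packet per cycle when no GPU is present are illustrative assumptions only.

```python
import math

CPU_PKTS_PER_CYCLE = 1    # from the example: 1 IPv4/6 packet per CPU cycle
GPU_BATCH_PER_CYCLE = 10  # from the example: 10 packets aggregated per GPU cycle


def schedule(packets, gpu_available=True):
    """Count the cycles needed for MAT_IP (CPU) and MAT_AGG (GPU) work."""
    ip_count = sum(1 for p in packets if p["kind"] == "ip")
    agg_count = sum(1 for p in packets if p["kind"] == "agg")
    # Every group of up to 10 aggregation packets yields one result packet.
    aggregates_out = math.ceil(agg_count / GPU_BATCH_PER_CYCLE)

    if gpu_available:
        cpu_cycles = math.ceil(ip_count / CPU_PKTS_PER_CYCLE)
        gpu_cycles = aggregates_out
    else:
        # Assumed fallback: aggregation runs on the CPU, one packet per cycle.
        cpu_cycles = math.ceil((ip_count + agg_count) / CPU_PKTS_PER_CYCLE)
        gpu_cycles = 0

    return {
        "cpu_cycles": cpu_cycles,
        "gpu_cycles": gpu_cycles,
        "packets_out": ip_count + aggregates_out,
        # Assumption: CPU and GPU work in parallel, so the slower one dominates.
        "cycles_total": max(cpu_cycles, gpu_cycles),
    }


traffic = [{"kind": "ip"}] * 5 + [{"kind": "agg"}] * 25
with_gpu = schedule(traffic, gpu_available=True)
without_gpu = schedule(traffic, gpu_available=False)
print(with_gpu["cycles_total"], without_gpu["cycles_total"])  # 5 30
```

The same DPP produces the same number of output packets in both cases; only the cycle budget differs, which is the kind of decision a switch hypervisor could make once MAT_IP and MAT_AGG are clearly separated.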
8.3 Testing & Verification
Earlier data plane devices were mainly designed for
executing network protocols. In programmable data
plane devices, however, more and more application layer
tasks are pushed toward the data plane. These devices
can be loaded with a new data plane program at any point
in their lifetime. Testing these programs, or validating
properties over them at run time, is very important.
Without a common abstraction layer and workflow, testing
programs and verifying properties over heterogeneous
hardware architectures increases cost and complexity.
To write test cases, a common set of packet processing
states among switches is required. Without a common
abstraction layer and a common set of packet processing
states, different hardware vendors may implement the
same packet processing logic differently.
Consider the client-server scenario of Fig. 12. At a
certain time, H1 gets disconnected from the server (S)
due to a failure of the H1-Sw link. On detecting the link
failure, the switch (Sw) stores all packets destined for
H1 in Bi of the ingress buffer engine (Section 6.3).
When Bi becomes full, Sw sends a notification to the
controller. Upon receiving this notification, the
controller initiates migration of the whole network
(including switch state and packets stored in the buffer
[35]) from one data center to another. Clearly, this kind
of DPP depends on the "Bi is Full" state of the buffer
receiver thread (Section 6.3, Fig. 8). To test such DPPs
and ensure uniform behavior, the switch vendors of both
data centers need to support an ingress buffer engine
with common packet processing states. Using AVS as an
abstraction layer provides a common abstract switch
over heterogeneous architectures, and the DPP developer
gets a uniform view of the hardware. Moreover, AVS
expresses the behavior of the components through EFSMs.
The common abstraction of the components and their
common set of packet processing states can be leveraged
together to write portable code and to test it.
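A portable test of this kind can be sketched in Python as follows. The "Bi is Full" state and the controller notification come from the example above; the `IngressBufferEFSM` class, the `BUFFERING` state name, the callback standing in for the controller channel and the `check_vendor` helper are hypothetical and do not reproduce the EFSM of Fig. 8.

```python
class IngressBufferEFSM:
    """Toy EFSM for the buffer receiver thread (state names are assumptions)."""

    def __init__(self, capacity, notify):
        self.capacity = capacity  # extended state variable: size of Bi
        self.buffer = []          # extended state variable: contents of Bi
        self.state = "BUFFERING"
        self.notify = notify      # callback standing in for the controller channel

    def on_packet(self, pkt):
        if self.state == "BI_FULL":
            return False          # Bi is already full, so the packet is not stored
        self.buffer.append(pkt)
        if len(self.buffer) == self.capacity:
            # Transition into the common "Bi is Full" state and notify once.
            self.state = "BI_FULL"
            self.notify("Bi is Full")
        return True


def check_vendor(make_switch):
    """Portable test: any implementation exposing the same states should pass."""
    events = []
    sw = make_switch(capacity=3, notify=events.append)
    stored = [sw.on_packet({"dst": "H1", "seq": i}) for i in range(4)]
    return stored == [True, True, True, False] and events == ["Bi is Full"]


print(check_vendor(IngressBufferEFSM))  # True
```

The test refers only to the shared state and its notification, so the same check could be run against a second vendor's implementation without modification.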
[Figure: Client Host (H1), Client Host (H2), Switch (Sw), Server (S)]
Fig. 12: Server-client communication example
8.4 Network Function Modeling
Eq. (1) enables each component of AVS to be represented
in a uniform and hardware-independent manner.
Composing f_c for all the components and concatenating
them represents AVS as a transfer function (τ) [34].
The compiler translates this functional representation
into hardware instructions. Any network function (NF)
programmed on AVS can be expressed as τ, where I, O,
Conf_param and ProcLogic are different for each NF.
Each link of a network transfers a PKT from one hop
to another. A link can also be represented as a function
(Γ) of the same structure:
\[
\Gamma : BS \rightarrow BS
\]
\[
\Gamma(PKT, MediumAccessLogic, Conf_{param}) = PKT
\]
Γ can be considered the functional representation of
the data link layer. Depending on the medium, the
corresponding MediumAccessLogic (e.g., ring/mesh/star
networks, CSMA, CSMA/CD) and Conf_param (e.g., the
persistence level for CSMA) can be different. Based on
these, a PKT’s transmission and propagation delay can
be different. These delays can be derived from the time
stamps (passed as H_f) at the sender node and the
receiver node. By applying τ and Γ in turn, a whole
network topology can be represented as a topology
function (ψ) [34]. These functional representations are
based on the abstraction layer, and they are
hardware-independent representations with well-defined
parameters for each component. This kind of
representation provides various benefits. A few of them
are:
– Network algebra: Network algebra is an old and
deeply studied topic. With the rise of programmable
data plane devices, it is more relevant and applicable
to today’s networks. Expressing network functions
and networks in a functional paradigm allows the use
of formal methods of network algebra [2, 34].
– Network Delay Modeling: Time-stamping [44]
packets in the data plane is a powerful concept with
many uses [45]. But maintaining time stamps only at
the entry and exit points of a switch cannot provide
detailed information about the various types of delay.
Instead, a time stamp can be recorded at the entry
(time-stamp_Entry_C) and exit (time-stamp_Exit_C)
point of each component in the packet metadata
(PKT_Metadata). The delay inside a component (C) can
then be calculated as
\[
D_C = \text{time-stamp}_{Exit_C} - \text{time-stamp}_{Entry_C}
\]
Thus, fine-grained information about the various
delays can be collected. One of the simplest but most
powerful delay models is Network Delay. It can be
computed as follows:
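\[
\begin{aligned}
NetworkDelay &= QueuingDelay + ProcessingDelay + (TransmissionDelay + PropagationDelay) \\
&= D_{BE_{In}} + (D_{PR_{In}} + D_{MAU_{In}} + D_{DPR_{In}} + D_{BR_{E}} + D_{PR_{E}} + D_{MAU_{E}} + D_{DPR_{E}} + D_{S}) + D_{PR_{In}} + D_{\Gamma}
\end{aligned}
\]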
Combining more complex data plane processing with time
stamps enables further measurement of the various
types of delay, categorized by source [7].
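The per-component delay calculation described in the delay modeling item can be sketched in Python as follows. The metadata layout, the time stamp values (in nanoseconds) and the helper names are hypothetical. As a simplifying assumption, the transmission and propagation term is collapsed into a single `link_delay` argument, and only the ingress buffer engine is counted as queuing delay.

```python
def component_delays(pkt_metadata):
    """Return D_C = exit - entry for every component C in the metadata."""
    return {name: ts["exit"] - ts["entry"] for name, ts in pkt_metadata.items()}


def network_delay(delays, link_delay, queuing_components=("BE_In",)):
    """Sum queuing, processing and link delay (simplified, see assumptions)."""
    queuing = sum(d for c, d in delays.items() if c in queuing_components)
    processing = sum(d for c, d in delays.items() if c not in queuing_components)
    return queuing + processing + link_delay


# Hypothetical per-component time stamps in nanoseconds.
metadata = {
    "BE_In": {"entry": 0, "exit": 40},
    "PR_In": {"entry": 40, "exit": 45},
    "MAU_In": {"entry": 45, "exit": 60},
}

delays = component_delays(metadata)
print(delays)                                 # {'BE_In': 40, 'PR_In': 5, 'MAU_In': 15}
print(network_delay(delays, link_delay=100))  # 160
```

Keeping one entry per component in the metadata is what makes the breakdown possible; with only switch-level entry and exit stamps, the 60 ns spent inside the switch could not be attributed to buffering, parsing or matching.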
9 Conclusion
In this work, we proposed the design of AVS, a modular
hardware abstraction layer for programmable data
plane devices. We are working on a bmv2 [13] based
implementation of AVS. Our work on representing the
components of an abstraction layer for data plane
devices with a well-defined functional structure, and
their workflow as EFSMs, can provide a strong base for
optimizing, benchmarking and comparing different
programmable data plane devices. We invite the research
community to investigate how to improve this formal
structure and to develop frameworks for using it. We
have also analyzed the programmability features of the
different components and compared a few important
products based on them. Deriving a formal relation
between a set of programmability features and the class
of algorithms it can support in the data plane is a
promising research direction. Besides this, designing
chips directly from the abstract model is also a
promising research direction.
References
1. Abhashkumar, A., Lee, J., Tourrilhes, J.,
Banerjee, S., Wu, W., Kang, J.-M., and
Akella, A. P5: Policy-driven optimization of p4
pipeline. In Proceedings of the Symposium on SDN
Research (New York, NY, USA, 2017), SOSR ’17,
ACM, pp. 136–142.
2. Anderson, C. J., Foster, N., Guha, A., Jean-
nin, J.-B., Kozen, D., Schlesinger, C., and
Walker, D. Netkat: Semantic foundations for net-
works. SIGPLAN Not. 49, 1 (Jan. 2014), 113–126.
3. Benson, T. A. In-network compute: Considered
armed and dangerous. In Proceedings of the Work-
shop on Hot Topics in Operating Systems (New
York, NY, USA, 2019), HotOS ’19, ACM, pp. 216–
224.
4. Bifulco, R., and Rétvári, G. A survey on the
programmable data plane: Abstractions architec-
tures and open problems. In Proc. IEEE HPSR
(2018), pp. 1–7.
5. Bosshart, P., Daly, D., Gibb, G., Izzard,
M., McKeown, N., Rexford, J., Schlesinger,
C., Talayco, D., Vahdat, A., Varghese, G.,
and Walker, D. P4: Programming protocol-
independent packet processors. SIGCOMM Com-
put. Commun. Rev. 44, 3 (July 2014), 87–95.
6. Bosshart, P., Gibb, G., Kim, H.-S., Vargh-
ese, G., McKeown, N., Izzard, M., Mujica,
F., and Horowitz, M. Forwarding metamor-
phosis: Fast programmable match-action process-
ing in hardware for sdn. In Proceedings of the ACM
SIGCOMM 2013 Conference on SIGCOMM (New
York, NY, USA, 2013), SIGCOMM ’13, ACM,
pp. 99–110.
7. Briscoe, B., Brunstrom, A., Petlund, A.,
Hayes, D., Ros, D., Tsang, I., Gjessing, S.,
Fairhurst, G., Griwodz, C., and Welzl, M.
Reducing internet latency: A survey of techniques
and their merits. IEEE Communications Surveys &
Tutorials 18, 3 (third quarter 2016), 2149–2196.
8. Casado, M., Foster, N., and Guha, A. Ab-
stractions for software-defined networks. Commun.
ACM 57, 10 (Sept. 2014), 86–95.
9. Casado, M., Koponen, T., Ramanathan, R.,
and Shenker, S. Virtualizing the network for-
warding plane. In Proceedings of the Workshop
on Programmable Routers for Extensible Services
of Tomorrow (2010), ACM, p. 8.
10. Choi, S., Burkov, B., Eckert, A., Fang, T.,
Kazemkhani, S., Sherwood, R., Zhang, Y.,
and Zeng, H. Fboss: building switch software at
scale. In Proceedings of the 2018 Conference of the
ACM Special Interest Group on Data Communica-
tion (2018), ACM, pp. 342–356.
11. Choi, S., Long, X., Shahbaz, M., Booth, S.,
Keep, A., Marshall, J., and Kim, C. Pvpp: A
programmable vector packet processor. In Proceed-
ings of the Symposium on SDN Research (2017),
ACM, pp. 197–198.
12. Chole, S., Fingerhut, A., Ma, S., Sivaraman,
A., Vargaftik, S., Berger, A., Mendelson,
G., Alizadeh, M., Chuang, S.-T., Keslassy,
I., Orda, A., and Edsall, T. drmt: Disaggre-
gated programmable switching. In Proceedings of
the Conference of the ACM Special Interest Group
on Data Communication (New York, NY, USA,
2017), SIGCOMM ’17, ACM, pp. 1–14.
13. Consortium, P. L. Behavioral model version 2,
2017.
14. Consortium, P. L. Switchsai, Dec 21, 2016.
15. Consortium, T. P. L. The p4 language specifi-
cation version 1.0.4, 2017.
16. Consortium, T. P. L. P4 16 language specifica-
tion:version 1.1.0, 2018.
17. Consortium, T. P. L. Behavioral model targets,
2019.
18. Corporation, N. D. I. Intel ethernet switch
fm5000/fm6000(datasheet), 2017.
19. Dang, H. T., Sciascia, D., Canini, M., Pe-
done, F., and Soulé, R. Netpaxos: Consensus at
network speed. In Proceedings of the 1st ACM SIG-
COMM Symposium on Software Defined Network-
ing Research (New York, NY, USA, 2015), SOSR
’15, ACM, pp. 5:1–5:7.
20. Dang, H. T., Wang, H., Jepsen, T., Breb-
ner, G., Kim, C., Rexford, J., Soulé, R., and
Weatherspoon, H. Whippersnapper: A p4 lan-
guage benchmark suite. In Proceedings of the Sym-
posium on SDN Research (New York, NY, USA,
2017), SOSR ’17, ACM, pp. 95–101.
21. Feamster, N., Rexford, J., and Zegura, E.
The road to sdn: An intellectual history of pro-
grammable networks. SIGCOMM Comput. Com-
mun. Rev. 44, 2 (Apr. 2014), 87–98.
22. Foundation, L. Data plane development kit,
2018.
23. FOUNDATION, O. Open data plane (odp).
24. FOUNDATION, O. Vector packet processing
(vpp).
25. Foundation, O. N. Openflow switch specifica-
tion: Version 1.3.5, 2015.
26. Gibb, G., Varghese, G., Horowitz, M., and
McKeown, N. Design principles for packet
parsers. In Architectures for Networking and Com-
munications Systems (Oct 2013), pp. 13–24.
27. Group, T. P. A. W. P416 portable switch archi-
tecture (psa)(working draft), 2018.
28. Haleplidis, E., Pentikousis, K., Denazis, S.,
Salim, J. H., Meyer, D., and Koufopavlou,
O. Software-Defined Networking (SDN): Layers
and Architecture Terminology. RFC 7426, Jan.
2015.
29. Hancock, D., and van der Merwe, J. Hy-
per4: Using p4 to virtualize the programmable data
plane. In Proceedings of the 12th International on
Conference on Emerging Networking EXperiments
and Technologies (New York, NY, USA, 2016),
CoNEXT ’16, ACM, pp. 35–49.
30. He, M., Basta, A., Blenk, A., Deric, N., and
Kellerer, W. P4nfv: An nfv architecture with
flexible data plane reconfiguration. In 2018 14th
International Conference on Network and Service
Management (CNSM) (2018), IEEE, pp. 90–98.
31. Høiland-Jørgensen, T., Brouer, J. D.,
Borkmann, D., Fastabend, J., Herbert, T.,
Ahern, D., and Miller, D. The express data
path: Fast programmable packet processing in the
operating system kernel. In Proceedings of the 14th
International Conference on Emerging Network-
ing EXperiments and Technologies (2018), ACM,
pp. 54–66.
32. Jose, L., Yan, L., Varghese, G., and McK-
eown, N. Compiling packet programs to recon-
figurable switches. In 12th USENIX Symposium
on Networked Systems Design and Implementation
(NSDI 15) (Oakland, CA, 2015), USENIX Associ-
ation, pp. 103–115.
33. Kaljic, E., Maric, A., Njemcevic, P., and
Hadzialic, M. A survey on data plane flexibility
and programmability in software-defined network-
ing. IEEE Access 7 (2019), 47804–47840.
34. Kazemian, P., Varghese, G., and McKeown,
N. Header space analysis: Static checking for net-
works. In Presented as part of the 9th USENIX
Symposium on Networked Systems Design and Im-
plementation (NSDI 12) (San Jose, CA, 2012),
USENIX, pp. 113–126.
35. Keller, E., Ghorbani, S., Caesar, M., and
Rexford, J. Live migration of an entire network
(and its hosts). In Proceedings of the 11th ACM
Workshop on Hot Topics in Networks (New York,
NY, USA, 2012), HotNets-XI, ACM, pp. 109–114.
36. Kogan, K., Menikkumbura, D., Petri, G.,
Noh, Y., Nikolenko, S., Sirotkin, A., and
Eugster, P. A programmable buffer management
platform. In 2017 IEEE 25th International Con-
ference on Network Protocols (ICNP) (Oct 2017),
pp. 1–10.
37. Kogan, K., Menikkumbura, D., Petri, G.,
Noh, Y., Nikolenko, S., Sirotkin, A., and
Eugster, P. A programmable buffer management
platform. In 2017 IEEE 25th International Confer-
ence on Network Protocols (ICNP) (2017), IEEE,
pp. 1–10.
38. Kohler, T., Mayer, R., Dürr, F., Maaß, M.,
Bhowmik, S., and Rothermel, K. P4cep: To-
wards in-network complex event processing. In
Proceedings of the 2018 Morning Workshop on In-
Network Computing (New York, NY, USA, 2018),
NetCompute ’18, ACM, pp. 33–38.
39. Lin, Y., Kozat, U. C., Kaippallimalil, J.,
Moradi, M., Soong, A. C., and Mao, Z. M.
Pausing and resuming network flows using pro-
grammable buffers. In Proceedings of the Sym-
posium on SDN Research (New York, NY, USA,
2018), SOSR ’18, ACM, pp. 7:1–7:14.
40. Liu, J., Hallahan, W., Schlesinger, C.,
Sharif, M., Lee, J., Soulé, R., Wang, H.,
Caşcaval, C., McKeown, N., and Foster, N.
P4v: Practical verification for programmable data
planes. In Proceedings of the 2018 Conference of
the ACM Special Interest Group on Data Commu-
nication (New York, NY, USA, 2018), SIGCOMM
’18, ACM, pp. 490–503.
41. Malloy, D., Maltz, D., Williams, C., Man-
ickam, A. S., Daparthi, A., Wichmann, C.,
Lazar, M., Sane, S., Baldonado, O., Fang,
T., Simpkins, A., Gale, B., Cummings, U.,
Daly, D., Penner, M., Kadosh, M., Baz,
I., and Raveh, A. Switch abstraction interface
v0.9.2, 2015.
42. McKeown, N., Anderson, T., Balakrishnan,
H., Parulkar, G., Peterson, L., Rexford, J.,
Shenker, S., and Turner, J. Openflow: En-
abling innovation in campus networks. SIGCOMM
Comput. Commun. Rev. 38, 2 (Mar. 2008), 69–74.
43. Mittal, R., Agarwal, R., Ratnasamy, S., and
Shenker, S. Universal packet scheduling. In Pro-
ceedings of the 14th ACM Workshop on Hot Topics
in Networks (New York, NY, USA, 2015), HotNets-
XIV, ACM, pp. 24:1–24:7.
44. Mizrahi, T., and Moses, Y. The case for data
plane timestamping in sdn. In 2016 IEEE Con-
ference on Computer Communications Workshops
(INFOCOM WKSHPS) (April 2016), pp. 856–861.
45. Mizrahi, T., Yerushalmi, I., Melman, D. T.,
and Browne, R. Network Service Header
(NSH) Context Header Allocation: Timestamp.
Internet-Draft draft-mymb-sfc-nsh-allocation-
timestamp-05, Internet Engineering Task Force,
Oct. 2018. Work in Progress.
46. NETRONOME. Agilio cx smartnics, 2018.
47. Networks, A. Arista 7170 series, 2018.
48. Networks, B. Barefoot partners, 2018.
49. Networks, B. Barefoot tofino, 2019.
50. Nötzli, A., Khan, J., Fingerhut, A., Bar-
rett, C., and Athanas, P. P4pktgen: Auto-
mated test case generation for p4 programs. In Pro-
ceedings of the Symposium on SDN Research (New
York, NY, USA, 2018), SOSR ’18, ACM, pp. 5:1–
5:7.
51. Patra, P. G., Rothenberg, C. E., and Pon-
gracz, G. Macsad: High performance dataplane
applications on the move. In 2017 IEEE 18th Inter-
national Conference on High Performance Switch-
ing and Routing (HPSR) (2017), IEEE, pp. 1–6.
52. Rim, S. Y., Cui, Z., and Qian, L. High per-
formance packet processor architecture for network
virtualization: Programmable packet processor ar-
chitecture as a data flow machine. In Proceedings of
the 2018 International Conference on Algorithms,
Computing and Artificial Intelligence (New York,
NY, USA, 2018), ACAI 2018, ACM, pp. 6:1–6:5.
53. Robin, D. D., and Khan, J. I. Toward an
abstract model of programmable data plane devices
(technical report).
54. Sapio, A., Abdelaziz, I., Aldilaijan, A.,
Canini, M., and Kalnis, P. In-network compu-
tation is a dumb idea whose time has come. In Pro-
ceedings of the 16th ACM Workshop on Hot Topics
in Networks (New York, NY, USA, 2017), HotNets-
XVI, ACM, pp. 150–156.
55. Sapio, A., Canini, M., Ho, C.-Y., Nel-
son, J., Kalnis, P., Kim, C., Krishna-
murthy, A., Moshref, M., Ports, D. R., and
Richtárik, P. Scaling distributed machine learn-
ing with in-network aggregation. arXiv preprint
arXiv:1903.06701 (2019).
56. Shahbaz, M., Choi, S., Pfaff, B., Kim, C.,
Feamster, N., McKeown, N., and Rexford,
J. Pisces: A programmable, protocol-independent
software switch. In Proceedings of the 2016 ACM
SIGCOMM Conference (2016), ACM, pp. 525–538.
57. Sivaraman, A., Subramanian, S., Agrawal,
A., Chole, S., Chuang, S.-T., Edsall, T.,
Alizadeh, M., Katti, S., McKeown, N., and
Balakrishnan, H. Towards programmable packet
scheduling. In Proceedings of the 14th ACM Work-
shop on Hot Topics in Networks (New York, NY,
USA, 2015), HotNets-XIV, ACM, pp. 23:1–23:7.
58. Sivaraman, A., Winstein, K., Subramanian,
S., and Balakrishnan, H. No silver bullet: Ex-
tending sdn to the data plane. In Proceedings of
the Twelfth ACM Workshop on Hot Topics in Net-
works (New York, NY, USA, 2013), HotNets-XII,
ACM, pp. 19:1–19:7.
59. Sivaraman, V., Narayana, S., Rottenstre-
ich, O., Muthukrishnan, S., and Rexford, J.
Heavy-hitter detection entirely in the data plane.
In Proceedings of the Symposium on SDN Research
(New York, NY, USA, 2017), SOSR ’17, ACM,
pp. 164–176.
60. Tu, W., Ruffy, F., and Budiu, M. Linux network
programming with p4. In Linux Plumbers’
Conference 2018 (2018).
61. Turkovic, B., Kuipers, F., van Adrichem, N.,
and Langendoen, K. Fast network congestion
detection and avoidance using p4. In Proceedings
of the 2018 Workshop on Networking for Emerg-
ing Applications and Technologies (New York, NY,
USA, 2018), NEAT ’18, ACM, pp. 45–51.
62. Vörös, P., Horpácsi, D., Kitlei, R., Leskó,
D., Tejfel, M., and Laki, S. T4p4s: A
target-independent compiler for protocol-independent
packet processors. In IEEE HPSR (2018),
pp. 17–20.
63. Wang, S.-Y., Wu, C.-M., Lin, Y.-B., and
Huang, C.-C. High-speed data-plane packet
aggregation and disaggregation by p4 switches.
Journal of Network and Computer Applications (2019).
64. Zhang, C., Bi, J., Zhou, Y., Dogar, A. B.,
and Wu, J. Hyperv: A high performance hyper-
visor for virtualization of the programmable data
plane. In 2017 26th International Conference on
Computer Communication and Networks (ICCCN)
(July 2017), pp. 1–9.
