Time Synchronization Solution for FPGA-based Distributed Network Monitoring by Janky, Ferenc Nándor & Varga, Pál
INFOCOMMUNICATIONS JOURNAL
MARCH 2018 • VOLUME X • NUMBER 1
Time Synchronization Solution for FPGA-based
Distributed Network Monitoring
Ferenc Nandor Janky and Pal Varga
Abstract—Distributed network monitoring solutions face various
challenges with the increase of line speed, the growing
variety of protocols, and new services with complex KPIs. This
paper addresses one part of the first challenge: faster line
speed necessitates time-stamping with finer granularity and
higher precision than ever. Proper, system-wide time-stamping
is indispensable for network monitoring and traffic analysis.
It is hard to find feasible time synchronization solutions
for systems that have nation-wide, physically distributed
probes.
Current networking equipment resides in server rooms, and
includes many legacy nodes. Access to the GPS signal is complicated in
these places, and the Precision Time Protocol (PTP) is unlikely to
be supported by all network nodes in the near future, so high
precision time-stamping is indeed a current problem. This paper
suggests a novel, practical solution to overcome these obstacles.
The core idea is that real-life distributed network monitoring
systems operate with a small, finite number of probe clusters,
and each site should have a precise clock provided
by PTP or GPS somewhere in the building. The distribution
of time information within a site is still troublesome, even within
a server rack. This paper presents a closed control loop solution
implemented in an FPGA-based device in order to minimize the
jitter and compensate for the calculated delay.
Keywords—network monitoring, time synchronization, hard-
ware acceleration, closed control loop
I. INTRODUCTION
Network monitoring is a well-established practice at
telecommunication operators. There are fundamentally different
solutions available, depending on what kind of data are
initially available and how they are gathered. The least flexible
solutions are based on the functional networking elements:
they can provide pre-digested reports, statistical counters, and
occasionally (when not under heavy load) even detailed information
on the actual messages. Some operators use standalone
protocol analyzers, which do not suffer from the temporal,
load-related bottlenecks; rather, they have spatial data capture
issues: only a segment of the network is visible at any given
time. On the other hand, complete traffic information can
be gathered by network-wide traffic monitoring. These latter
solutions are based on passive, distributed probes; central
processing entities; and client software, also distributed, used by the
operating personnel. This paper discusses a particular problem
of such systems: effective time synchronization among the
entities.
The authors are with the Department of Telecommunications and
Media Informatics, Faculty of Electrical Engineering and Informatics,
Budapest University of Technology and Economics, Magyar tudósok
körútja 2., 1117 Budapest, Hungary (phone: +36704213213; e-mail:
fecjanky@gmail.com and pvarga@tmit.bme.hu)
Network traffic analysis requires understanding the
order of the messages appearing in the network, even if they
appear at different interfaces. This makes high resolution,
high precision time-stamping a basic requirement, besides
lossless message capture. While there are standardized network
protocols available for tackling this issue, there are
practical obstacles to their network-wide usage. Although the
Network Time Protocol (NTP) is widely available [1], it cannot
be used as a general purpose synchronization protocol. In fact,
the message transfer delay between NTP clients and servers is
not compensated, hence the different nodes end up setting their
local time to a clock value with a random delay. The typical
forwarding delay in current core routers is in the
0.5-5 microsecond range, depending on the traffic volume,
among other factors. Since the minimum packet interarrival
time is 0.672 microseconds on a 1 Gbps link (and 67.2
nanoseconds on a 10 Gbps link), such delays cannot be left
uncompensated in time synchronization.
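The interarrival figures quoted above can be reproduced with a short calculation. The sketch below assumes the standard Ethernet minimum frame of 64 bytes plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte interframe gap; the constant and function names are illustrative only.

```python
# Minimum frame spacing on Ethernet: the smallest frame plus the fixed
# per-frame overhead that always occupies the wire.
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8      # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def min_interarrival_ns(link_rate_bps: float) -> float:
    """Return the shortest possible time between two frame starts, in ns."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return bits_on_wire / link_rate_bps * 1e9


if __name__ == "__main__":
    for name, rate in (("1 Gbps", 1e9), ("10 Gbps", 10e9)):
        print(f"{name}: {min_interarrival_ns(rate):.1f} ns")
```

With 84 bytes (672 bits) per minimum-size frame slot, the result is 672 ns at 1 Gbps and 67.2 ns at 10 Gbps, which is well below the 0.5-5 microsecond forwarding delay mentioned in the text.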
The Precision Time Protocol (PTP), on the other hand, covers the
delay-compensation issue well [2]. Unfortunately, PTP is not
at all widespread, even after 10 years of commercialization
of PTPv2. Moreover, the concept requires that all network
nodes in the path have PTPv2 capability. If even
one node cannot compute and share its delay data,
compensation of the time information is not possible.
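To illustrate why an uncompensated node in the path matters, the following sketch applies the usual delay request-response arithmetic of PTP to four time-stamps. The class and function names are hypothetical, and the numeric values are made up for the example; the formulas assume a symmetric path, which is exactly the assumption a non-PTP node can break.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PtpExchange:
    t1: int  # master sends Sync (master clock), ns
    t2: int  # slave receives Sync (slave clock), ns
    t3: int  # slave sends Delay_Req (slave clock), ns
    t4: int  # master receives Delay_Req (master clock), ns


def mean_path_delay(x: PtpExchange) -> float:
    """One-way delay estimate, assuming both directions take equally long."""
    return ((x.t2 - x.t1) + (x.t4 - x.t3)) / 2


def offset_from_master(x: PtpExchange) -> float:
    """Estimated slave-minus-master clock offset under the same assumption."""
    return ((x.t2 - x.t1) - (x.t4 - x.t3)) / 2


if __name__ == "__main__":
    # Slave runs 500 ns ahead; path delay is 1000 ns in both directions.
    symmetric = PtpExchange(t1=0, t2=1500, t3=2000, t4=2500)
    print(offset_from_master(symmetric), mean_path_delay(symmetric))

    # Same clocks, but the reverse path takes 3000 ns (e.g. queuing in a
    # node that does not report its residence time).
    asymmetric = PtpExchange(t1=0, t2=1500, t3=2000, t4=4500)
    print(offset_from_master(asymmetric), mean_path_delay(asymmetric))
```

In the symmetric case the true 500 ns offset is recovered. In the asymmetric case the estimate is off by half of the 2000 ns asymmetry, which is the kind of error that remains when a node on the path cannot share its delay data.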
Another solution could be to introduce the time information of
GPS (Global Positioning System) satellites into the nodes.
This is not feasible, since rack cabinets in server rooms lack
line of sight to the satellites.
We can suppose that at least one machine at each monitoring
site can be synchronized to the master
clock of the network (e.g. through PTP or GPS). Nevertheless,
synchronizing all clocks within the site with nanosecond-range
precision is still a challenge.
This paper presents a solution for the time synchronization
issues of systems with FPGA-based monitoring probes. What
makes the FPGA a key player here is that hardware acceleration
removes the jitter of the operating system and the protocol stack
from the equation. The delay of handling time information
within an FPGA is constant, so it can be calculated precisely
and compensated for in the time-stamp.
In this paper we focus on the time synchronization challenges
of a monitoring site. The implemented solution is based
on the practical prerequisite that each site has a reference
clock available for the monitoring system. This paper suggests
an FPGA-based clock synchronization method for the
distributed monitoring equipment or, more precisely, its interface
cards.
DOI: 10.36244/ICJ.2018.1.1
Fig. 1. A generic architecture for distributed network monitoring
II. NETWORK MONITORING SUPPORTED BY FPGA-BASED
PROBES
A. The Generic Concept of Distributed Network Monitoring
The distributed network monitoring architecture depicted
by Figure 1 supports local, probe-based pre-processing
(time-stamping, requirement-based packet chunking, filtering
criteria-based distribution) and central, deep analysis (corre-
lation of messages and transactions, data record compilation,
statistics generation), even on-the-fly. The time-based ordering
and interleaving of messages are enabled by the hardware-
accelerated time-stamping, providing nanosecond-range reso-
lution with sub-microsecond precision. The information stored
locally at the distributed Monitoring Probes can be accessed
by client applications of the operator. Besides, the Monitoring
Probes send pre-digested data to the Servers for correlation
(creating e.g. Call Data Records, CDRs), as well as periodic
reports containing their calculated statistics [3].
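The time-based ordering and interleaving mentioned above can be pictured as a merge of per-probe record streams that are each already sorted by time-stamp. The sketch below is a simplified illustration, not the SGA-7N implementation; the `Record` type and the `interleave` function are hypothetical names, and it assumes that the probes' clocks are synchronized well enough for their time-stamps to be comparable.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp_ns: int  # hardware time-stamp taken at the probe
    probe: str         # which probe captured the message
    summary: str       # short description of the message


def interleave(*streams: Iterable[Record]) -> Iterator[Record]:
    """Merge per-probe streams (each sorted by time) into one ordered stream."""
    return heapq.merge(*streams, key=lambda r: r.timestamp_ns)


if __name__ == "__main__":
    probe_a = [Record(100, "A", "request"), Record(300, "A", "ack")]
    probe_b = [Record(200, "B", "response")]
    for rec in interleave(probe_a, probe_b):
        print(rec.timestamp_ns, rec.probe, rec.summary)
```

If the clocks of probes A and B were offset by more than the spacing of the messages, the merged order would no longer reflect the real message order, which is why precise synchronization is a prerequisite for correlation.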
Since user data and control data are often carried over
the same channels, their division requires message analysis
on network- or transaction-level (e.g., IP- or TCP-level). The
changing traffic patterns force the operators to look for new
tools to process even the user traffic. The first step towards this
is the compilation of XDRs (eXtended Data Records) based on
control- and user-plane messages and transactions. These often
contain message-level timestamps, as well. Based on these
data, the deep traffic analysis tools provide valuable informa-
tion towards business-intelligence and network optimization.
Besides, all nodes can be configured to report directly to the
NOC (Network Operations Center).
Operators use the network-wide, passive monitoring for
fault detection, service quality assurance, and resource plan-
ning, among others [4]. Besides lossless data capture, network
monitoring covers further functions, as well:
– precise time-stamping, ordering;
– compilation, search and fetch of Call Data Records
(CDRs) and Extended Data Records (XDRs);
– calculation and reporting of Key Performance Indicators,
KPIs;
– Call Tracing at various complexity levels;
– bit-wise message decoding for protocol analysis; etc.
All these functions are present in network monitoring
practice, since besides user-level data analysis, network analysis
is important from the connection level to the application level as
well.
System elements of the described generic architecture can
be implemented in many ways. In the SGA-7N system,
which serves as the base implementation for the presented
solution, the monitoring probes are called
“Monitors”. These consist of three main building blocks: a
high performance Field Programmable Gate Array (FPGA)-based
custom hardware platform, firmware dedicated to
network monitoring, and the probe software [5].
B. FPGA-based packet processing
There are many features that make FPGAs useful in packet
processing tasks [6]. The main concept itself allows parallel
processing of the input data. Different, simultaneous tasks
can be carried out in each clock cycle on the same data,
which in this case is the packet header [7], [8]. Besides,
the input word length is much greater for FPGAs (reaching
90 bytes) than for modern CPUs (64 bits). Furthermore,
FPGAs are configured using hardware description languages, and they are
indeed reconfigurable hardware: their internal wiring can be
changed within milliseconds. These features enable FPGA-based
hardware platforms to become high performance networking
devices, e.g., network monitors, switches, routers,
firewalls or intrusion detection systems [9]. As a
network monitoring system, such a platform supports distributed and lossless
packet-level monitoring of Ethernet links at 1 or 10 Gbps.
Besides providing sufficient resources for switching and
routing at 1 or 10 Gbps, the design of SGA-GPLANAR [10]
and SGA-10GED [11] used in SGA-7N addresses some special,
network monitoring-related requirements, namely:
– lossless packet capture,
– 64-bit time-stamping with sub-microsecond resolution,
– header-only capture: configurable depth of decoding,
– on-the-fly packet parsing by hardware [12],
– parameterized packet/flow generator for mass testing
[13],[14].
Various applications then require further
functionality. As an example, the high-speed monitoring application
[15] consists of the following sub-modules:
– time-stamping every frame upon reception;
– packet decoding from layer 2 up to the application layer;
– packet filtering with a reconfigurable rule-set to decide
what we do with a given packet;
– packet chunking: packets can be truncated depending on
the matching rule;
– packet distribution: to distribute packets by different
criteria: IP flows, fragment steering, steering based on
mobile core network parameters, etc.;
– packet encapsulation: monitoring information is stored in
a specified header format.
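The filtering, chunking, and distribution steps listed above can be sketched in software as follows. This is only an illustration of the data flow under stated assumptions: the `Packet` and `Rule` types, the first-match rule semantics, the default-drop behaviour, and the CRC-based flow hash are all hypothetical choices, and the real sub-modules run in FPGA logic rather than in Python.

```python
import zlib
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Packet:
    timestamp_ns: int  # assigned upon reception
    src_ip: str
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class Rule:
    matches: Callable[[Packet], bool]
    drop: bool = False                # filtering decision
    keep_bytes: Optional[int] = None  # chunking: None keeps the whole packet


def process(pkt: Packet, rules: Sequence[Rule],
            n_outputs: int) -> Optional[Tuple[int, Packet]]:
    """Apply the first matching rule, then pick an output by IP flow."""
    for rule in rules:
        if rule.matches(pkt):
            if rule.drop:
                return None
            if rule.keep_bytes is not None:
                pkt = replace(pkt, payload=pkt.payload[: rule.keep_bytes])
            break
    else:
        return None  # assumption: packets matching no rule are dropped
    # Direction-independent key, so both directions of a flow share an output.
    flow_key = "|".join(sorted((pkt.src_ip, pkt.dst_ip))).encode()
    return zlib.crc32(flow_key) % n_outputs, pkt


if __name__ == "__main__":
    rules = [Rule(lambda p: p.dst_ip == "10.0.0.9", drop=True),
             Rule(lambda p: True, keep_bytes=4)]
    print(process(Packet(1, "10.0.0.1", "10.0.0.2", b"abcdefgh"), rules, 4))
```

Sorting the address pair before hashing is one simple way to keep request and response packets of the same IP flow on the same output, so that a downstream analyzer sees both directions.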
These features and capabilities make the FPGA a suitable
enabler of hardware acceleration within the Monitors.
III. CHALLENGES AND REQUIREMENTS IN DETAIL
For the distributed monitoring solution described in the previous
sections, there is a strong requirement for a
monotonic clock. Otherwise, packets would appear reordered
even at a single monitoring node (whenever it changes its clock), and
this is not acceptable, since traffic analysis depends heavily
on packet timestamps. As a consequence, the need for
monotonic system time is inherent.
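One common way to keep a clock monotonic while still correcting it is to slew it, that is, to spread a correction over time by slightly changing the clock rate instead of stepping the time value. The sketch below is a generic illustration of that idea and is not the control loop of the presented system; the class name, the 500 ppm slew limit, and the integer nanosecond arithmetic are assumptions made for the example.

```python
class SlewedClock:
    """Local clock corrected by rate adjustment only, never by a step.

    As long as the slew limit is below 1,000,000 ppm, every tick advances
    the clock by a positive amount, so readings never go backwards.
    """

    def __init__(self, start_ns: int = 0, max_slew_ppm: int = 500) -> None:
        self.now_ns = start_ns
        self.max_slew_ppm = max_slew_ppm
        self.pending_ns = 0  # correction still to be applied

    def correct(self, offset_ns: int) -> None:
        """Request a correction; negative means the local clock is ahead."""
        self.pending_ns += offset_ns

    def tick(self, elapsed_ns: int) -> int:
        """Advance by elapsed_ns, applying a bounded share of the correction."""
        limit = elapsed_ns * self.max_slew_ppm // 1_000_000
        step = max(-limit, min(limit, self.pending_ns))
        self.pending_ns -= step
        self.now_ns += elapsed_ns + step
        return self.now_ns


if __name__ == "__main__":
    clock = SlewedClock()
    clock.correct(-1000)  # local clock found to be 1000 ns ahead
    print([clock.tick(1_000_000) for _ in range(3)])
```

A stepped correction of -1000 ns applied between two packets arriving less than 1000 ns apart would give the later packet the smaller time-stamp; with slewing, the same correction is absorbed over two milliseconds and the order is preserved.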
Another challenge comes from the fact that a distributed
monitoring system has its components geographically separated
from each other, therefore the clocks of the nodes have to be
frequency- and phase-synchronized to each other within some
given threshold. This problem has many solutions, e.g., using
GPS-based synchronization systems [16]. Although technically
this can work well [17], it requires additional
installation expenditure on an indoor site that has no installed
antenna system to carry the GPS signal inside the building,
and could also result in extensive cabling work. A convenient
alternative is network time synchronization, which uses
the telecommunication network to exchange packets according
to a designated protocol in order to achieve frequency and phase
synchronization. Examples of this are the Network Time Protocol
(NTP) [1] and the Precision Time Protocol (PTP) [2].
When speaking about time synchronization, the following
properties describe a clock, in line with the generic
definition of clock properties [18]:
• accuracy – i.e., how close the time information is
to some reference;
• precision – i.e., how precise a tick of the clock is compared
to some reference;
• stability – i.e., how the clock frequency changes, e.g.,
over time or with external temperature changes.
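As a rough illustration of how these three properties could be estimated from measurements, the sketch below processes a series of offset samples (local clock minus reference). The mapping is an assumption made for this example and not a definition taken from [18]: accuracy is taken as the mean offset, precision as the scatter around a fitted linear trend, and stability as the change in frequency error between the first and second half of the observation window. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def _fit(times_s: Sequence[float],
         offsets_ns: Sequence[float]) -> Tuple[float, List[float]]:
    """Least-squares line fit; returns (slope in ns/s, residuals in ns)."""
    t_mean, o_mean = mean(times_s), mean(offsets_ns)
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(times_s, offsets_ns))
    slope = sxy / sxx
    residuals = [o - (o_mean + slope * (t - t_mean))
                 for t, o in zip(times_s, offsets_ns)]
    return slope, residuals


def clock_metrics(times_s: Sequence[float],
                  offsets_ns: Sequence[float]) -> Dict[str, float]:
    """Estimate accuracy, precision and stability; needs at least 4 samples."""
    _, residuals = _fit(times_s, offsets_ns)
    half = len(times_s) // 2
    slope_first, _ = _fit(times_s[:half], offsets_ns[:half])
    slope_second, _ = _fit(times_s[half:], offsets_ns[half:])
    return {
        "accuracy_ns": mean(offsets_ns),       # mean offset from the reference
        "precision_ns": pstdev(residuals),     # scatter around the trend
        # 1 ns/s equals 1 part per billion of frequency error
        "stability_ppb": abs(slope_second - slope_first),
    }


if __name__ == "__main__":
    times = [0, 1, 2, 3, 4, 5]
    offsets = [100, 110, 120, 130, 140, 150]  # steady 10 ppb frequency error
    print(clock_metrics(times, offsets))
```

For the sample data the clock is, on average, 125 ns off the reference, shows no scatter around its trend, and has a constant frequency error, so the stability figure is zero even though the accuracy is not.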
The biggest challenge of all, as usual, is to adapt to the
existing monitoring framework described in Section II with minimal
modifications to the existing solution, while satisfying all the
precision- and accuracy-related requirements. As mentioned
before, the platform for the proof of concept is the SGA-7N
monitoring system, which utilizes FPGA-based monitoring
cards. These are capable of capturing traffic on high-speed network
interfaces with fine-grained time-stamping, and
they have their own, existing time-keeping facilities.
In order to tackle all the above-mentioned issues with a
solution fitting into the network monitoring architecture, we
propose a new FPGA-based card that implements
these functions:
• network time synchronization,
• local time synchronization,
• interfacing with the existing nodes – OAMP functions.
The following sections describe this solution, and show its
feasibility in the running monitoring system.
IV. ARCHITECTURE OF THE DISTRIBUTED TIME
SYNCHRONIZED MONITORING SYSTEM
A. Generic concept
To allow easy integration into the existing system,
and also taking FPGA resource usage into account, a hybrid
solution has been designed. This solution implements network
time synchronization in a standalone card that distributes the
digital timing information over a dedicated control bus, as
illustrated by Figure 2.
The synchronization framework provides a platform-
independent agent that can be integrated into the existing
FPGA cards’ top level VHDL (VHSIC Hardware Description
Language, [19]) modules, and is used through a well-defined
and portable interface.
The agent itself has low complexity, and as a result, the
solution does not waste CLB (Configurable Logic Block)
resources, as it would if the whole network synchronization stack
were instantiated N times, once on every monitoring node card.
Furthermore, this results in better internal synchronization
compared to the replicated stacks, since those can be skewed
relative to each other, within the bounds specified by their
protocol.
As shown by Figure 2, each node has its own network
synchronization function, therefore the accuracy and precision
Fig. 1. A generic architecture for distributed network monitoring
II. NETWORK MONITORING SUPPORTED BY FPGA-BASED
PROBES
A. The Generic Concept of Distributed Network Monitoring
The distributed network monitoring architecture depicted
by Figure 1 supports local, probe-based pre-processing
(time-stamping, requirement-based packet chunking, filtering
criteria-based distribution) and central, deep analysis (corre-
lation of messages and transactions, data record compilation,
statistics generation), even on-the-fly. The time-based ordering
and interleaving of messages are enabled by the hardware-
accelerated time-stamping, providing nanosecond-range reso-
lution with sub-microsecond precision. The information stored
locally at the distributed Monitoring Probes can be accessed
by client applications of the operator. Besides, the Monitoring
Probes send pre-digested data to the Servers for correlation
(creating e.g. Call Data Records, CDRs), as well as periodic
reports containing their calculated statistics [3].
Since user data and control data are often carried over
the same channels, their division requires message analysis
on network- or transaction-level (e.g., IP- or TCP-level). The
changing traffic patterns force the operators to look for new
tools to process even the user traffic. The first step towards this
is the compilation of XDRs (eXtended Data Records) based on
control- and user-plane messages and transactions. These often
contain message-level timestamps, as well. Based on these
data, the deep traffic analysis tools provide valuable informa-
tion towards business-intelligence and network optimization.
Besides, all nodes can be configured to report directly to the
NOC (Network Operations Center).
Operators use the network-wide, passive monitoring for
fault detection, service quality assurance, and resource plan-
ning, among others [4]. Besides lossless data capture, network
monitoring covers further functions, as well:
– precise time-stamping, ordering;
– compilation, search and fetch of Call Data Records
(CDRs) and Extended Data Records (XDRs);
– calculation and reporting of Key Performance Indicators,
KPIs;
– Call Tracing at various complexity levels;
– bit-wise message decoding for protocol analysis; etc.
All these functions are present in the network monitoring
practice, since beside user-level data analysis, network analysis
is important from connection-level to application-level, as
well.
System elements of the described generic architecture can
be implemented in many ways. In the SGA-7N system, which
serves as the base implementation for the presented
solution, the monitoring probes are called
“Monitors”. These consist of three main building blocks: a
high performance Field Programmable Gate Array (FPGA)-
based custom hardware platform, a firmware dedicated for
network monitoring, and the probe software [5].
B. FPGA-based packet processing
There are many features that make FPGAs useful in packet
processing tasks [6]. The main concept itself allows parallel
processing of the input data. Different, simultaneous tasks
can be carried out at each clock cycle on the same data,
Fig. 2. Fitting the time synchronization function into the generic, distributed
network monitoring concept
between two monitoring nodes can be guaranteed only to an
extent that the utilized time synchronization protocol provides.
Due to the uncompensated delay of routers, switches and
transmission paths, this is in the order of milliseconds
for a software implementation of NTP. The precision can
be improved by using FPGAs for hardware acceleration.
Depending on the PTP version and the underlying network
capabilities, it can reach the order of nanoseconds.
The main idea of the solution is to install a local time-
distribution bus between the nodes within a site. This allows
us to achieve nanosecond-range synchronicity, as there is
less perturbation between the hardware implementations of
the transmitting and receiving ends – no OS scheduler, no
network etc. Moreover, frequency synchronization can also be
easily achieved by implementing a synchronous bus – i.e.,
transmitting the clock signal along with the data.
B. External time synch. subsystem design and implementation
When selecting the candidate for implementing the external
time synchronization function, three protocols were consid-
ered:
• Network Time Protocol (NTP) [1],
• Precision Time Protocol v1 (PTPv1) [20],
• Precision Time Protocol v2 (PTPv2) [2].
In order to achieve the best synchronization between PTPv2
clocks, the protocol requires PTPv2-enabled switches/routers
throughout the network. These do the bookkeeping of the
processing delay values in the synchronization packets as
they traverse through the network. Without this feature, the
achievable synchronicity in a multi-hop network is around the
same as by using PTPv1.
Since PTPv2 is not widely available in current networks,
we narrowed the choice to NTP or PTPv1, due to their
simplicity. PTPv1 has considerably more modes of operation
than NTP. Still, the two protocols operate on semantically
the same principle when determining
the round trip time and the offset relative to a reference clock
entity. There are, however, significant differences in
their packet structure, time-stamp format and
epoch, which would result in a more complex implementation
if PTPv1 were chosen. The NTP time-stamp format
includes a 32-bit unsigned seconds field spanning 136 years
and a 32-bit fraction field resolving 232 picoseconds. The prime
epoch, or base date of era 0, is 0 h 1 January 1900 UTC,
i.e., when all bits are zero.
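The figures quoted for the NTP time-stamp format can be checked with a short Python sketch. The function name `ntp_to_datetime` and the constants are illustrative only and do not come from the presented implementation; the sketch handles era 0 only and is limited by the microsecond resolution of Python's `datetime`.

```python
from datetime import datetime, timedelta, timezone

# Prime epoch of NTP era 0, as stated in the text.
NTP_PRIME_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
FRACTION_BITS = 32


def ntp_to_datetime(seconds: int, fraction: int) -> datetime:
    """Convert a 32+32-bit NTP time-stamp (era 0) to a UTC datetime.

    Illustrative helper; datetime keeps only microseconds, so the
    sub-microsecond part of the fraction is lost in the conversion.
    """
    if not (0 <= seconds < 2**32 and 0 <= fraction < 2**32):
        raise ValueError("both fields must fit in 32 bits")
    return NTP_PRIME_EPOCH + timedelta(seconds=seconds + fraction / 2**FRACTION_BITS)


# One least significant bit of the fraction field, in picoseconds.
resolution_ps = 1e12 / 2**FRACTION_BITS
# Range covered by the 32-bit seconds field, in (Julian) years.
span_years = 2**32 / (365.25 * 24 * 3600)
```

Under these definitions, `resolution_ps` evaluates to roughly 232.8 and `span_years` to roughly 136.1, in line with the values quoted above.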
Based on the requirements, the above considerations, and
the Occam principle, NTP was selected
for synchronizing the FPGA-based
monitoring cards. The synchronization is performed through a card
called SGA-Clock, which implements both the external and the
internal (see Section IV-C) time synchronization functions.
Each FPGA-based packet processing and networking protocol
implementation has its own complexity. There are several
readily available implementations that can be used for packet
processing in FPGAs, with limited flexibility when it
comes to interconnecting them with other modules. The one
used for the current implementation is a flexible
solution for Protocol Implementations within FPGAs. The
solution detailed in [21] provides a generic framework in
VHDL that enables
rapid prototyping of networking protocols. Among other
things, it provides the following main features:
– supports protocol module interconnection via layering;
– handles reception and transmission of Protocol Data Units
(PDUs) with queuing;
– provides a high level interface for separating and combining
Protocol Control Information (PCI) and Service Data
Unit (SDU), forwarding, pausing or dropping SDUs;
– provides a unified way to handle Interface Control
Information (ICI), SDU, and PDU events (e.g., error
signalling) [22];
– adds support of auxiliary information that travels along
with messages;
– provides components for common tasks recurring during
implementing networking protocols (de/serialization, ar-
bitration etc.).
Fig. 3. Fundamental building block of the FPGA networking framework used
for the Protocol Implementation (block label: Protocol Under Implementation, PUI)
The framework’s basic building block (shown by Figure 3)
was used for implementing a pure FPGA-based UDP/IP proto-
col stack with ARP [23] on top of 802.3 Ethernet. It provides
a platform with deterministic timing for the likewise FPGA-
based implementation of NTP. For each of these protocols the
corresponding protocol-specific parts have been described in
VHDL, using the generic framework [21].
Fig. 4. NTP module block diagram of components (NTP Poller, NTP ClockFilter,
NTP Clock Discipline Module)
The internal structure of the NTP module is shown by
Figure 4. The NTP Poller component is responsible for the
NTP packet transmission and reception, and implementing
the On-Wire protocol for determining the offset – based on
the packet messages. The packet-handling part is also im-
plemented through the Protocol Implementations framework.
The NTP ClockFilter component is there to regulate the offset
values presented by the poller by ordering the results based on
delay, updating internal state variables, calculating jitter, and
suppressing spikes based on jitter and last successful test time.
If the offset data passes the filter stage, it is forwarded
for further processing by the NTP Discipline module.
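The Poller and ClockFilter roles described above can be illustrated with a small Python sketch. It uses the standard NTP on-wire formulas for offset and delay. The filter is a strongly simplified stand-in, loosely modelled on the NTP clock filter, that orders samples by delay and derives a jitter figure. The function names and the jitter definition are assumptions made for illustration, not the logic of the presented VHDL modules.

```python
def on_wire(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in the same unit).
    Returns (offset, delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def filter_samples(samples):
    """Simplified clock filter over a list of (offset, delay) samples.

    Orders the samples by delay, selects the lowest-delay one, and
    reports jitter as the RMS difference of the other offsets from
    the selected offset. Spike suppression is omitted in this sketch.
    """
    ordered = sorted(samples, key=lambda s: s[1])
    best_offset, best_delay = ordered[0]
    others = ordered[1:]
    if others:
        jitter = (sum((o - best_offset) ** 2 for o, _ in others) / len(others)) ** 0.5
    else:
        jitter = 0.0
    return best_offset, best_delay, jitter
```

For example, with time-stamps in microseconds, `on_wire(0, 10_500, 10_600, 1_100)` yields an offset of 10000 and a round-trip delay of 1000.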
The NTP Discipline module controls the clock module – by
adjusting the time increment – based on the filtered offset data.
The NTP clock module provides an interface for controlling
the time increment that itself is added to the clock register
in each system clock cycle – thus implementing the clock
functionality. The time information is fed back to each module,
as illustrated in Figure 4. This chain of modules with
feedback is another realization of the closed-loop control chain
described in the following section.
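The time-increment mechanism described above can be modelled in a few lines of Python. This is a behavioural sketch under stated assumptions: the class name `IncrementClock`, the 125 MHz system clock used in the example, the extra guard bits below the 32-bit fraction, and the purely proportional control law with gain `kp` are all hypothetical and are not taken from the presented design.

```python
FRACTION_BITS = 32  # width of the NTP fraction field (from the text)


class IncrementClock:
    """Clock register advanced by a tunable increment every system clock cycle."""

    def __init__(self, f_clk_hz: int, guard_bits: int = 16):
        # guard_bits (assumption): extra low-order bits that reduce the
        # rounding error of the per-cycle increment.
        self.scale = 1 << (FRACTION_BITS + guard_bits)
        self.nominal = round(self.scale / f_clk_hz)  # increment for nominal frequency
        self.increment = self.nominal
        self.register = 0

    def tick(self, cycles: int = 1) -> None:
        """Advance the clock register by the given number of clock cycles."""
        self.register += self.increment * cycles

    def seconds(self) -> float:
        """Current clock value in seconds."""
        return self.register / self.scale

    def discipline(self, offset_s: float, kp: float = 0.5) -> None:
        """Hypothetical proportional control: a positive offset (the
        reference is ahead of the local clock) speeds the clock up."""
        self.increment = self.nominal + round(kp * offset_s * self.nominal)


clock = IncrementClock(f_clk_hz=125_000_000)  # 125 MHz is an assumed value
clock.tick(125_000_000)                       # one second worth of cycles
elapsed = clock.seconds()                     # close to 1.0
```

Adjusting only the increment changes the clock rate smoothly, so the clock value never jumps backwards as long as the increment stays positive.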
C. Internal time synch. subsystem design and implementation
Since time-stamping is done by the monitoring interface
cards, the time synchronization information has to be spread
around all interface cards of all monitoring units within the
site. This time synchronization is an internal matter of the
monitoring system. The relationship between “external” and
“internal” time synchronization is shown by Figure 5.
The internal time synchronization function is
responsible for keeping all clocks in all monitoring functions
synchronized within a monitoring node. Since
this is an internal component, the amount of perturbation
that potentially affects this subsystem is considered minimal
compared to the external time synchronization subsystem.
The elements of this subsystem are:
• a digital bus that is able to transmit time and status information;
• a driver module of that bus that resides in the Network
clock synchronization function;
• receiver modules attached to that bus performing local
time synchronization.
Fig. 5. Time synchronization within a monitoring site – methods for external
and internal subsystems differ to allow high precision and accuracy in time-stamping
Fig. 6. High-Speed Time-stamp Interface frame format (rows of 32 bits, numbered
0 to 31: all-ones preamble, S, T, time/data payload, CRC-8)
Within each monitoring probe, the FPGA boards that
implement a monitoring function can operate from different
power supply units. As a consequence, ground level isolation
is necessary over the bus. To reduce the physical layer
complexity, a point-to-point bus system has been designed. To
maximize the number of clients that can be connected to the
bus, it uses asynchronous serial communication over 2
wires, which provides uni-directional communication; with this
system, bi-directional communication would require 4 wires.
The communication protocol executed by the driver module
(the internal time synch. distribution module in Figure 5)
multiplexes arbitrary data units and the time information over
the bus into frames – equipped with error detection code – in
an alternating pattern. That results in periodic transmission of
valid time information.
The parameters of the physical signalling are:
• LVCMOS33 (Low Voltage CMOS 3.3) levels for repre-
senting logical values;
• asymmetric signal transmission;
• 15.625 MHz clock frequency with 4x oversampling;
• NRZ line coding.
The frame format used on the bus is shown in Figure 6. The
Fig. 7. Internal clock module diagram
frame starts with an all 1’s preamble, and it is followed by a
start bit with value 0. The type field is used to differentiate
the payload types. When T=1 it indicates that the payload is
time information otherwise it is data – hence an overlay data
communication protocol can be used on this data channel. The
time format is in line with the external time synchronization,
i.e., it uses the NTP time format for representing the time
information. For detecting transmission errors on the bus, a
CRC-8 value is calculated for the ‘Type’ and ‘Payload’ fields
and appended to the frame that is checked on frame reception
for detecting transmission errors.
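As a rough sketch of the frame layout described above, the following builds one frame as a list of bits. Several details are assumptions rather than facts from the paper: the 16-bit preamble and 64-bit payload widths are inferred so that the total matches the 90 bits used in the frame time calculation, and the CRC-8 polynomial (0x07, zero initial value, MSB first) is a placeholder because the text does not name one.

```python
PREAMBLE_BITS = 16  # assumption: 90 total - 1 start - 1 type - 64 payload - 8 CRC
PAYLOAD_BITS = 64   # NTP timestamp: 32-bit seconds + 32-bit fraction
CRC_POLY = 0x07     # assumption: x^8 + x^2 + x + 1; the paper names no polynomial


def crc8_bits(bits, poly=CRC_POLY):
    """Bitwise CRC-8 (MSB first, zero initial value) over a sequence of 0/1 bits."""
    crc = 0
    for bit in bits:
        feedback = ((crc >> 7) & 1) ^ bit
        crc = (crc << 1) & 0xFF
        if feedback:
            crc ^= poly
    return crc


def int_to_bits(value, width):
    """Return value as a list of bits, most significant bit first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def build_frame(is_time, payload):
    """Assemble preamble, start bit, type bit, payload and CRC-8 into one frame."""
    type_bit = 1 if is_time else 0
    protected = [type_bit] + int_to_bits(payload, PAYLOAD_BITS)  # CRC covers these
    crc = crc8_bits(protected)
    return [1] * PREAMBLE_BITS + [0] + protected + int_to_bits(crc, 8)


# Hypothetical NTP-format timestamp: seconds in the upper 32 bits, fraction below.
ntp_timestamp = (3_730_000_000 << 32) | 0x80000000
frame = build_frame(True, ntp_timestamp)
```

Because the CRC is appended without a final XOR, running `crc8_bits` over the type bit, payload and CRC field of an undamaged frame yields zero, which gives the receiver a simple check.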
The frame time is
$T_{frame} = N_{bits} \times T_{bit}$ (1)
and
$T_{bit} = 1/f_{signalling}$ (2)
where $T_{bit}$ is the bit time, $f_{signalling}$ is the signalling fre-
quency on the internal bus, and $N_{bits}$ is the number of bits
in the frame. Evaluating (1) and (2) with the parameters given
above, the frame time is 90/15.625 MHz = 5.76 µs. Since
every other frame carries time information, the clock on
the receiver side can be disciplined every 11.52 µs.
Over such a short period, even low-quality, non-
temperature-controlled crystal oscillators have negligible drift.
As a result, this update period is adequate for the network
monitoring use case.
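The arithmetic behind these figures can be checked with a short sketch. The frame length and signalling frequency come from the text; the 20 ppm relative frequency offset is a purely hypothetical value chosen to illustrate the order of magnitude, not a figure from the paper.

```python
F_SIGNALLING_HZ = 15.625e6  # signalling frequency on the internal bus
N_BITS = 90                 # number of bits in one frame

t_bit = 1.0 / F_SIGNALLING_HZ   # equation (2)
t_frame = N_BITS * t_bit        # equation (1)
sync_period = 2 * t_frame       # every other frame carries time information


def drift_over_period(offset_ppm, period_s):
    """Time error accumulated over period_s at a relative frequency offset."""
    return offset_ppm * 1e-6 * period_s


ASSUMED_OFFSET_PPM = 20.0  # hypothetical master-to-client frequency offset
drift_s = drift_over_period(ASSUMED_OFFSET_PPM, sync_period)

print(f"frame time:  {t_frame * 1e6:.2f} us")
print(f"sync period: {sync_period * 1e6:.2f} us")
print(f"drift at {ASSUMED_OFFSET_PPM:.0f} ppm: {drift_s * 1e9:.4f} ns")
```

Under this assumed offset the accumulated error between two time updates stays well below one nanosecond; a larger offset scales the result linearly.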
The receiver module (i.e., in the FPGA-based Clock Card
in Figure 5) de-multiplexes the data and the time information
from the payload of the received frame. It also verifies that
the received frame’s CRC-8 value matches the calculated one.
If no errors are detected, it feeds the time information
into a module that performs time synchronization by executing
the pseudo-code in Algorithm 1.
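The receive path described above can be sketched as a small frame parser. As before, the field widths (16-bit preamble, 64-bit payload) and the CRC-8 polynomial 0x07 are assumptions made for illustration, and `parse_frame` is a hypothetical name; the actual receiver is implemented in FPGA logic rather than software.

```python
PREAMBLE_BITS = 16  # assumed preamble length
PAYLOAD_BITS = 64   # assumed payload width (NTP timestamp size)
CRC_POLY = 0x07     # assumed polynomial; the paper does not name one
FRAME_BITS = PREAMBLE_BITS + 1 + 1 + PAYLOAD_BITS + 8


def crc8_bits(bits, poly=CRC_POLY):
    """Bitwise CRC-8 (MSB first, zero initial value) over a sequence of 0/1 bits."""
    crc = 0
    for bit in bits:
        feedback = ((crc >> 7) & 1) ^ bit
        crc = (crc << 1) & 0xFF
        if feedback:
            crc ^= poly
    return crc


def bits_to_int(bits):
    """Interpret a list of bits (most significant first) as an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def parse_frame(bits):
    """Return ('time' | 'data', payload) for a valid frame, else None."""
    if len(bits) != FRAME_BITS:
        return None
    if any(b != 1 for b in bits[:PREAMBLE_BITS]) or bits[PREAMBLE_BITS] != 0:
        return None  # preamble or start bit is wrong
    body = bits[PREAMBLE_BITS + 1:]
    protected, crc_field = body[:-8], body[-8:]
    if crc8_bits(protected) != bits_to_int(crc_field):
        return None  # transmission error detected, frame is dropped
    kind = "time" if protected[0] == 1 else "data"
    return kind, bits_to_int(protected[1:])
```

Only frames for which `parse_frame` returns a `'time'` result would be passed on to the clock discipline step; a damaged frame is simply skipped and the local clock keeps free-running until the next valid time frame.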
The client clock module, shown in Figure 7, increments
a clock counter in each system clock period by an increment
value that is derived from the nominal clock frequency and
expressed in the internal time representation. The clock
module frequency can be adjusted by modifying the time
increment itself. The $Delay_{static}$ constant can be measured for
a given configuration and adjusted accordingly. The algorithm
is illustrated by a sample waveform of the master and slave
entity in Figure 8. Informally, if the skew accumulated during
one synchronization period (i.e., between two transmissions of
valid time by the master entity) is less than the desired
precision, then the phase of time progresses in sync on the
two entities.
Algorithm 1 Receiver local time discipline algorithm
Increment ⇐ 1/fclk
Delaystatic ⇐ x
Tlocal ⇐ 0
for each rising edge of system clock do
  if received valid time-stamp from Master then
    Tlocal ⇐ Trecv + Increment + Delaystatic
  else
    Tlocal ⇐ Tlocal + Increment
  end if
end for
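A software model of the receiver discipline loop may help to follow the behaviour of Algorithm 1. This is a sketch only: time is kept in integer nanoseconds, and the 8 ns increment (a 125 MHz system clock) and the 24 ns static delay are hypothetical values, as is the function name `run_receiver_clock`.

```python
def run_receiver_clock(n_cycles, increment, delay_static, timestamps):
    """Simulate the local clock for n_cycles system clock edges.

    timestamps maps a cycle index to the time value T_recv received from
    the master on that edge; all other edges just advance the local clock.
    """
    t_local = 0
    history = []
    for cycle in range(n_cycles):
        t_recv = timestamps.get(cycle)
        if t_recv is not None:
            # Valid time-stamp received: overwrite the local time.
            t_local = t_recv + increment + delay_static
        else:
            # No update on this edge: free-run at the nominal rate.
            t_local = t_local + increment
        history.append(t_local)
    return history


INCREMENT_NS = 8      # hypothetical: 1/f_clk for a 125 MHz system clock
DELAY_STATIC_NS = 24  # hypothetical measured static delay

history = run_receiver_clock(5, INCREMENT_NS, DELAY_STATIC_NS, {2: 1000})
print(history)  # [8, 16, 1032, 1040, 1048]
```

The jump at the third edge shows the local clock being overwritten by the received time plus the compensation terms, after which it free-runs again until the next valid time frame.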
Fig. 8. Illustration of timing on the internal synchronization bus
To show that this system can achieve nanosecond synchro-
nization, let us walk through the algorithm: let $T_n$ be the nth time
point where synchronization occurs between the master and
the slave entity. At time point $T_n$, when the client receives a
time-stamp, the local clock will be in sync with the master
clock, given that the $Delay_{static}$ constant was determined
correctly. De-synchronization arises due to errors in the master
and client clock oscillator frequencies. To ensure that the desired
level of synchronization is reached, it has to be shown that
the master and client clocks do not diverge by more than
one nanosecond during the time interval $(T_n, T_{n+1})$, since by
definition at $T_{n+1}$ the clocks will be in sync again and
this process is periodic. Given a worst case calculation the
In equation 3, ε stands for the precision of the oscillator, fclk
is the system clock frequency, and Tts is the time-stamp frame
time on the internal synchronization bus. In theory, equation 3
can be satisfied for arbitrary ε if Tts can be adjusted freely.
Substituting the parameters of the concrete implementation,
Tts = 2Tframe = 11.52 µs and ε = 50 ppm, results in
1.15 ns ≈ 1 ns, which approximately satisfies the condition.
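The quoted figure can be reproduced with a short worked calculation. The sketch below does not reproduce equation 3 itself; it assumes the worst case in which the master and client oscillators are each off by ε in opposite directions, so the clocks drift apart at a relative rate of 2ε. That assumption matches the 1.15 ns quoted above, and the function names are illustrative.

```python
def worst_case_divergence(eps: float, t_ts: float) -> float:
    """Worst-case divergence (seconds) over one synchronization period.

    Assumes master and client each deviate by eps in opposite directions.
    """
    return 2.0 * eps * t_ts


def max_sync_period(eps: float, target: float) -> float:
    """Longest synchronization period that keeps the divergence within target."""
    return target / (2.0 * eps)


EPS = 50e-6      # 50 ppm oscillator precision
T_TS = 11.52e-6  # time-stamp frame time, 2 * T_frame

divergence = worst_case_divergence(EPS, T_TS)
print(f"divergence = {divergence * 1e9:.3f} ns")              # 1.152 ns
print(f"max period = {max_sync_period(EPS, 1e-9) * 1e6:.1f} us")  # 10.0 us
```

Under the same assumption, a synchronization period of at most 10 µs would keep the divergence strictly within 1 ns for a 50 ppm oscillator, which illustrates why the condition can be met for any ε when Tts is adjustable.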
It is important to note that this accuracy and precision is
achieved only over the internal time synchronization bus. If
the external subsystem, which is completely orthogonal to
the internal subsystem, synchronizes to its reference with
µs accuracy, then the monitoring probes are also only
µs-accurate relative to the external reference. Even so,
synchronicity between monitoring probes inside the same
monitoring node, driven by the same master, remains at the
ns level.
D. Implementation
The realized system with all internal components is shown
in Figure 9. The external time synchronization, presented in
Section IV-B, is done by the SGA Clock card, visible in the
bottom right part. Similarly, the internal time
synchronization, described in Section IV-C, is performed
over the local bus by agent modules. These modules run in
all the FPGA-based monitoring cards, acting as slaves at the
high-speed time-stamp interfaces; all are driven by the SGA
Clock card acting as master.
Fig. 9. The realized system with all internal components
V. VERIFICATION & RESULTS
Extensive testing and measurements were carried out to
verify the solution. To analyze the degree of synchronization
to the master NTP clock, a packet capturer was installed on
the Ethernet segment to which the FPGA implementation of
the NTP slave was connected. The NTP packets used for
synchronization were captured in both directions. The capture
was then filtered for those NTP packets that carried all four
timestamps used by the On-Wire protocol to calculate the
offset from the reference clock. Our dedicated
post-processing utility then extracted the offset information
along with the time elapsed since the start of the measurement,
which is defined by the first NTP packet in the
packet capture.
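For reference, the offset derived from the four on-wire timestamps follows the standard NTP formulas. The sketch below is not the authors' post-processing utility; it only shows the calculation such a utility would perform per packet exchange, and the timestamps in the usage example are made up (integer microseconds).

```python
def ntp_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Clock offset of the server relative to the client.

    t1: client transmit (origin), t2: server receive,
    t3: server transmit, t4: client receive (destination).
    """
    return ((t2 - t1) + (t3 - t4)) / 2


def ntp_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Round-trip delay, excluding the server's processing time."""
    return (t4 - t1) - (t3 - t2)


# Made-up exchange, all values in microseconds.
t1, t2, t3, t4 = 0, 520, 530, 1020
print(ntp_offset(t1, t2, t3, t4))  # 15.0
print(ntp_delay(t1, t2, t3, t4))   # 1010
```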
A sample packet capture is shown in Figure 10. The
statistical parameters of the device, such as the clock drift
and the real offset, were determined by fitting a straight line
to the offset values. Various measurements were carried out;
for the measurement presented in this paper, the capture
ran for approximately 3 hours. The results and the fitted
line can be seen in Figure 11.
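The line fit described above can be sketched with ordinary least squares. How the authors' utility performs the fit is not described, so this is an assumption; the data points are synthetic and serve only to show that the slope corresponds to the drift and the intercept to the fixed offset.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data: elapsed time in seconds, offset in microseconds.
elapsed = [0, 60, 120, 180, 240]
offsets = [14.5 + 0.001 * t for t in elapsed]

drift, fixed_offset = fit_line(elapsed, offsets)
print(f"drift  = {drift:.4f} us/s")
print(f"offset = {fixed_offset:.2f} us")
```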
The line fitted to this measurement shows a fixed
14.52 µs offset compared to the reference clock. As
presented in Section III, having a precise and stable clock
with a known offset is as good as having an accurate one.
In addition, the first-order stability of the device is −0.7 1/ns. This
precision and stability are considered adequate for
the requirements of the external time synchronization part.
As for the internal synchronization part, by design it
has accuracy and precision on the order of 1 ns; see
Section IV-C for details.























In equation 3 ε stands for the precision of the oscillator, fclk
is the system clock frequency and Tts is the time-stamp frame
time on the internal synchronization bus. In theory equation 3
can be satisfied for arbitrary ε if Tts can be adjusted freely.
Substituting parameters from the concrete implementation
with Tts = 2Tframe = 11.52 µs and ε = 50 ppm results in
1.15ns  1ns which is approximately satisfying.
It is important to note that this accuracy and precision is
only achieved over the internal time synchronization bus. If
the external subsystem – that is completely orthogonal to
the internal subsystem – synchronizes to its reference with
µs accuracy then this results in the same accuracy for the
monitoring probe vs. external reference relation. Even though
the synchronicity will be still at the ns level in the monitoring
probe vs. monitoring probe relation inside the same monitoring
node driven by the same master.
D. Implementation
The realized system with all internal components is shown
by Figure 9, where the external time synchronization – as
presented in Section IV-B – is done by the SGA Clock card
– visible in the bottom right part. Similarly, the internal time
synchronization – described in Section IV-C – is performed
over the local bus with agent modules. These modules run in
all the FPGA-based monitoring cards acting as slaves at the
high-speed time-stamp interfaces; all are driven by the SGA
Clock card acting as a master.
Fig. 9. The realized system with all internal components
V. VERIFICATION & RESULTS
There has been extensive testing and measurements carried
out for verifying the solution. In order to analyze the degree
of synchronization to the master NTP clock, a packet capturer
was installed on the Ethernet segment at which the FPGA
implementation of the NTP slave was connected. The NTP
packets used for synchronization were captured bidirectionally.
This packet capture then was filtered for those NTP packets
that had all 4 timestamps used in the On-Wire protocol to cal-
culate the offset from the reference clock value. Our dedicated
post-processing utility then extracted the offset information
along with the time elapsed from the start of measurement –
which is determined by the first NTP packet present in the
packet capture.
A sample packet capture is shown by Figure 10. The
statistical parameters – like the clock drift and real offset –
of the device was determined by fitting a linear curve on the
offset values. There were various measurements carried out –
for the actual measurement presented in this paper, the capture
was taken for approximately 3 hours. The results and the fitted
curve plot can be seen on Figure 11.
The curve fitted on this measurement shows that there was
a fix 14.52 µs offset compared to the reference clock. As
presented in Section III having a precise and stable clock –
with known offset – is as good as having an accurate one. Be-
sides, the first order stability of the device is −0.7 1/ns. This
precision and stability are considered adequate for satisfying
the requirements of the external time synchronization part.
As for the internal synchronization part, it is by design
has accuracy and precision in the magnitude of 1 ns – see
Section IV-C for details.
Fig. 7. Internal clock module diagram
frame starts with an all 1’s preamble, and it is followed by a
start bit with value 0. The type field is used to differentiate
the payload types. When T=1 it indicates that the payload is
time information otherwise it is data – hence an overlay data
communication protocol can be used on this data channel. The
time format is in line with the external time synchronization,
i.e., it uses the NTP time format for representing the time
infor ation. For detecting trans ission errors on the bus, a
CRC-8 value is calculated for the ‘Type’ and ‘Payload’ fields
and appended to the fra e that is checked on fra e reception
for detecting trans ission errors.
fra e bits bit (1)
and
bit sig alli g ( )
here Tbit is the bit ti , fsignalling is t si lli fr -
quency on the internal s, bits i t r f it
in the fra e. alculati ( ) ( ) it t i
para eters, the fra e ti e is / . . . i
every other fra e carries ti i f ,
the receiver side can be is i li / t
basis. nder such short eri f ti
te perature controlled cr st l s ill t
As a result, this update eri i
monitoring use case.
The receiver odule (i.e., i t
in Figure 5) de- ultiple es t t
fro the payload of the re i fr .
the received fra e’s - l t
If no e rors ere detecte t it f
into a odule that perf r s ti
the pseudo-code in alg rit
The client clock l
incre enting a clock c t r it
co responding to the n i l l i t i t l
ti e representation – in eac s t l i . l
module frequency can be a j st t r if i t ti
incre ent itself. he el st tic st t s r f r
a given configuration a a j st r i l . l rit
is illustrated by a sa le a ef r f t e aster a sla e
entity in Figure 8. Infor ally if the ske is less than the
desired precision under one synchronization period – i.e. hen




for Each rising edge of system clock do
if Received valid time-stamp from aster then
Tlocal Trecv + Incre ent+ elaystatic
else
Tlocal Tlocal Incre ent
end if
end for
t e ali ti e is tra s itte t e aster entity – then the
s f ti r r ss s i s t e t o entities.
ti i t e i ternal synchronization bus
i t a e nanosecond synchro-
t t l rit : let n be the nth ti e
i ti rs et een the aster and
i t en the client receives a
l ill e i sync ith the aster
l i t t t l st tic c stant as deter ined
tl . i ti ris s e to errors in the aster
li t l ill t r fr . ensure that the desired
l l f s r i ti is r it has to be sho n that
t st r li t l l ot diverge ore than
e a sec s er ti e i ter al ( n, n 1) – since by
de nition at n 1 the clocks ill be in sync again and
this process is periodic. iven a orst case calculation the
Fig. 7. Internal clock module diagram
frame starts with an all 1’s preamble, and it is followed by a
start bit with value 0. The type field is used to differentiate
the payload types. When T=1 it indicates that the payload is
time information otherwise it is data – hence an overlay data
communication protocol can be used on this data channel. The
time format is in line with the external time synchronization,
i.e., it uses the NTP time format for representing the time
information. For detecting transmission errors on the bus, a
CRC-8 value is calculated for the ‘Type’ and ‘Payl ad’ fields
and appended to the frame that is checked on frame r ception
for detecting transmission errors.
Tframe = Nbits × Tbit (1)
and
Tbit = 1/fsignalling (2)
Wher Tbit is t e bit time, fsignalling is the signalling fre-
quency on the i ternal bus, and Nbits is the number of bits
in the frame. Calculating (1) and (2) with the above given
parameters, the frame time is 90/15.625MHz = 5.76 µs. Since
every other frame carries time information, the clock on
the receiver side can be disciplined/controlled on a 11.52 µs
basis. Under such short period of time even low quality, non-
temperature controlled crystal oscillators have negligible drift.
As a r sult, th update period s dequate for the network
mo itoring us cas .
T e receiver module (i.e., in the FPGA-based Clock Card
in Figure 5) de-multiplexes the data and the time information
from the paylo d of the received frame. It also verifies that
the received frame’s CRC-8 value matches the calculated one.
If no errors were detected then it feeds the time information
into a module that performs time synchronization executing
the pseudo-code in algorithm 1
The client clock module – as shown in Figure 7 – increments
a clock counter in each system clock period with an increment
value corresponding to the nominal clock frequency in the
internal time representation. The clock module frequency can
be adjusted by modifying the time increment itself. The
Delaystatic constant can be measured for a given configuration
and adjusted accordingly. The algorithm is illustrated by a
sample waveform of the master and slave entity in Figure 8.
Informally, if the skew is less than the desired precision under
one synchronization period – i.e., when the valid time is
transmitted by the master entity – then the phase of time
progresses in sync on the two entities.
Algorithm 1
for each rising edge of system clock do
    if received valid time-stamp from Master then
        Tlocal ⇐ Trecv + Increment + Delaystatic
    else
        Tlocal ⇐ Tlocal + Increment
    end if
end for
Fig. 8. Illustration of timing on the internal synchronization bus
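The behaviour of Algorithm 1 can be mimicked with a small software model. This is a sketch, not the FPGA implementation: the class name, the integer picosecond time unit, and the example values for the increment and the static delay are all hypothetical.

```python
from typing import Optional


class ClientClock:
    """Software model of the client clock counter driven by Algorithm 1."""

    def __init__(self, increment: int, delay_static: int, t_local: int = 0) -> None:
        self.increment = increment        # time added per system clock period
        self.delay_static = delay_static  # measured, constant bus/processing delay
        self.t_local = t_local

    def tick(self, t_recv: Optional[int] = None) -> int:
        """Process one rising edge of the system clock.

        `t_recv` is the time-stamp received from the master on this edge,
        or None if no valid time-stamp arrived.
        """
        if t_recv is not None:
            # Re-synchronize: adopt the master time, compensated for the delay.
            self.t_local = t_recv + self.increment + self.delay_static
        else:
            # Free-run on the local oscillator.
            self.t_local += self.increment
        return self.t_local


# Usage with hypothetical values in picoseconds.
clock = ClientClock(increment=8_000, delay_static=16_000, t_local=123)
free_run = clock.tick()                   # no time-stamp: local time advances
synced = clock.tick(t_recv=1_000_000)     # valid time-stamp: local time is replaced
after_sync = clock.tick()                 # free-run continues from the synced value
print(free_run, synced, after_sync)
```

Changing `increment` in this model corresponds to adjusting the clock module frequency, and `delay_static` plays the role of the Delaystatic constant.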
To see that this system can achieve nanosecond-level synchro-
nization, let us walk through the algorithm: let Tn be the nth time
point where synchronization occurs between the master and
the slave entity. At time point Tn, when the client receives a
time-stamp, the local clock will be in sync with the master
clock – given that the Delaystatic constant was determined
correctly. De-synchronization arises due to errors in the master
and client clock oscillator frequency. To ensure that the desired
level of synchronization is reached, it has to be shown that
the master and client clock would not diverge more than
one nanosecond under time interval (Tn, Tn+1) – since by
definition at Tn+1 the clocks will be in sync again and
this process is periodic. Given a worst case calculation, this
requirement is expressed by equation 3.
In equation 3, ε stands for the precision of the oscillator, fclk
is the system clock frequency and Tts is the time-stamp frame
time on the internal synchronization bus. In theory, equation 3
can be satisfied for arbitrary ε if Tts can be adjusted freely.
Substituting parameters from the concrete implementation
with Tts = 2Tframe = 11.52 µs and ε = 50 ppm results in
1.15 ns ≈ 1 ns, which approximately satisfies the requirement.
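The quoted figure can be reproduced with a simple first-order bound. This is a sketch under an assumption that is not stated explicitly in the text: the master and the client oscillators each deviate from the nominal frequency by at most ε, in opposite directions, so their relative frequency error is at most 2ε. It is not necessarily the form of equation 3, which also involves fclk.

```latex
\Delta T_{\max} \approx 2\,\varepsilon\,T_{ts}
  = 2 \times 50 \times 10^{-6} \times 11.52\,\mu\mathrm{s}
  \approx 1.15\,\mathrm{ns}
```

Under this model the bound scales linearly with Tts, which is consistent with the remark that the requirement can be met for arbitrary ε if Tts can be chosen freely.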
It is important to note that this accuracy and precision is
only achieved over the internal time synchronization bus. If
the external subsystem – that is completely orthogonal to
the internal subsystem – synchronizes to its reference with
µs accuracy then this results in the same accuracy for the
monitoring probe vs. external reference relation. Even so,
the synchronicity will still be at the ns level in the monitoring
probe vs. monitoring probe relation inside the same monitoring
node driven by the same master.
D. Implementation
The realized system with all internal components is shown
in Figure 9, where the external time synchronization – as
presented in Section IV-B – is done by the SGA Clock card
– visible in the bottom right part. Similarly, the internal time
synchronization – described in Section IV-C – is performed
over the local bus with agent modules. These modules run in
all the FPGA-based monitoring cards acting as slaves at the
high-speed time-stamp interfaces; all are driven by the SGA
Clock card acting as a master.
Fig. 9. The realized system with all internal components
V. VERIFICATION & RESULTS
Extensive testing and measurements were carried
out to verify the solution. In order to analyze the degree
of synchronization to the master NTP clock, a packet capturer
was installed on the Ethernet segment to which the FPGA
implementation of the NTP slave was connected. The NTP
packets used for synchronization were captured bidirectionally.
This packet capture was then filtered for those NTP packets
that had all 4 timestamps used in the On-Wire protocol to cal-
culate the offset from the reference clock value. Our dedicated
post-processing utility then extracted the offset information
along with the time elapsed from the start of measurement –
which is determined by the first NTP packet present in the
packet capture.
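The offset extraction step can be sketched as follows. The formulas are the standard NTP on-wire calculations from four timestamps; the function names and the sample values are illustrative only and are not taken from the authors' post-processing utility or from the measurement.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float, float, float]  # (t1, t2, t3, t4) in seconds


def ntp_to_seconds(ts: int) -> float:
    """Convert a 64-bit NTP timestamp (32-bit seconds, 32-bit fraction) to seconds."""
    return (ts >> 32) + (ts & 0xFFFFFFFF) / 2**32


def offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """NTP on-wire calculation.

    t1: client transmit, t2: server receive, t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def complete_samples(samples: Iterable[Sample]) -> List[Sample]:
    """Keep only samples where all four timestamps are set (zero means unset)."""
    return [s for s in samples if all(t != 0 for t in s)]


# Usage with made-up timestamps (seconds); the second sample is incomplete.
samples = [
    (10.000000, 10.000120, 10.000130, 10.000220),
    (11.000000, 0.0, 11.000130, 11.000220),
]
usable = complete_samples(samples)
offset, delay = offset_and_delay(*usable[0])
print(f"offset = {offset * 1e6:.1f} us, delay = {delay * 1e6:.1f} us")
```

Each usable sample yields one offset value, which is then paired with the time elapsed since the first captured NTP packet.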
A sample packet capture is shown in Figure 10. The
statistical parameters – like the clock drift and real offset –
of the device were determined by fitting a linear curve on the
offset values. Various measurements were carried out;
for the actual measurement presented in this paper, the capture
was taken for approximately 3 hours. The results and the fitted
curve plot can be seen in Figure 11.
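The linear fit can be illustrated with an ordinary least-squares fit written in plain Python. The data below is synthetic and exists only to show the mechanics; it is not the measurement data behind Figure 11, and the text does not say which fitting tool was used.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic example: elapsed time in seconds, offset in microseconds.
elapsed_s = [0.0, 60.0, 120.0, 180.0, 240.0]
offset_us = [14.5 + 0.001 * t for t in elapsed_s]

drift_us_per_s, fixed_offset_us = fit_line(elapsed_s, offset_us)
print(f"drift = {drift_us_per_s:.4f} us/s, fixed offset = {fixed_offset_us:.2f} us")
```

In this reading the intercept of the fitted line corresponds to the fixed offset and the slope to the clock drift.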
The curve fitted on this measurement shows that there was
a fixed 14.52 µs offset compared to the reference clock. As
presented in Section III, having a precise and stable clock –
with known offset – is as good as having an accurate one. Be-
sides, the first order stability of the device is −0.7 1/ns. This
precision and stability are considered adequate for satisfying
the requirements of the external time synchronization part.
As for the internal synchronization part, by design it
has accuracy and precision in the magnitude of 1 ns – see
Section IV-C for details.
Fig. 10. Measurement result – a typical example of an NTP packet containing four timestamps
Fig. 11. Measurement result – plot of derived offset-measurement data
The ∼1 ns accuracy over the internal synchronization bus
satisfies the criteria of any monitoring system with 1, 10 or
even 100 Gbit/s Ethernet, since the packet inter-arrival time even
in the latter case is 6.72 ns [24] – almost one order of magnitude
greater than the theoretical accuracy of the presented, implemented
and verified time synchronization method.
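The 6.72 ns figure follows from the smallest footprint of an Ethernet frame on the wire: a 64-byte minimum frame plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap, i.e., 672 bit times. A short sketch of the calculation, with illustrative names:

```python
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTER_FRAME_GAP_BYTES = 12  # minimum gap between frames


def min_inter_arrival_ns(line_rate_bps: float) -> float:
    """Shortest possible time between the starts of two back-to-back frames."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTER_FRAME_GAP_BYTES) * 8
    return bits_on_wire / line_rate_bps * 1e9


for rate_gbps in (1, 10, 100):
    t_ns = min_inter_arrival_ns(rate_gbps * 1e9)
    print(f"{rate_gbps:>3} Gbit/s: {t_ns:.2f} ns")
```

At 1 and 10 Gbit/s the same calculation gives 672 ns and 67.2 ns, so the margin relative to a ∼1 ns synchronization accuracy is smallest in the 100 Gbit/s case.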
As a real-life verification, the above described time-
synchronization system has been put into operation at Magyar
Telekom. The system hardware (with its FPGA firmware) has
been installed beside the SGA-7N network monitoring system,
and showed the expected result. The system has provided accurate
time information to the monitoring cards ever since, and it is
planned to be expanded to cover all related monitoring
cards, network-wide.
VI. CONCLUSION
In this paper, we introduced a general time synchronization
solution for a high performance, lossless network monitoring
system called SGA-7N that is based on a reconfigurable archi-
tecture. The probes of the system are called “Monitors”, each of
which consists of three main building blocks: a high performance
Field Programmable Gate Array (FPGA)-based custom hard-
ware platform, a firmware dedicated for network monitoring,
and the probe software. The reconfigurable property of the
FPGA chip makes it possible to turn the Monitor hardware platform
into a high performance networking device – among others, a
network monitoring probe. Beside supporting distributed and
lossless packet level monitoring of Ethernet links for 1 or 10
Gbps of the described system, the FPGA serves as the base
platform of the time synchronization solution for the interface
cards of the Monitors.
Time synchronization of the network monitoring nodes is
crucial, since the analysis depends highly on the proper mes-
sage sequence, which is determined mainly by the timestamps.
First, each monitoring site has to have a reference clock
that is synchronized with other reference clocks at other sites.
Naturally, the monitoring system has to be synchronized to
the reference clock available at the physical site. In this
paper we call this external time synchronization, and it is
solved by an FPGA-based NTP implementation that uses
data filtering and has a clock discipline module in order to
output monotonic clock information. This avoids timestamps
jumping backwards, or jumping forward too much within one
step, hence the clock of the monitoring system becomes well-
regulated. Each interface at the monitoring node has to get
synchronized with this clock information. In this paper we call
this internal time synchronization, and it is implemented through
a proprietary time-synchronization protocol. Its sender (or
master) part works in a distributor-card residing at the main
reference clock machine of the monitoring system, whereas
the receiver (or slave) parts are realized within the FPGA of
the monitoring cards.
As presented in the paper, the internal synchronization shows
nanosecond-level accuracy and stability, meeting the requirements
of 10 Gbps, or even 100 Gbps Ethernet-based packet monitor-
ing. The presented solution is already installed in the network-
wide, real-life monitoring system at Magyar Telekom.
REFERENCES
[1] D. L. Mills, Computer Network Time Synchronization: The Network Time
Protocol. Boca Raton, FL: CRC Press, 2006.
[2] IEEE Standard for a Precision Clock Synchronization Protocol for
Networked Measurement and Control Systems (PTPv2), IEEE Standard
1588-2008.
[3] P. Tatai, G. Marosi, and L. Osvath, “A flexible approach to mo-
bile telephone traffic mass measurement and analysis,” in 18th IEEE
Instrumentation and Measurement Technology Conference, Budapest,
Hungary, May 2001.
[4] D. Kozma, G. Soos, and P. Varga, “Supporting LTE Network and Service
Management through Session Data Record Analysis,” Infocommunica-
tions Journal, vol. 71, no. 2, pp. 11–16, June 2016.
[5] AITIA, “SGA-NETMON The GSM/GPRS/UMTS/LTE Network
Monitoring System,” White Paper, 2012. [Online]. Available:
http://sga.aitia.ai/pdfs/SGA-NetMon.pdf
[6] J. W. Lockwood, N. McKeown, G. Watson, G. Gibb, P. Hartke, J. Naous,
R. Raghuraman, and J. Luo, “NetFPGA - An Open Platform for Gigabit-
rate Network Switching and Routing,” in IEEE MSE, San Diego, CA,
US, 2007.
[7] N. Possley, “Traffic Management in Xilinx FPGAs,” White
Paper, 2006. [Online]. Available: http://www.xilinx.com/support/
documentation/white papers/wp244.pdf
[8] W. Jiang and V. K. Prasanna, “Scalable Packet Classification on FPGA,”
IEEE Transactions on Very Large Scale Integration Systems, vol. 20,
no. 9, pp. 1668–1680, August 2011.
[9] P. Orosz, T. Tothfalusi, and P. Varga, “C-GEP: Adaptive Network Man-
agement with Reconfigurable Hardware,” in 14th IEEE International
Symposium on Integrated Management (IM), Ottawa, Canada, 2015.
[10] SGA-GPlanar product description, Aitia International Inc., accessed:
2017-12-30. [Online]. Available: http://www.fpganetworking.com/
gplanar/
[11] SGA-10GED product description, Aitia International Inc., accessed:
2017-12-30. [Online]. Available: http://www.fpganetworking.com/
10ged/
[12] V. Pus, L. Kekely, and J. Korenek, “Low-Latency Modular Packet
Header Parser for FPGA,” in ACM/IEEE Symposium on Architectures for
Networking and Communications Systems, Austin, TX, USA, October
2012.
[13] P. Olaszi, “Complex Load Testing of Mobile PS and CS
Core,” EuroNOG 2012, September 2012. [Online]. Avail-
able: http://www.data.proidea.org.pl/euronog/2edycja/materials/Peter
Olaszi-Complex Load Testing of Mobile PS and CS Core.pdf
[14] G. Soos and P. Varga, “Use Cases for LTE Core Network Mass Testing,”
in 5th Mesterproba, Budapest, Hungary, 2016.
[15] P. Varga, L. Kovacs, T. Tothfalusi, and P. Orosz, “C-GEP: 100 Gbit/s
Capable, FPGA-based, Reconfigurable Networking Equipment,” in 16th
IEEE International Conference on High Performance Switching and
Routing (HPSR), Budapest, Hungary, 2015.
[16] B. Sterzbach, “GPS-based Clock Synchronization in a Mobile, Dis-
tributed Real-Time System,” Real-Time Systems, vol. 12, no. 1, pp. 63–75,
January 1997.
[17] J. J. Garnica, V. Moreno, I. Gonzalez, S. Lopez-Buedo, F. J. Gomez-
Arribas, and J. Aracil, “ARGOS: A GPS Time-Synchronized Network
Interface Card based on NetFPGA,” in 2nd North American NetFPGA
Developers Workshop, Stanford, CA, USA, August 2010.
[18] D. L. Mills, “Internet Time Synchronization: The Network Time Proto-
col,” IEEE Transactions on Communications, vol. 39, no. 10, pp. 1482–
1493, October 1991.
[19] IEEE Standard VHDL Language Reference Manual, IEEE Standard
1076-2008.
[20] IEEE Standard for a Precision Clock Synchronization Protocol for
Networked Measurement and Control Systems, IEEE Standard 1588-
2002.
[21] F. N. Janky, “A Flexible Architecture for Protocol Implementations
within FPGAs,” in 25th Telecommunications Forum (TELFOR), Bel-
grade, Serbia, 2017.
[22] Information technology - Open Systems Interconnection - Basic Refer-
ence Model: The Basic Model, ISO/IEC Standard 7498-1:1994(E).
[23] An Ethernet Address Resolution Protocol, IETF Internet Standard RFC
826.
[24] P. Mooney, “40/100-Gigabit Ethernet: Watching
the Clock,” White Paper, 2009. [Online].
Available: http://www.lightwaveonline.com/articles/print/volume-26/
issue-10/applications/40100-gigabit-ethernet-watching-the-clock.html
Ferenc Nandor Janky received the MSc degree in
Electrical Engineering from BME, Budapest, Hun-
gary, in 2013. He gained experience while working
for various telecommunication companies including
Vodafone, AITIA International Inc. and Ericsson.
His main areas of interest are network protocols,
FPGA programming and software development. Fer-
enc is currently working as a C++ software devel-
oper for an international corporate bank. He is a
member of the SmartComLab at BME TMIT.
Pal Varga is Associate Professor at BME, Hungary,
where he received his Ph.D. (2011) from. Besides,
he is director in AITIA International Inc. Earlier he
was working for Ericsson Hungary and Tecnomen
Ireland, as software design engineer and system
architect, respectively. His main research interest
include network performance measurements, root
cause analysis, fault localization, traffic classifica-
tion, end-to-end QoS and SLA issues, as well as
hardware acceleration, and Internet of Things. He is
also a member of the SmartComLab at BME TMIT.
Fig. 10. Measurement result – a typical example of an NTP packet containing four timestamps
Fig. 11. Measurement result – plot of derived offset-measurement data
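Offset data of the kind plotted in Fig. 11 can be derived from the four timestamps carried in an NTP packet such as the one in Fig. 10. The sketch below uses the standard NTP offset and round-trip delay formulas described by Mills [18]; the function name, the integer-nanosecond representation and the sample values are illustrative assumptions, not the authors' firmware.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time   (client clock)
    t2: server receive time    (server clock)
    t3: server transmit time   (server clock)
    t4: client receive time    (client clock)
    All values share one unit (nanoseconds here, as an assumption).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)          # round-trip time without server processing
    return offset, delay


# Hypothetical sample: the client is 5 ms behind the server, each direction
# takes 1 ms, and the server needs 0.5 ms to answer.
offset_ns, delay_ns = ntp_offset_and_delay(
    t1=1_000_000_000, t2=1_006_000_000, t3=1_006_500_000, t4=1_002_500_000
)
print(offset_ns, delay_ns)
```

The offset estimate is exact only when the forward and return path delays are equal, which is why a short, controlled path to the reference clock matters.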
The ∼1 ns accuracy over the internal synchronization bus
satisfies the criteria of any monitoring system with 1, 10 or
even 100 Gbit/s Ethernet, since the packet inter-arrival time even
in the latter case is 6.72 ns [24], almost one order of magnitude greater
than the theoretical accuracy of the presented, implemented
and verified time synchronization method.
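The 6.72 ns figure is consistent with the shortest possible spacing of minimum-size Ethernet frames at 100 Gbit/s. The sketch below reproduces the arithmetic under the usual Ethernet framing assumptions (64-byte minimum frame, 8-byte preamble and start-of-frame delimiter, 12-byte inter-frame gap); these constants and the function name are not taken from the paper.

```python
# Standard Ethernet framing overheads (assumed here, not stated in the paper).
MIN_FRAME_BYTES = 64        # minimum frame including FCS
PREAMBLE_SFD_BYTES = 8      # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def min_inter_arrival_ns(line_rate_gbps):
    """Shortest time between the starts of two back-to-back minimum-size frames."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return bits_on_wire / line_rate_gbps   # bits / (Gbit/s) gives nanoseconds


for rate in (1, 10, 100):
    print(f"{rate:>3} Gbit/s: {min_inter_arrival_ns(rate):.2f} ns")
```

With a timestamp accuracy of about 1 ns, two consecutive minimum-size frames remain distinguishable in time even at 100 Gbit/s.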
As a real-life verification, the time synchronization system
described above has been put into operation at Magyar
Telekom. The system hardware (with its FPGA firmware) has
been installed beside the SGA-7N network monitoring system,
and it showed the expected results. The system has provided accurate
time information to the monitoring cards ever since, and it is
planned to be extended to cover all related monitoring
cards, network-wide.
VI. CONCLUSION
In this paper, we introduced a general time synchronization
solution for SGA-7N, a high performance, lossless network monitoring
system that is based on a reconfigurable architecture.
The probes of the system are called “Monitors”, each of which
consists of three main building blocks: a high performance
Field Programmable Gate Array (FPGA)-based custom hardware
platform, firmware dedicated to network monitoring,
and the probe software. The reconfigurability of the
FPGA chip makes it possible to turn the Monitor hardware platform
into a high performance networking device, for example a
network monitoring probe. Besides supporting distributed and
lossless packet-level monitoring of 1 or 10 Gbps Ethernet links
in the described system, the FPGA serves as the base
platform of the time synchronization solution for the interface
cards of the Monitors.
Time synchronization of the network monitoring nodes is
crucial, since the analysis depends highly on the proper message
sequence, which is determined mainly by the timestamps.
First, each monitoring site has to have a reference clock
that is synchronized with the reference clocks at other sites.
Naturally, the monitoring system has to be synchronized to
the reference clock available at the physical site. In this
paper we call this external time synchronization, and it is
solved by an FPGA-based NTP implementation that uses
data filtering and a clock discipline module in order to
output monotonic clock information. This prevents timestamps from
jumping backwards, or jumping forward too much within one
step, hence the clock of the monitoring system becomes
well-regulated. Second, each interface at the monitoring node has to be
synchronized with this clock information. In this paper we call
this internal time synchronization, and it is implemented through
a proprietary time-synchronization protocol. Its sender (or
master) part works in a distributor card residing at the main
reference clock machine of the monitoring system, whereas
the receiver (or slave) parts are realized within the FPGAs of
the monitoring cards.
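The monotonic-output property of a clock discipline can be illustrated with a slew-limited correction: instead of stepping the clock by the measured offset, the correction is spread over many ticks so that every tick still moves time forward by a bounded amount. The following is a minimal sketch under stated assumptions; the class name, the 500 ppm slew limit and the 8 ns tick period are hypothetical and do not describe the authors' FPGA module.

```python
class SlewLimitedClock:
    """Hypothetical clock discipline that applies offsets gradually (slewing)."""

    def __init__(self, max_slew_ppm=500.0):
        # Largest fraction of a tick period that may be added or removed per tick.
        self.max_slew = max_slew_ppm * 1e-6
        self.time_ns = 0.0       # disciplined clock output
        self.pending_ns = 0.0    # measured offset that is still to be corrected

    def set_offset(self, offset_ns):
        """Register a new offset estimate (reference time minus local time)."""
        self.pending_ns = offset_ns

    def tick(self, period_ns):
        """Advance by one local tick, applying at most the allowed slew."""
        limit = period_ns * self.max_slew
        adjustment = max(-limit, min(limit, self.pending_ns))
        self.pending_ns -= adjustment
        # max_slew < 1, so period_ns + adjustment is always positive:
        # the output can neither go backwards nor jump forward by a large step.
        self.time_ns += period_ns + adjustment
        return self.time_ns


clock = SlewLimitedClock()
clock.set_offset(-1000.0)            # local clock is 1000 ns ahead of the reference
samples = [clock.tick(8.0) for _ in range(5)]   # assumed 8 ns tick (125 MHz)
print(samples)
```

The trade-off is convergence time: a smaller slew limit gives smoother timestamps but takes longer to remove a large initial offset.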
As presented in the paper, the overall system shows
sub-nanosecond accuracy and stability, meeting the requirements
of 10 Gbps, or even 100 Gbps Ethernet-based packet monitoring.
The presented solution is already installed in the network-wide,
real-life monitoring system at Magyar Telekom.
REFERENCES
[1] D. L. Mills, Computer Network Time Synchronization: The Network Time
Protocol. Boca Raton, FL: CRC Press, 2006.
[2] IEEE Standard for a Precision Clock Synchronization Protocol for
Networked Measurement and Control Systems (PTPv2), IEEE Standard
1588-2008.
[3] P. Tatai, G. Marosi, and L. Osvath, “A flexible approach to mo-
bile telephone traffic mass measurement and analysis,” in 18th IEEE
Instrumentation and Measurement Technology Conference, Budapest,
Hungary, May 2001.
[4] D. Kozma, G. Soos, and P. Varga, “Supporting LTE Network and Service
Management through Session Data Record Analysis,” Infocommunica-
tions Journal, vol. 71, no. 2, pp. 11–16, June 2016.
[5] AITIA, “SGA-NETMON The GSM/GPRS/UMTS/LTE Network
Monitoring System,” White Paper, 2012. [Online]. Available:
http://sga.aitia.ai/pdfs/SGA-NetMon.pdf
[6] J. W. Lockwood, N. McKeown, G. Watson, G. Gibb, P. Hartke, J. Naous,
R. Raghuraman, and J. Luo, “NetFPGA - An Open Platform for Gigabit-
rate Network Switching and Routing,” in IEEE MSE, San Diego, CA,
US, 2007.
[7] N. Possley, “Traffic Management in Xilinx FPGAs,” White
Paper, 2006. [Online]. Available: http://www.xilinx.com/support/
documentation/white papers/wp244.pdf
[8] W. Jiang and V. K. Prasanna, “Scalable Packet Classification on FPGA,”
IEEE Transactions on Very Large Scale Integration Systems, vol. 20,
no. 9, pp. 1668–1680, August 2011.
[9] P. Orosz, T. Tothfalusi, and P. Varga, “C-GEP: Adaptive Network Man-
agement with Reconfigurable Hardware,” in 14th IEEE International
Symposium on Integrated Management (IM), Ottawa, Canada, 2015.
[10] SGA-GPlanar product description, Aitia International Inc., accessed:
2017-12-30. [Online]. Available: http://www.fpganetworking.com/
gplanar/
[11] SGA-10GED product description, Aitia International Inc., accessed:
2017-12-30. [Online]. Available: http://www.fpganetworking.com/
10ged/
[12] V. Pus, L. Kekely, and J. Korenek, “Low-Latency Modular Packet
Header Parser for FPGA,” in ACM/IEEE Symposium on Architectures for
Networking and Communications Systems, Austin, TX, USA, October
2012.
[13] P. Olaszi, “Complex Load Testing of Mobile PS and CS
Core,” EuroNOG 2012, September 2012. [Online]. Avail-
able: http://www.data.proidea.org.pl/euronog/2edycja/materials/Peter
Olaszi-Complex Load Testing of Mobile PS and CS Core.pdf
[14] G. Soos and P. Varga, “Use Cases for LTE Core Network Mass Testing,”
in 5th Mesterproba, Budapest, Hungary, 2016.
[15] P. Varga, L. Kovacs, T. Tothfalusi, and P. Orosz, “C-GEP: 100 Gbit/s
Capable, FPGA-based, Reconfigurable Networking Equipment,” in 16th
IEEE International Conference on High Performance Switching and
Routing (HPSR), Budapest, Hungary, 2015.
[16] B. Sterzbach, “GPS-based Clock Synchronization in a Mobile, Dis-
tributed Real-Time System,” Real-Time Systems, vol. 12, no. 1, pp. 63–75,
January 1997.
[17] J. J. Garnica, V. Moreno, I. Gonzalez, S. Lopez-Buedo, F. J. Gomez-
Arribas, and J. Aracil, “ARGOS: A GPS Time-Synchronized Network
Interface Card based on NetFPGA,” in 2nd North American NetFPGA
Developers Workshop, Stanford, CA, USA, August 2010.
[18] D. L. Mills, “Internet Time Synchronization: The Network Time Proto-
col,” IEEE Transactions on Communications, vol. 39, no. 10, pp. 1482–
1493, October 1991.
[19] IEEE Standard VHDL Language Reference Manual, IEEE Standard
1076-2008.
[20] IEEE Standard for a Precision Clock Synchronization Protocol for
Networked Measurement and Control Systems, IEEE Standard 1588-
2002.
[21] F. N. Janky, “A Flexible Architecture for Protocol Implementations
within FPGAs,” in 25th Telecommunications Forum (TELFOR), Bel-
grade, Serbia, 2017.
[22] Information technology - Open Systems Interconnection - Basic Refer-
ence Model: The Basic Model, ISO/IEC Standard 7498-1:1994(E).
[23] An Ethernet Address Resolution Protocol, IETF Internet Standard RFC
826.
[24] P. Mooney, “40/100-Gigabit Ethernet: Watching
the Clock,” White Paper, 2009. [Online].
Available: http://www.lightwaveonline.com/articles/print/volume-26/
issue-10/applications/40100-gigabit-ethernet-watching-the-clock.html
Ferenc Nandor Janky received the MSc degree in
Electrical Engineering from BME, Budapest, Hungary,
in 2013. He gained experience while working
for various telecommunication companies including
Vodafone, AITIA International Inc. and Ericsson.
His main areas of interest are network protocols,
FPGA programming and software development.
Ferenc is currently working as a C++ software developer
for an international corporate bank. He is a
member of the SmartComLab at BME TMIT.
Pal Varga is Associate Professor at BME, Hungary,
where he received his Ph.D. in 2011. Besides,
he is a director at AITIA International Inc. Earlier he
worked for Ericsson Hungary and Tecnomen
Ireland, as software design engineer and system
architect, respectively. His main research interests
include network performance measurements, root
cause analysis, fault localization, traffic classification,
end-to-end QoS and SLA issues, as well as
hardware acceleration and Internet of Things. He is
also a member of the SmartComLab at BME TMIT.
