Abstract-Industrial networks for distributed monitoring, control, and automation purposes require high-accuracy clock synchronization in topologies including long chains of cascaded nodes. Unfortunately, accuracy typically degrades as the number of devices and the distance from the synchronization reference node (i.e., the master or grandmaster) grows, because of the accumulation of multiple uncertainty contributions. To mitigate this problem, the so-called transparent clocks are used in some synchronization protocols, such as the precision transparent clock protocol used in PROFINET IO isochronous real time networks and the precision time protocol version 2, standardized as IEEE 1588-2008. In this paper, an optimal servo-clock in the mean square sense is proposed. The controller relies on both a Kalman filter that estimates the clock state difference with respect to the master and a static-state feedback assuring mean square stability even under the effect of significant fluctuations of the synchronization period. Several multiparametric simulation results in a case study based on the features of PROFINET IO devices confirm that excellent performance can be achieved with the proposed approach.
I. INTRODUCTION

Clock synchronization of networked devices is useful in several applications, such as industrial automation [1], [2], finance [3], and smart grid monitoring and control [4]. Clock synchronization can be implemented with either hardware-only [5] or purely software resources [6]. While the best performance is typically achieved in Ethernet-based wired networks, wireless solutions have gained increasing interest in recent years [7], [8]. The precision time protocol (PTP), along with its profiles, is at the moment one of the most widespread and accepted solutions to achieve clock synchronization in distributed systems [9]. In various industrial contexts, real-time Ethernet (RTE) communication protocols rely on clock synchronization to make distributed control systems and automation plants faster, more robust, and more precise. RTE solutions coincide with standard Ethernet at the physical layer, but they include special MAC layers able both to schedule packet transmission and to manage data traffic according to different priority requirements at the application level.
As is well known, synchronization uncertainty in long chains of nodes unavoidably grows due to the accumulation of multiple contributions, such as phase and frequency noises, imperfect propagation and communication delay compensation, variable traffic conditions, and packet losses. To mitigate this problem, the state of each clock as well as line and bridge delays have to be estimated with high accuracy [2]. A classic clock state estimator for synchronization purposes is the Kalman filter (KF). Its standalone behavior, along with its advantages and disadvantages, has been thoroughly analyzed as a function of different parameters, both with and without a servo-clock [10]. Scalability and performance issues in the case of multiple cascaded nodes are explicitly addressed in [11], where a state-space model and a different KF for PTP synchronization are introduced. However, not all uncertainty contributions are included in that model. Further studies on the same topic are instead reported in [12] and [13], where an estimator based on a three-state model (including the local time, the clock rate, and the oscillator frequency drift) has been proposed and validated. Using this model, the state of each clock can be estimated even under the influence of time-varying and harsh environmental conditions. This paper considerably extends the theoretical study and the results presented in [14], which is mainly focused on the case of PROFINET IO networks. In this paper, the oscillator frequency drift is neglected, since it is not very relevant at room temperature and over reasonably short time intervals [15]. Moreover, an additional optimal controller is used to discipline every clock of the chain. This idea is not completely new, since it was already adopted in [16], where a KF is combined with an optimal linear quadratic regulator (LQR) to achieve better synchronization performance.
However, in that model the synchronization period jitter due to switch behavior and network traffic was not considered. From a control perspective, the time variability of measured quantities dramatically affects the quality of control (QoC) and may sometimes lead to instability [17]. To tackle this problem, different delay-resilient network solutions [18], [19], event-based approaches [20]-[22], or anytime controllers [23], [24] have been proposed. Stemming from these results, a novel controller that explicitly takes into account synchronization period variability and guarantees a certified level of QoC is defined in this paper. In the following, a quite general model describing clock behavior as well as communication delays is first introduced in Section III. Then, in Section IV the clock state, line delay, and bridge delay estimators are defined and justified. Section V deals with the controller design criteria. Finally, in Section VI some meaningful simulation results based on the features of PROFINET IO networks and extracted from some real hardware devices are reported. Such results are also compared with the performance of more typical servo-clocks based on a proportional-integral (PI) controller, under different jitter conditions.
0018-9456 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. SYNCHRONIZATION PROTOCOLS OVERVIEW
In all Ethernet-based networks where clock synchronization is based on either PTP or the precision transparent clock protocol (PTCP), periodic messages or frames containing a reference timestamp are forwarded along the branches of a tree (rooted in the master or grandmaster and extracted from the original topology) until they reach the leaves of the network. Often, just the so-called Sync frames are propagated throughout the network with a nominal period T s (one-step clock model). However, to reduce timestamping uncertainty on the transmission side, the so-called two-step model can be adopted. Using this approach, the local time value when a Sync frame is physically put on the wire is recorded and encapsulated into a subsequent Follow_up message, which is sent as soon as possible, i.e., immediately after the corresponding Sync (namely with a nominal period T s as well). Thus, clock synchronization relies on the timestamp included in the Follow_up frame and the value in the Sync is discarded. This approach generally improves synchronization accuracy at the expense of more network traffic. Of course, any Sync/Follow_up message experiences a latency that depends on: 1) the type and number of switches crossed by the message; 2) the amount of data traffic handled by every switch; and 3) the line delays between pairs of switches.
In the presence of cascaded nodes, two different general policies can be used to synchronize the leaves of the tree to the master or grandmaster. The first approach is used in both PTPv1 and PTPv2 when the so-called boundary clocks (BCs) are employed [9], and it relies on the progressive synchronization of the chain of clocks along the branches of the tree. In this case, the timestamps appended to Sync or Follow_up frames are no longer the grandmaster values, since they are read from the local clock after it is disciplined so as to be synchronized to the clock of the parent node. Also, the delay between each pair of cascaded nodes is estimated and compensated by measuring half of the round trip time (RTT) associated with the exchange of any pair of frames Delay_Req/Delay_Res. This exchange occurs with a period T dq . Generally, T dq ≫ T s , because the network topology is assumed to be fixed or just slowly changing. Hence, frequent line delay measurements are unnecessary.
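The half-RTT estimate mentioned above can be sketched as follows. This is a minimal illustration, not the protocol implementation: it assumes a symmetric path and a negligible rate difference between the two clocks during the exchange, and it subtracts the responder turnaround time, as is customary in two-way delay measurements. The function name and the timestamp values are hypothetical.

```python
def half_rtt_line_delay(t_req_tx, t_req_rx, t_res_tx, t_res_rx):
    """Estimate the one-way line delay as half of the round trip time.

    t_req_tx and t_res_rx are read on the requester clock,
    t_req_rx and t_res_tx are read on the responder clock.
    """
    round_trip = t_res_rx - t_req_tx   # total time seen by the requester
    turnaround = t_res_tx - t_req_rx   # time the responder held the frame
    return (round_trip - turnaround) / 2.0


# Hypothetical timestamps in nanoseconds: the responder clock is 50 us ahead
# of the requester clock and answers after 400 us.
delay_ns = half_rtt_line_delay(1_000, 52_605, 452_605, 404_210)  # 1605.0 ns
```

The constant offset between the two clocks cancels out because each difference is taken on a single timescale.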
The second synchronization policy relies on the so-called transparent clocks (TCs). Such nodes are not necessarily synchronized to the master, but they are able to measure the ingress-egress message latency spent in crossing the node (namely the bridge delay). Also, they append the total delay accumulated from previous TCs in a specific field of the frame to be forwarded, possibly compensating the frequency offset between clocks.
The total line delay between the master and a leaf node can be estimated periodically in two ways, i.e., by exchanging separate Delay_Req/Delay_Res frames between pairs of consecutive nodes so that every node can compensate the line delay just from the previous node (peer-to-peer approach) or by forwarding every pair of Delay_Req/Delay_Res frames from a leaf node to the master and backwards through the whole chain, while compensating the intermediate bridge delays, as is done for Sync/Follow_up frames (end-to-end approach). The former strategy is used both in PTCP and PTPv2 and is generally more flexible and scalable. The latter is described just in PTPv2, but it is rarely used in practice, as it may create traffic bottlenecks around the synchronization master collecting the Delay_Res frames from all leaf nodes. Also, in networks with long linear paths, the solution based on peer-to-peer TCs is preferable to the use of BCs, since cascaded servo-clocks may introduce persistent and large fluctuations, which degrade convergence time and synchronization accuracy [25]. Therefore, in the rest of this paper, just the case of chains of peer-to-peer TCs will be considered. Moreover, for the sake of simplicity, only one-step clocks will be used (i.e., without using Follow_up frames). However, the proposed approach can be easily extended to the case of two-step clocks.
III. MODEL DESCRIPTION
Many industrial networks for automation purposes (e.g., those based on PROFINET IO) exhibit a starred tree topology, as shown in Fig. 1. If we assume that each network node is provided with its own clock and that flicker phase and frequency noises have a negligible effect over short time intervals [10], the behavior of the i th clock (with i = 0, . . . , N) can be described by the following linear system
where
1) t denotes the ideal time on an ideally perfect timescale.
2) The system state vector is composed of the normalized clock rate ρ i (t) (i.e., ideally equal to one) and of the time τ i (t) measured by the clock.
3) n i (t) is the vector of the noise terms causing random walk and white frequency fluctuations, respectively. Possible nonzero mean values of such variables are responsible for systematic linear frequency and time drifts, respectively. It is worth emphasizing that the assumption of normally distributed increments for the state variables is in agreement with the most sophisticated clock models available in the literature, such as in [15]. However, as stated in Section I, only two state variables are used in (1), because the dynamics of clock acceleration/deceleration due to aging are negligible over short time intervals.
4) s i (t) is a further input modeling the influence of different environmental factors (e.g., temperature or vibrations) on the clock rate.
5) u i (t) is the potential control action on both the clock offset and the clock rate if the clock is disciplined by a controller (otherwise this input is 0).
6) c i (t) is the system output, namely the timestamp measured by clock i anytime a message is sent or received by node i .
7) The total timestamping error results from the superimposition of multiple systematic and random contributions such as finite clock resolution, oscillator white phase noise, jitter, and delays introduced at the MAC and/or at the physical layer anytime a message is sent or received at time t. It is important to highlight that the values of this error generally differ when a message is sent or received. In particular, the receiving timestamps are often affected by a larger uncertainty because of the additional time required by the transceiver for symbol synchronization, i.e., to detect when the frame actually begins.
Also, as explained in Section II, the time spent by the various synchronization frames to cross switches and connections between nodes must be properly estimated and compensated. If b j (k) is the bridge delay when switch j is crossed by the kth Sync message and l i j (k) denotes the corresponding one-hop line delay between nodes j and i , the total communication latency between the synchronization master (in the following denoted with index 0) and one of the input ports of node i is given by
$$\delta_{i0}(k) = \delta_{j0}(k) + b_j(k) + l_{ij}(k) \qquad (2)$$
with δ 00 (k) = 0 and b 0 (k) = 0 ∀k by definition. In practice, b j (k) may exhibit significant random fluctuations due to message buffering, priority-related issues, variable traffic conditions and frame size. On the contrary, l i j (k) is quite deterministic if the network topology is fixed, but l i j (k) can differ from l j i (k) because of cable skewness and other asymmetries at the physical layer. It has to be noted that (2) can be used iteratively, to compute all the delays at the input ports throughout the network. Therefore, in Section IV the technique to estimate the individual values of b j (k) and l i j (k) is described.
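The iterative use of the latency expression along a linear path can be sketched as below. The sketch assumes that the latency at the input port of a node equals the latency at the preceding node plus the bridge delay of that preceding node plus the one-hop line delay, with zero latency and zero bridge delay at the master. The function name and the numerical values are hypothetical.

```python
def cumulative_latencies(bridge_delays, line_delays):
    """Return the latency from the master to the input port of each node.

    bridge_delays[j]: delay spent crossing node j (index 0 is the master, so 0).
    line_delays[j]:   one-hop line delay from node j to node j + 1.
    """
    if len(bridge_delays) != len(line_delays):
        raise ValueError("one bridge delay and one line delay per hop are expected")
    latencies = [0]  # the latency at the master itself is zero by definition
    for b_j, l_ij in zip(bridge_delays, line_delays):
        latencies.append(latencies[-1] + b_j + l_ij)
    return latencies


# Hypothetical values in nanoseconds for a master followed by three nodes.
print(cumulative_latencies([0, 10_000, 25_000], [1_605, 1_605, 1_605]))
```

Each entry of the returned list depends only on the previous one, which is why a node only needs the accumulated value forwarded by its predecessor.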
IV. ESTIMATORS DEFINITION
System (1) describes a continuous-time clock model. However, node synchronization occurs only at discrete times, i.e., when a Sync frame is actually received by some node. This means that the synchronization model is event-based and should be discretized in time with a nominal period T s . Assume, for the sake of simplicity, but without loss of generality, that a linear path rooted in the grandmaster, composed of L nodes (with L ≤ N) and ending in a leaf of the tree, is extracted from the network shown in Fig. 1. Anytime a message has to be processed by some node, various types of data are required to identify unambiguously the instant at which this event occurs:
1) the timescale on which the time of the event is measured;
2) whether the message is sent or received;
3) the identification number of the node transmitting or receiving the message;
4) the sequence number of the message to be processed.
Since in (1) t denotes the ideal timescale, in the following t s i k denotes the instant when the kth Sync frame is received by node i , and analogous symbols denote the instants when the mth Delay_Req and Delay_Res frames are either sent or received by node i , respectively. Generally, k ≠ m because T dq ≫ T s , as explained in Section II. In this respect, it is worth emphasizing that the synchronization period T s changes over time and in different points of the network, because it depends on the position of node i , as well as on data traffic and protocol-related random communication latencies. Hence, the actual duration (on an ideal timescale) of the kth synchronization period on node i is
represents the difference between the state of clock i and the state of the synchronization master when the kth timestamp is received by clock i , and if the control input u i (k) is constant between subsequent Sync frames (with u 0 (k) = 0 by definition), it can be easily proved that the clock error model associated to (1) and discretized with period T s i k is given by [26] 
) cannot be measured directly, because t s i k refers to the instant when the Sync message is received by node i , which is unknown to the master. However, since t s
In practice, δ i0 (k) has to be estimated on the timescale of the master clock. In the rest of this paper, the hat symbol and the i superscript are used to denote that a certain quantity is estimated by node i . For instance, the total communication delay from node 0 to node i estimated on the master timescale is derived from (2)
where, without loss of generality, j in this case refers to the node just preceding node i in the chain and
is the estimated bridge delay of node j resulting from the difference of the egress-ingress timestamps associated with the kth Sync frame, multiplied by the rate compensation factor given by the ratio between the rate of the master and the estimated rate of clock j . Consider that all the clock rate estimates are initialized to 1, but they are generally different from 1 because of clock nonidealities. The value of ρ 0 (t), instead, is identically equal to 1 by definition because node 0 is the synchronization master of the network. Similarly, under the assumption that only peer-to-peer TCs are used, the line delay between nodes j and i on the master timescale is given by
where
is the line delay estimated by node i on its own timescale by halving the RTT that results from the exchange of the mth Delay_Req/Delay_Res message pair. Notice that the rightmost term at the numerator of (12) represents the rate compensation factor between nodes i and j and it is used to map the timestamp values measured by node j on the timescale of node i . Of course, as noted previously, δ̂ 0 j 0 (k) in (9) is computed by applying iteratively expressions (10)-(12).
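The bridge delay estimate described above (egress-ingress timestamp difference multiplied by the ratio between the master rate and the estimated local rate) can be sketched as follows. This is only an illustration of the rate compensation idea; the function name, the argument names, and the numerical values are hypothetical, and the master rate is taken equal to 1 as stated in the text.

```python
def bridge_delay_on_master_timescale(ingress, egress, rate_estimate, master_rate=1.0):
    """Map the residence time measured by a local clock onto the master timescale.

    ingress, egress: local timestamps taken when the Sync frame enters and
                     leaves the node.
    rate_estimate:   estimated normalized rate of the local clock (1.0 if ideal).
    """
    if rate_estimate <= 0.0:
        raise ValueError("the estimated clock rate must be positive")
    residence_time = egress - ingress
    return residence_time * (master_rate / rate_estimate)


# Hypothetical case: a clock running 50 ppm fast measures 10000.5 ns
# for a residence time that lasts about 10000 ns on the master timescale.
compensated_ns = bridge_delay_on_master_timescale(0.0, 10_000.5, 1.00005)
```

Without the compensation factor, a residual error proportional to both the rate offset and the residence time would accumulate at every crossed node.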
If the random variable η 0i (k) is supposed to include the uncertainty contributions associated with the measurement of both δ̂ 0 j 0 (k) and l̂ 0 i j (m), then (8) can also be rewritten as
and the measurement uncertainty term w i (k) in (3) has to be replaced by the sum w i (k) + η 0i (k). Therefore, if i) the influence of the environmental factors is negligible (i.e., q i (k) ≈ 0) and ii) the overall measurement uncertainty w i (k) + η 0i (k) and v i (k) are weakly correlated, then the state of (3) can be estimated by a KF, even if the KF can hardly be regarded as optimal in the case considered [12]. In particular, the prediction equations are
where e + i (k+1) is the predicted state, ŷ
is the predicted output vector, P i (k) and P + i (k + 1) are the estimated and predicted state covariance matrices, respectively, and Q i (k) is the covariance matrix of v i (k), given by
where σ ρ i and σ τ i are the standard deviations of the continuous-time noise terms in n i (t) as defined in (1). Dually, the update equations of the KF are
where
is the Kalman gain and D i (k+1) is the variance of the overall measurement uncertainty w i (k+1) + η 0i (k+1), which results from the sum of the variance of w i (k+1) due to the overall timestamping uncertainty [as defined in (8)] and of the variance of η 0i (k+1) associated with the measurement of bridge and line delays up to the i th node [according to (2)]. It is worth emphasizing that the proposed KF also works under the effect of sudden changes of the environmental quantities [i.e., when q i (k) ≠ 0], provided that a suitable fudge factor is introduced to handle such events [13].
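To make the prediction/update structure concrete, the following sketch implements a generic two-state Kalman filter for a clock error state made of a rate offset and a time offset. It is not the filter of this paper: the transition matrix F = [[1, 0], [T, 1]], the diagonal process noise, the scalar time-offset measurement, and all names and numbers are simplifying assumptions chosen only for illustration.

```python
def kf_predict(state, cov, period, q_rate, q_time):
    """Propagate [rate offset, time offset] and its covariance over one period."""
    rate, offset = state
    (p00, p01), (p10, p11) = cov
    predicted_state = [rate, offset + period * rate]
    # F P F^T + Q with F = [[1, 0], [period, 1]] and Q = diag(q_rate, q_time).
    predicted_cov = [
        [p00 + q_rate, p01 + period * p00],
        [p10 + period * p00,
         p11 + period * (p01 + p10) + period * period * p00 + q_time],
    ]
    return predicted_state, predicted_cov


def kf_update(state, cov, measured_offset, meas_var):
    """Correct the prediction with a time offset measurement of variance meas_var."""
    (p00, p01), (p10, p11) = cov
    innovation = measured_offset - state[1]
    innovation_var = p11 + meas_var
    gain = [p01 / innovation_var, p11 / innovation_var]  # P C^T / s, C = [0, 1]
    new_state = [state[0] + gain[0] * innovation, state[1] + gain[1] * innovation]
    new_cov = [
        [p00 - gain[0] * p10, p01 - gain[0] * p11],
        [p10 - gain[1] * p10, p11 - gain[1] * p11],
    ]
    return new_state, new_cov


# Hypothetical noise-free run: offsets in microseconds, rate offsets in us/s.
state, cov = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
true_rate, true_offset, period = 10.0, 5.0, 0.03
for _ in range(100):
    true_offset += period * true_rate
    state, cov = kf_predict(state, cov, period, 1e-6, 1e-6)
    state, cov = kf_update(state, cov, true_offset, 0.01)
```

In a real node the measurement variance would also have to include the uncertainty of the bridge and line delay estimates, as discussed above.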
V. SERVO-CLOCK DESIGN
Although servo-clocks are not always strictly required for TC implementation, the use of a controller to discipline the local clock is often welcome and useful in practice [27]. In particular, the clock state resulting from the KF can be used not only to compute the rate compensation factors for bridge delay and line delay estimation [28] [e.g., from (10) and (11)], but also to drive an LQR controller, which makes the elements of e i (·) converge asymptotically to zero as k → +∞, with uncertainty given by the KF. While, as stated in Section I, other solutions proposed in the literature usually assume that the synchronization period is fixed and known [16], T s may actually fluctuate as described in Section III. For instance, according to IEEE 1588-2008, the time interval between two consecutive Sync frames is allowed to change within ±30% of the nominal value with 90% confidence [9]. This phenomenon is of primary importance for control design, since such a time variability could lead to instability or, at least, to poor closed-loop performance [17]. Moreover, the control action cannot be applied on an ideal timescale, since every controller can rely only on the timescale of the free-running TC. As a consequence, a precise analysis of the control model is needed.
A. Control Model Definition
Thanks to the separation principle, the controller designed in this paper can rely on the combination of the KF and a static-state feedback
where R i ∈ R 2×2 . This is the classic approach used also for linear quadratic Gaussian (LQG) control. In the following, the feedback matrix R i is explicitly designed to consider possible T s fluctuations, while assuring mean square stability with a minimum variance of the clock state. If the variable S i (k) denotes the covariance matrix of the controlled clock state when the kth Sync frame is received, the closed-loop system is referred to as mean square stable if lim k→∞ S i (k) = S i < +∞ [29] . Clearly, the smaller the value of S i , the better the controller. Therefore, S i can be regarded as a metric of QoC. Fig. 2 shows the whole closed-loop system, along with the corresponding uncertainty sources as inputs. Notice that in (18) the control action is applied after a unitary delay element corresponding to one synchronization period. This is due to the fact that the processing time spent to estimate the clock state in the KF is not negligible. Therefore, to limit the jitter in applying the control, u i (k) is kept constant and the new value is applied only when the next synchronization event occurs, as customary in digital control systems. It is worth emphasizing that the delayed application of the control input comes in handy if a two-step clock synchronization model is used.
To analyze performance and stability of the closed-loop system, an additional serious practical issue should be considered when the controller is implemented. In fact, the duration T s i k of the kth synchronization period on node i is actually measured by the i th clock itself. Therefore, the corresponding measurement result T i s i k (where, again, the superscript i represents the timescale on which the time interval is measured) depends not only on the intrinsic variability of T s , but also on the limited resolution and the frequency fluctuations of the local clock. Accordingly, we can write that T i s i k = d i k / f i , where f i is the nominal frequency of the i th clock and d i k is the integer number of ticks counted in the kth synchronization period. While the value of d i k should be ideally equal to a given value M, in practice it generally lies within the tolerance interval ℳ = [M l , M u ], whose upper and lower bounds depend on the chosen synchronization protocol. If d i k is out of this interval, most probably the kth Sync frame is lost. This implies that the controller cannot be updated and the previous value must be kept for a further synchronization interval. As a consequence of the assumptions above, if the general clock model (1) is discretized with a period 1/f i , the system and input matrices of the discretized clock error model result from expressions similar to (4) and (5):
Notice thatF i andG i are time invariant. However, the variability of the Sync frame interarrival times causes the closed-loop system switching between different dynamics that depend on the actual value of d i k . Thus, the closed-loop dynamic can be modeled as
is the same as in (3), γ i (k) is the column vector containing the estimation fluctuations associated withê i (k) at the output of the KF
are the closed system matrices depending on whether a Sync frame is received or not, I 2 and 0 2 are the 2 × 2 identity and all-zeros matrix, respectively, and
represent the input matrices of the closed-loop system. Observe that matrixG (21) is built as a result of the computation of the forced response of the system when a constant control input u i (k) is applied. Also,G M is simply the same asG
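The tick-based measurement of the synchronization period and the handling of a lost Sync frame described in this subsection can be sketched as follows. The sketch assumes that a tick count outside the tolerance interval is treated as a lost frame, in which case the previous control value is held. The function names, the 100 MHz clock frequency, and the ±30% bounds used in the example are hypothetical.

```python
def measured_period_s(ticks, nominal_frequency_hz):
    """Convert a tick count into a period measured on the local timescale."""
    return ticks / nominal_frequency_hz


def sync_frame_accepted(ticks, lower_ticks, upper_ticks):
    """A tick count outside [lower_ticks, upper_ticks] is treated as a lost Sync."""
    return lower_ticks <= ticks <= upper_ticks


def select_control(previous_control, candidate_control, ticks, lower_ticks, upper_ticks):
    """Hold the previous control value when the Sync frame is deemed lost."""
    if sync_frame_accepted(ticks, lower_ticks, upper_ticks):
        return candidate_control
    return previous_control


# Hypothetical setup: 100 MHz local clock, 30 ms nominal period (3e6 ticks),
# tolerance interval of +/-30% around the nominal tick count.
F_HZ, M_NOMINAL = 100e6, 3_000_000
M_LOW, M_HIGH = 2_100_000, 3_900_000
period = measured_period_s(3_000_150, F_HZ)                     # slightly above 30 ms
control = select_control(-0.2, -0.5, 6_000_000, M_LOW, M_HIGH)  # frame lost, keeps -0.2
```

Because the period is counted in integer ticks, its resolution is limited to one period of the local oscillator, which is one of the reasons why the measured value differs from the ideal one.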
B. Controller Analysis and Synthesis
The stochastic process generating the synchronization period is considered here to be stationary in time. This assumption can be removed at the cost of a more complicated model, which will be the subject of future work. Under the assumption of stationarity, the probability μ d i that the measured synchronization period on node i occurs after d i ∈ ℳ ticks is independent of k. Therefore, the probability of not receiving any Sync frame on node i is simply given by
$$1 - \sum_{d_i \in \mathcal{M}} \mu_{d_i}.$$
If S i (k) is the covariance matrix of z i (k), its value after the (k + 1)th synchronization period is given by (23), where H i (k) is the total covariance matrix associated to
i on the main diagonal. If operator ⊗ denotes the Kronecker product, by using the operator's properties, (23) can be rewritten as
where operator vec(·) builds a vector by stacking the columns of the input matrix and
Observe that the discrete-time time-invariant system in (24) can be regarded as a standard linear system with state vec(S i (k)) and input vec(H (k)). If the positive-definite covariance matrix H (k) is bounded (i.e., H (k) ≤ H ), the mean square stability is guaranteed if and only if S i (k) in (23) tends toward a finite value for k → +∞, which occurs when
$$\max_j \left| \lambda_j(A_i) \right| < 1 \qquad (26)$$
where λ j (A i ) denotes the j th eigenvalue of A i . In such conditions, system (24) is asymptotically stable and it admits a steady-state solution given by
In conclusion, if the static feedback controller (18) is used to discipline the clocks (for instance, using an LQG synthesis) and condition (26) is satisfied, the closed-loop system is stable even under the effect of synchronization period fluctuations within ℳ. Moreover, the trace or the determinant of S i can be used to determine the expected variance of the controlled variables. Clearly, the same expression can be used to synthesize the controller. To this purpose, it is sufficient to find the elements of the static feedback matrix R i such that the determinant of S i is minimized and constraint (26) is met. Although, in principle, R i could be a row vector since the system is completely controllable by driving the clock rate, in practice such a choice can hardly assure good performance. In fact, it is better to control both time and clock rate. However, in this case R i turns out to be a square 2 × 2 matrix (i.e., with up to four parameters), which makes the corresponding nonlinear optimization problem very hard to solve. Therefore, we finally decided to set R i as a diagonal matrix (with two design parameters only), because this approach assures a good tradeoff between flexibility and optimization complexity. As a consequence, it can be shown that the elements of the main diagonal have to be drawn from the open interval (− f i /M, 0). In this way, at first all the values satisfying (26) are found numerically. Then, the solution minimizing (27) is chosen. It is worth noticing that the increase in complexity required to solve the full four-variable optimization problem is probably excessive with respect to the performance improvement achievable over a diagonal matrix.
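The eigenvalue test for mean square stability can be illustrated with a small sketch. It assumes a generic switching system in which the closed-loop matrix Φ_d is selected with probability μ_d, so that the second-moment dynamics are governed by the sum of μ_d (Φ_d ⊗ Φ_d), using the identity vec(Φ S Φᵀ) = (Φ ⊗ Φ) vec(S). The matrices and probabilities below are hypothetical and are not those of the paper. Instead of computing eigenvalues, the code uses the norm of a high matrix power, which is an upper bound on the spectral radius; a bound below 1 is therefore sufficient for stability, while a bound above 1 is not conclusive in general.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def mat_mul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def second_moment_matrix(probabilities, modes):
    """Return sum_d mu_d * (Phi_d kron Phi_d) for the closed-loop modes Phi_d."""
    size = len(modes[0]) ** 2
    total = [[0.0] * size for _ in range(size)]
    for mu, phi in zip(probabilities, modes):
        k = kron(phi, phi)
        for r in range(size):
            for c in range(size):
                total[r][c] += mu * k[r][c]
    return total


def spectral_radius_bound(a, squarings=8):
    """Upper bound on the spectral radius: ||A^k||^(1/k) with k = 2**squarings."""
    power = a
    for _ in range(squarings):
        power = mat_mul(power, power)
    norm = max(sum(abs(x) for x in row) for row in power)  # max row sum norm
    return norm ** (1.0 / 2 ** squarings)


def is_mean_square_stable(probabilities, modes):
    """Sufficient numerical check: the bound on the spectral radius is below 1."""
    return spectral_radius_bound(second_moment_matrix(probabilities, modes)) < 1.0


# Hypothetical example: one mode is unstable on its own (1.2 > 1), yet the
# switching system is mean square stable because 0.5*0.25 + 0.5*1.44 = 0.845.
modes = [[[0.5, 0.0], [0.0, 0.5]], [[1.2, 0.0], [0.0, 1.2]]]
stable = is_mean_square_stable([0.5, 0.5], modes)
```

The example shows why the stability of each individual mode is neither necessary nor the right criterion when the synchronization period switches randomly: what matters is the probability-weighted second-moment matrix.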
VI. PERFORMANCE EVALUATION: A CASE STUDY
To evaluate the performances of the proposed clock synchronization strategy in a realistic scenario, several simulations have been performed in Matlab assuming to run PTCP in a PROFINET IO network. In the following, at first the main features of PROFINET IO are shortly recalled. Then, the values of the various simulation parameters are introduced and justified. Finally, the corresponding results are reported.
A. Protocol Overview
PROFINET IO is an industrial automation protocol for exchanging data between IO-controllers (i.e., intelligent devices running automation control programs) and IO-devices (i.e., sensors, actuators, and IO modules). PROFINET IO is defined in the communication profile specifications CP 3/4, CP 3/5, and CP 3/6 of the Standard IEC 61784-2 [30]. PROFINET IO comprises several combinations of features and parameters. The features considered for the simulations of this paper refer to the so-called PROFINET IO Conformance Class C (CC-C). In this class, process/field data are exchanged cyclically between IO-controller and IO-devices using the RT_Class 3 communication protocol [also known as isochronous real-time (IRT)]. Usually, such a high-performance protocol is used when the application (e.g., motion control) needs a cycle time in the range of 31.25 μs to 4 ms with extremely low jitter (±1 μs). The RT_Class 3 devices require synchronous communication based on an ad-hoc time division multiple access (TDMA) policy. In particular, medium access is periodic and every cycle time is divided into two main phases.
1) A RESERVED (or RED) phase, in which the whole network infrastructure is used to transmit IRT frames only. Other Transmission Control Protocol/Internet Protocol (TCP/IP) or PROFINET IO messages just wait in the buffers of the network switches, while the IRT time-critical frames are scheduled a priori and routed over predefined paths.
2) An OPEN or GREEN phase that is used for non-time-critical PROFINET and Ethernet data traffic, including usual TCP/IP frames. The OPEN phase is always present in a cycle. During this phase, the frames are transmitted and routed according to their Ethernet priority (as specified in the Standard IEEE 802.1Q [31]).
Clock synchronization in PROFINET IO IRT networks can be advantageously exploited for automation applications, thus enabling high-end performance [32]. The topology of a PROFINET IO IRT network can be chosen in order to fulfill specific availability and maintenance needs. Typically, the physical network topology is a ring in order to exploit the redundancy of the double path, but logically it still behaves as a line (i.e., a chain of devices).
B. Simulation Parameters
The PROFINET IO IRT implementation includes PTCP for network-level clock synchronization. The simulation parameters can be roughly divided into two groups, i.e., those depending on the clock model and those related to PTCP. The former depend on the typical features of Ethernet clocks and are listed in Table I in accordance with the definitions reported in Sections III and IV (except the initial clock state values, which are chosen randomly). Table II, instead, shows the values of the chosen PTCP and communication parameters, as well as the corresponding uncertainty contributions. Some values are based on PROFINET IO specifications, while others are obtained experimentally, as described in [33], [34]. In Table II, B(a, b) denotes a Beta distribution over the interval [a, b] with parameters α = 1 and β = 3. The nominal clock synchronization interval T s is set equal to the PROFINET default value (i.e., 30 ms). Synchronization interval jitter is in the order of some tens of μs and exhibits a triangular distribution [33], [34]. The bridge delay values greatly depend on data traffic as well as on frame size. In PROFINET IO IRT, the Sync frames are forwarded (with a cut-through approach) with the highest priority only in the green phase. During the red (reserved) phase, instead, the Sync frames wait inside the switch buffer until the beginning of the next green phase. On the basis of these assumptions, by extrapolating the experimental results reported in [33], [34], a Beta distribution is used to model the bridge delays. Observe that, while the minimum value a = 10 μs is constant (since it depends only on the time spent in crossing a switch when a Sync frame is forwarded immediately), different upper bounds are considered in simulations (i.e., b = 50, 100, 250 μs) to describe the effect of a growing amount of buffered data. PROFINET IO relies on burst message exchange for line delay estimation. In particular, bursts of five Delay_Req/Delay_Res frames are issued with a nominal period T dq = 8 s.
The time distance between pairs of consecutive Delay_Req frames of the same burst is 200 ms. The latency T d between a Delay_Req frame and the corresponding Delay_Res frame is uniformly distributed between 400 μs and 800 μs. Thus, each line delay value is obtained from the average of seven consecutive measurement results given by (11) (i.e., five belonging to the current burst plus two taken from the previous burst). All line delay values are assumed to be uniformly distributed in [1602, 1608] ns, as they include both cable and physical layer latencies. Finally, the hardware timestamping uncertainty contributions reported at the bottom of Table II are assumed to have approximately a trapezoidal distribution due to the superimposition of clock resolution and oscillator phase noise. This kind of distribution was obtained through Monte Carlo simulations using the individual uncertainty contributions reported in [13] . On the receiver side, slightly larger systematic delays and jitter are introduced by clock signal recovery circuitry. Finally, the nominal probability of losing a packet has been evenly fixed to 0.2%. However, this probability grows with the distance of the node from the master clock, as discussed in Section V.
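Two of the simulation ingredients described above can be sketched in a few lines: drawing bridge delays from a Beta law and averaging the latest seven line delay measurements. The sketch assumes that the Beta distribution with α = 1 and β = 3 is rescaled to the interval between the minimum and maximum bridge delay; the names, the seed, and the sample values are hypothetical, and the code is not the Matlab implementation used for the reported results.

```python
import random
from collections import deque


def sample_bridge_delay_us(rng, lower_us=10.0, upper_us=50.0, alpha=1.0, beta=3.0):
    """Draw one bridge delay from a Beta(alpha, beta) law rescaled to [lower, upper]."""
    return lower_us + (upper_us - lower_us) * rng.betavariate(alpha, beta)


class LineDelayAverager:
    """Average the most recent line delay measurements (seven by default)."""

    def __init__(self, window=7):
        self._samples = deque(maxlen=window)  # older samples are dropped automatically

    def add(self, measurement_ns):
        """Store a new measurement and return the current average."""
        self._samples.append(measurement_ns)
        return sum(self._samples) / len(self._samples)


rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
bridge_delays = [sample_bridge_delay_us(rng) for _ in range(5)]

averager = LineDelayAverager()
for measurement in (1602.0, 1608.0, 1605.0, 1604.0, 1606.0):  # one hypothetical burst
    line_delay_ns = averager.add(measurement)
```

With α = 1 and β = 3 most of the probability mass lies close to the lower bound, which matches the idea that a Sync frame is usually forwarded almost immediately and only occasionally waits in a buffer.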
C. Simulation Results
The simulation results reported in this section refer to a linear network topology with at most 30 nodes between the synchronization master and the end of the line, in accordance with PROFINET IO specifications. Every Monte Carlo simulated experiment consists of 30 runs. Each test lasts 60 s. The root mean square estimation errors (RMSEs) of time and rate offsets of six controlled network nodes (i.e., nodes 1, 2, 5, 10, 20, and 30) with respect to the master are plotted as a function of time in Fig. 3. All bridge delays are distributed as B(10 μs, 50 μs). In particular, Fig. 3(a) and (b) show the results for a controller that stabilizes each clock node and minimizes the closed-loop maximum eigenvalue, hence ensuring the fastest convergence. As a consequence, the transient is very short (less than 3 s), but the system tends to amplify the uncertainties associated with master timestamp propagation. On the other hand, Fig. 3(c) and (d) report the same types of errors obtained with a controller that minimizes the determinant of the steady-state output covariance S_i. In this way, a filtered, but slower, response is obtained. Observe that the RMSE values of the time offsets after reaching the steady state are below 200 ns, even at node 30. This value is much smaller than the typical time offset of 1 μs required by the most demanding applications in industrial automation [1], [4] and represented by a dotted horizontal line in Fig. 3(a) and (b). Clearly, the clock state estimation accuracy is strongly influenced by line delay compensation, which is quite poor during the transient phase. However, accuracy improves every 8 s, i.e., whenever a new burst of Delay_Req/Delay_Res frames is exchanged between pairs of consecutive nodes. Once the steady state is reached, the line delay estimation uncertainty is negligible. Consider that (26) is a necessary and sufficient condition for mean square stability.
Therefore, a static controller that does not explicitly take into account the fluctuations of the synchronization period or the possibility of losing some packets may exhibit poor performance or even instability. To highlight this issue, Fig. 4 shows some simulation results obtained under the same conditions as in Fig. 3, but with a PI controller adjusting the time increments of each clock. In this case too, the dotted horizontal line represents the time offset of 1 μs typically required in industrial automation. PI controllers are well suited for a comparison with the proposed approach, because they are simple and commonly employed in servo-clocks [35]. In particular, the chosen PI-based servo-clock implementation is similar to the solution described in [36], but relies on a different set of control coefficients (i.e., K_p = 48.7805 and K_I = 30.5) that ensure a very short transient if no synchronization period jitter is present. In Fig. 4, the RMSE patterns of the time offsets are shown as a function of time. Quite interestingly, the steady-state synchronization accuracy on the first nodes of the chain is stable and just slightly worse than the results shown in Fig. 3(a) and (b). However, the clocks that are farther from the master (e.g., nodes 20 and 30) exhibit increasingly large time offset fluctuations due to the accumulated jitter, which is not properly handled by the PI controller. Moreover, even a slight change of the PI coefficients or a small jitter increment easily leads to instability, as will be shown in the following. Fig. 5 shows the maximum time offset estimation errors associated with each KF after 60 s. The error patterns are shown as a function of the position of each node in the line (the master being node 0) and for different distributions of the bridge delays. The distance between pairs of consecutive nodes is assumed to be the same in all cases.
As expected, the estimation uncertainty grows as the distance from the master clock increases. This effect, in the worst case, is more evident when the bridge delay fluctuations are larger. Nevertheless, the uncertainty growth rate is generally compatible with PROFINET IO IRT requirements [32]. It is interesting to compare these results with the accuracy of the controlled clock with respect to the reference master time. Indeed, the worst-case accuracy of both the "fastest convergence" controller [Fig. 6(a)] and the "minimum closed-loop covariance" controller [Fig. 6(b)] is tightly related to the accuracy of the estimation process.
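The two figures of merit used in this section, i.e., the RMSE over Monte Carlo runs as a function of time and the worst-case error per node, can be computed as in the following minimal Python sketch. The data layout (`errors[r][k]` holding the offset error of run r at time step k) and the function names are assumptions made for illustration. Taking the worst case over the final sample of each run is one possible reading of "maximum error after 60 s", not necessarily the one adopted for the figures.

```python
import math


def rmse_over_runs(errors):
    """Return the RMSE at each time step, computed across Monte Carlo runs.

    errors[r][k] is the error of run r at time step k (all runs equally long).
    """
    n_runs = len(errors)
    n_steps = len(errors[0])
    return [
        math.sqrt(sum(run[k] ** 2 for run in errors) / n_runs)
        for k in range(n_steps)
    ]


def worst_case_at_end(errors):
    """Return the largest absolute error found at the last time step of any run."""
    return max(abs(run[-1]) for run in errors)


if __name__ == "__main__":
    # Two hypothetical runs with three time steps each (arbitrary units).
    demo = [[3.0, 1.0, -0.2], [4.0, -1.0, 0.1]]
    print(rmse_over_runs(demo))
    print(worst_case_at_end(demo))
```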
The situation is completely different when the PI-based servo-clock is used. In this case, if the bridge delay fluctuations increase, the maximum time offset oscillations clearly diverge with the position of the nodes in the chain of clocks, because the system is no longer stable.
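The loss of stability under random synchronization events can be explored numerically on a toy loop. The following Python sketch is not the model of this paper and does not reproduce condition (26). It assumes a hypothetical two-state clock (time offset, and rate offset scaled by the nominal synchronization interval) with normalized gains G1 and G2. At each step the loop is either corrected (Sync frame received) or free-running (Sync frame lost), independently with probability `p_loss`. For such independent random switching, a standard test for mean square stability is that the map P → E[A P Aᵀ] on second-moment matrices has spectral radius below 1. The code estimates that radius by iterating the map from the identity and averaging the logarithmic trace growth.

```python
import math


def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def second_moment_growth(modes, steps=2000):
    """Estimate the per-step growth factor of E[x x^T] for x[k+1] = A_k x[k].

    modes is a list of (probability, matrix) pairs; A_k is drawn i.i.d. from it.
    The second-moment matrix is propagated with trace renormalization, and the
    geometric mean of the trace ratios is returned (a finite-step estimate).
    """
    n = len(modes[0][1])
    p = [[(1.0 / n if i == j else 0.0) for j in range(n)] for i in range(n)]
    log_sum = 0.0
    for _ in range(steps):
        nxt = [[0.0] * n for _ in range(n)]
        for prob, a in modes:
            apa = matmul(matmul(a, p), transpose(a))
            for i in range(n):
                for j in range(n):
                    nxt[i][j] += prob * apa[i][j]
        tr = sum(nxt[i][i] for i in range(n))
        if tr == 0.0:
            return 0.0
        log_sum += math.log(tr)
        p = [[v / tr for v in row] for row in nxt]
    return math.exp(log_sum / steps)


# Hypothetical normalized gains, chosen only for illustration.
G1, G2 = 0.5, 0.15
A_CORRECTED = [[1.0 - G1, 1.0], [-G2, 1.0]]  # Sync frame received
A_FREE_RUN = [[1.0, 1.0], [0.0, 1.0]]        # Sync frame lost, no correction


def growth_with_loss(p_loss):
    """Growth factor of the toy loop for a given packet loss probability."""
    return second_moment_growth(
        [(1.0 - p_loss, A_CORRECTED), (p_loss, A_FREE_RUN)])


if __name__ == "__main__":
    for p_loss in (0.0, 0.002, 0.5, 1.0):
        print(p_loss, growth_with_loss(p_loss))
```

A returned value below 1 indicates that the second moments of this toy loop decay, while a value of 1 or more indicates that they do not. Sweeping `p_loss` or the gains shows how a design that is stable for a fixed update pattern can lose its margin when updates become random.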
VII. CONCLUSION
An accurate servo-clock model for chains of TCs is presented in this paper. The model explicitly considers the noise and uncertainty contributions that usually affect industrial networks needing accurate clock synchronization. In addition, the model also considers some practical implementation issues. The main novelty of the proposed approach is its ability to tackle the detrimental effect of synchronization period fluctuations on servo-clock performance. In this respect, a necessary and sufficient condition for mean square stability is given. From this condition, both a stability analysis and a controller design criterion are derived. Several simulations have been carried out in a case study based on the features of PROFINET IO devices to prove the effectiveness of the proposed approach in assuring a stable behavior over long chains of clocks, which cannot be guaranteed with other standard approaches. In particular, two solutions with different performance are presented: one exhibiting the fastest convergence time and the other minimizing the determinant of the closed-loop system state covariance matrix.
