Abstract-The precision of synchronization algorithms based on the theory of pulse-coupled oscillators is evaluated on FPGA-based radios for the first time. Measurements show that such algorithms can reach precision in the low microsecond range when implemented in the physical layer. Furthermore, we propose an algorithm extension accounting for phase rate deviations of the hardware and show that an improved precision below one microsecond is possible with this extension in the given setup. The resulting algorithm can thus be applied in ad hoc wireless systems for fully distributed synchronization of transmission slots or sleep cycles, in particular if centralized synchronization is impossible.
I. INTRODUCTION AND MOTIVATION
The mathematical modeling of pulse-coupled biological oscillators, as proposed in [1], inspired by [2], offers a fully decentralized and scalable approach for time synchronization. There is a broad spectrum of work on pulse-coupled oscillators in physics, biology, neuroscience, and other disciplines (see, e.g., [3]-[11] and references therein). The communications engineering community has been interested in transferring these results to self-organizing synchronization of nodes in wireless systems [12] for purposes such as slot and frame synchronization, scheduling of cooperative transmissions and sleep cycles, and distributed sensing. A one-to-one transfer is, however, infeasible due to the differences between biological and radio communications. Several extensions and modifications are required with respect to delays, noise, multihop communications, and synchronization words, to mention a few (see [12]-[20] and references therein).
Despite the conceptual and theoretical advances in the design of pulse-coupled oscillator synchronization algorithms for wireless systems, real-world performance studies and experimental proofs of concept are largely missing. There exist only a few implementations on low-cost wireless sensor platforms (see [15], [21], [22]), whose results are of interest, but their synchronization precision is limited by hardware capabilities. For example, the "fifty percentile group spread" in a system with 24 MicaZ motes reported in [15] is on the order of 100 μs, which is insufficient for certain applications, such as slot and frame synchronization. This paper intends to advance this direction of research. In particular, we ask which synchronization precision algorithms based on pulse-coupled oscillators can achieve in practice when integrated into the physical layer on a programmable radio platform. Furthermore, as a result of our experimental research, we gain further insight into the behavior of pulse coupling in real-world wireless scenarios, and thus propose an extension to the theory that is intended to correct phase rate deviations.
Our main contributions are as follows:
• Providing a proof of concept by implementing three pulse-coupled oscillator algorithms on field-programmable gate array (FPGA)-based programmable radio boards
• Comparing the synchronization precision of these algorithms by measurements
• Proposing an extension to the synchronization algorithms accounting for phase rate deviations
• Showing by experiments that precisions below one μs can be achieved
To the best of our knowledge, this is the first lower-layer implementation and real-world performance study of recent pulse-coupled oscillator algorithms on programmable radios.
II. SYNCHRONIZATION ALGORITHMS
We evaluate the synchronization precision of the following recently proposed synchronization algorithms:
• Synchronization by Pagliari & Scaglione (PS) [22], [23],
• Synchronization with inhibitory coupling and self-adjustment (SISA) [24], and
• Synchronization with inhibitory and excitatory coupling with stochastic pulse emission (IES) [11].
We also propose and evaluate a modified version of the IES algorithm that applies phase rate correction (IES*). The objective of all four algorithms is to synchronize the phases of oscillators.
The general procedure is as follows: The oscillator's phase φ is increased from zero to one. When φ reaches one, φ is reset and a pulse is emitted, either always or with probability p < 1 depending on the algorithm. When receiving a pulse from another oscillator, an oscillator adjusts its own phase according to an update function H(φ).
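The general procedure above can be illustrated with a minimal discrete-time sketch. It is not one of the four algorithms evaluated here: the update function `example_update`, the step size `dphi`, the zero propagation delay, and the rule that a receiver pushed to one resets without emitting are all assumptions made for illustration.

```python
import random


def example_update(phi):
    """Hypothetical excitatory update H(phi); not the PS, SISA, or IES function."""
    return min(1.0, 1.2 * phi + 0.05)


def step(phases, dphi, H, p, rng):
    """Advance every phase by dphi and process the pulses emitted in this step.

    Returns the indices of the oscillators that emitted a pulse.
    """
    fired = []
    reset = set()
    for i in range(len(phases)):
        phases[i] += dphi
        if phases[i] >= 1.0:
            phases[i] = 0.0            # phase reached one: reset
            reset.add(i)
            if rng.random() < p:       # emit always (p = 1) or with probability p
                fired.append(i)
    for _ in fired:
        for j in range(len(phases)):
            if j in reset:
                continue               # oscillators that just reset ignore the pulse
            new_phase = H(phases[j])
            if new_phase >= 1.0:
                # Simplification: a receiver pushed to one resets without emitting.
                phases[j] = 0.0
                reset.add(j)
            else:
                phases[j] = new_phase
    return fired


def simulate(phases, steps, dphi=0.001, H=example_update, p=1.0, seed=0):
    """Run the sketch for a number of steps and return the final phases."""
    rng = random.Random(seed)
    phases = list(phases)
    for _ in range(steps):
        step(phases, dphi, H, p, rng)
    return phases
```

Calling `simulate([0.0, 0.3], 5000)` runs two oscillators with instantaneous coupling. Delays, refractory intervals, and packet loss, which matter in the measurements, are deliberately left out.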
The absolute time is called t. The period τ_ij denotes the delay between oscillator i and oscillator j, i.e., the time it takes from the start of a pulse at i until it is processed at j. Let τ_min, τ_max, and τ denote the minimum, maximum, and mean values of all delays, respectively. Furthermore, φ(t) is an oscillator's phase at time t and φ(t⁺) its phase immediately after t. The term ν_i is the phase rate deviation of oscillator i, i.e., the speed of oscillator i compared to a reference oscillator; ν_max is the maximum phase rate deviation of all oscillators. Let h(t) denote the function that maps durations t in seconds to the corresponding phase, i.e., h(t) = t / t_c, where t_c is the cycle length in seconds. Figure 1 specifies the four algorithms.
Each algorithm uses its own update function H(φ). For PS, we use the parameters a_0 = 1 and a_1 = exp(1) (strong coupling). These parameters for PS are chosen as they optimize the convergence speed. Choosing different parameters does not influence the achieved precision. For SISA, we apply α = 0.5, which is also applied in [24]. For both IES and IES*, we apply p = 1/2 as the sending probability.
III. IMPLEMENTATION ON PROGRAMMABLE RADIO
We implement all synchronization algorithms on WARP boards [25] , which are FPGA-programmable radio boards. A custom single-carrier physical layer is programmed with 5 MHz bandwidth and binary phase shift keying (BPSK). Boards operate at 2.4 GHz and use a peak transmit power of 22 dBm. The overall structure of the transceiver is shown in Figure 2 (a). All components are implemented directly on the FPGA.
On the transmitter side, the packetizer and modulator build the packet after they receive a trigger signal from the synchronization logic. A modulated packet is fed into an interpolator and upconverter, and finally transmitted over the air. As we cannot send infinitely short pulses, as often assumed in theory, we send short packets instead. These packets have a length of 12 bytes, where the first 8 bytes are used for setting the receive gains of the hardware (automatic gain control, AGC) and for mitigating carrier frequency offsets (CFO). The remaining 4 bytes represent a synchronization word consisting of pseudorandom bytes. The transmit duration for a packet is 19.2 μs.
Figure 1. Specification of the synchronization algorithms. Panels: (a) PS, (b) SISA, (c) IES.
(a) PS: 1) Whenever φ(t) = 1, the oscillator sends a pulse. 2) Refractory interval. 3) Phase update upon reception of a pulse at time t*.
(b) SISA: 1) Whenever φ(t) = 1, the oscillator adjusts its phase to φ(t⁺) = H_SISA(1) and sends a pulse.
(c) IES: 1) Whenever φ(t) = 1, the oscillator sends a pulse with probability p < 1. 2) Refractory interval.
Further listing: 1) Whenever φ(t) = 1 and no pulse has been received within the last τ − τ_min seconds, the oscillator sends a pulse with probability p < 1. 3) Phase update upon reception of a pulse at time t*.
We implement a non-data-aided algorithm [26] for removing carrier frequency offsets. To detect the synchronization word, a correlator implemented as an FIR filter is applied. The synchronization logic on the FPGA consists of implementations of the algorithms discussed above. The oscillator component (Figure 2 (b)) replicates the oscillator on the board. The main part, generating the phase of the oscillator, is a 22-bit wrap-around counter running at a clock frequency of 40 MHz. Thus, the cycle duration t_c, i.e., the time it takes for the counter to increment from 0 to 2^22 − 1, is about 104.86 ms. To get a value between zero and one, we reinterpret the output as a fractional number by multiplying by 2^−22. After a pulse is detected, the new phase H(φ), determined by the synchronization algorithm component (not shown in Figure 2 (b)), is forwarded to the oscillator block, and an accumulator is used to adjust to the new phase. Note that the accumulator only processes the value on its input if it is enabled, i.e., if a synchronization word has been detected. The output of the accumulator always reflects the current value, independent of whether or not the accumulator is enabled.
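The timing figures quoted in this section can be checked with a short sketch. The constants for the clock, counter width, and packet length are taken from the text. The symbol rate of 5 Msymbols/s (one BPSK bit per symbol at the 5 MHz bandwidth) is an assumption that happens to reproduce the stated 19.2 μs. The `Oscillator` class is a hypothetical software model of the counter plus accumulator, not the FPGA design itself.

```python
CLOCK_HZ = 40_000_000        # counter clock from the text
COUNTER_BITS = 22            # width of the wrap-around counter
SYMBOL_RATE_HZ = 5_000_000   # assumption: one BPSK symbol per 1/(5 MHz)


def cycle_duration_s(bits=COUNTER_BITS, clock_hz=CLOCK_HZ):
    """Time for the counter to wrap once, i.e., the cycle length t_c."""
    return (1 << bits) / clock_hz


def packet_duration_s(n_bytes=12, symbol_rate_hz=SYMBOL_RATE_HZ):
    """Air time of a packet with one bit per symbol (BPSK)."""
    return n_bytes * 8 / symbol_rate_hz


def duration_to_ticks(t_s, clock_hz=CLOCK_HZ):
    """Number of clock ticks closest to a duration in seconds."""
    return round(t_s * clock_hz)


class Oscillator:
    """Hypothetical model: free-running counter plus an offset accumulator."""

    def __init__(self, bits=COUNTER_BITS):
        self.mod = 1 << bits
        self.counter = 0     # wrap-around counter, incremented every clock tick
        self.offset = 0      # accumulated phase adjustments, in ticks

    def tick(self):
        self.counter = (self.counter + 1) % self.mod

    def phase(self):
        """Counter plus offset, reinterpreted as a fraction in [0, 1)."""
        return ((self.counter + self.offset) % self.mod) / self.mod

    def adjust(self, new_phase):
        """Jump to new_phase; meant to be called only when a sync word is detected."""
        target = int(new_phase * self.mod) % self.mod
        current = (self.counter + self.offset) % self.mod
        self.offset = (self.offset + target - current) % self.mod
```

With these definitions, `cycle_duration_s()` gives about 104.86 ms, and the mean delay of 21.92 μs reported in Section IV corresponds to `duration_to_ticks(21.92e-6)` clock ticks.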
Due to manufacturing tolerances, boards exhibit phase rate deviations. These deviations limit the achievable synchronization precision. As a countermeasure, we add correction terms c_i in IES*. These correction terms are determined for each board individually by manually measuring its phase rate with respect to a reference phase. The term c_i is then stored in the board's memory and applied during synchronization in the following way: at each clock cycle, the correction term c_i is accumulated in a dedicated accumulator, and the output of this accumulator is added to the output of the counter. As our clock is running at 40 MHz and since we adapt to the fastest phase rate, c_i corresponds to [ν_max − ν_i]/40.
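The effect of such a per-tick correction can be sketched numerically. The units are an assumption: deviations are expressed in parts per million here, so the scaling differs from the [ν_max − ν_i]/40 stated above, whose units are not given. The function names and the three deviation values are hypothetical and chosen only for illustration.

```python
REF_TICKS_PER_SECOND = 40_000_000   # nominal 40 MHz clock


def uncorrected_count(ref_ticks, nu_ppm):
    """Ticks counted by a board whose clock deviates by nu_ppm from nominal."""
    return ref_ticks * (1.0 + nu_ppm * 1e-6)


def corrected_count(ref_ticks, nu_ppm, nu_max_ppm):
    """Counter output plus the accumulated correction term.

    The correction c is added once per own clock tick, so after own_ticks
    ticks the dedicated accumulator holds c * own_ticks.
    """
    own_ticks = uncorrected_count(ref_ticks, nu_ppm)
    c = (nu_max_ppm - nu_ppm) * 1e-6     # assumed per-tick correction
    return own_ticks + c * own_ticks


def spread(values):
    """Largest difference between any two values."""
    return max(values) - min(values)


# Hypothetical deviations of three boards, in ppm (not measured values).
deviations_ppm = [-20.0, 0.0, 15.0]
nu_max = max(deviations_ppm)

raw = [uncorrected_count(REF_TICKS_PER_SECOND, nu) for nu in deviations_ppm]
fixed = [corrected_count(REF_TICKS_PER_SECOND, nu, nu_max) for nu in deviations_ppm]
```

In this sketch, the counters drift apart by `spread(raw)` ticks within one second, while `spread(fixed)` stays well below one tick because every board is sped up to the fastest phase rate.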
Note that phase rates depend on environmental factors, mainly temperature. The purpose of applying these correction terms is to showcase the influence of phase rate deviations on the achievable synchronization precision. As future work, we plan to propose a fully decentralized algorithm that synchronizes not only phases but also phase rates. Note that for all algorithms other than IES*, we set c_i to zero.
IV. MEASUREMENT RESULTS
Six boards are set up to form a fully connected network with six nodes. All nodes can generally receive packets from all other nodes, but packets might be lost due to interference from co-located WLANs.
A. Delays τ and Phase Rate Deviations ν
We measure the delays τ_ij for various sender-receiver pairs and analyze the overall empirical probability density function (epdf). The epdf is derived by the method of kernel density estimation. Figure 3 (a) shows the epdf of τ based on six sender-receiver pairs and 10 000 transmissions each. Experiments show no significant difference between the sender-receiver pairs: the delay is always between τ_min = 21.7 μs and τ_max = 22.2 μs, and the average delay is τ = 21.92 μs; we use these values in the synchronization algorithms. The values presented for τ are accurate to ±25 ns.
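Kernel density estimation can be sketched in a few lines. The kernel and bandwidth used for Figure 3 (a) are not stated, so the Gaussian kernel and the normal reference rule for the bandwidth below are assumptions. The sample is synthetic, drawn uniformly between the reported τ_min and τ_max purely to exercise the code; it is not measurement data.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(samples)
    if bandwidth is None:
        # Normal reference rule of thumb (an assumption, not from the paper).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1.0 / 5.0)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def integrate(f, lo, hi, steps):
    """Trapezoidal rule, used here only to check that the density sums to one."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * width)
    return total * width


# Synthetic delays in microseconds (illustrative only, not measured).
rng = random.Random(0)
synthetic_delays_us = [rng.uniform(21.7, 22.2) for _ in range(200)]
epdf = gaussian_kde(synthetic_delays_us)
```

Evaluating `epdf(21.92)` gives the estimated density at the mean delay, and `integrate(epdf, 21.0, 23.0, 2000)` should be close to one.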
B. Synchronization Precision
The synchronization precision at time t, defined as the maximum phase difference between all nodes, is shown in Figure 4 for n ∈ {2, 4, 6} nodes. Results are based on 100 synchronization runs, where the phases of all nodes are randomly initialized. The x-axes show the cycle number. The cycle duration is 52.43 ms for SISA and 104.86 ms for all other algorithms. The values presented are accurate to ±25 ns.
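A minimal sketch of this precision metric follows. It assumes that the phase difference between two nodes is measured on the unit circle, so that phases 0.99 and 0.01 are 0.02 apart; whether the measurements use exactly this wrap-around convention is an assumption. The function names are hypothetical.

```python
from itertools import combinations

T_C_SECONDS = (1 << 22) / 40_000_000   # cycle length of the 22-bit counter at 40 MHz


def circular_distance(a, b):
    """Phase difference on the unit circle, always in [0, 0.5]."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def precision(phases):
    """Maximum pairwise phase difference among all nodes."""
    return max(circular_distance(a, b) for a, b in combinations(phases, 2))


def precision_seconds(phases, t_c=T_C_SECONDS):
    """Precision converted from phase to time using the cycle length t_c."""
    return precision(phases) * t_c
```

For example, `precision_seconds([0.10000, 0.10001])` converts a phase difference of 10^−5 into roughly one microsecond at the 104.86 ms cycle length.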
These measurement results can be interpreted as follows: PS converges very quickly to a synchronization precision of about 21 μs. For SISA, the speed of convergence decreases with increasing n, because packets lost over the wireless link cause a rapid deterioration of the synchronization precision. For n = 2, for example, one lost packet causes the precision to deteriorate immediately to 1/4 of the cycle length, unless the nodes are in their refractory interval. The convergence of IES is slower than that of PS with the given parameters; however, it converges to a mean precision of about 1.5 μs (n = 2), 2 μs (n = 4), and 4 μs (n = 6). IES* achieves a precision of about 200 ns (n = 2), 400 ns (n = 4), and 600 ns (n = 6).
The fact that PS and SISA synchronize less precisely than the IES algorithms in this setup is likely due to two reasons: (i) propagation delays are not considered in those algorithms and (ii) nodes cannot hear other nodes when sending. This result confirms that stochastic communication of synchronization words is an important design feature (see, e.g., [11]). This feature could in principle also be applied to PS.
V. CONCLUSIONS
Measurement results of pulse-coupled oscillator synchronization implemented on FPGA radios show that the synchronization precision can reach values below one μs. Key factors for reaching this precision are the explicit consideration of propagation and processing delays, the stochastic nature in communications of synchronization words, and a phase rate correction. The latter mitigates precision limitations caused by phase rate deviations of the hardware. 
