Abstract-A novel clock distribution technique, the Cable Automatic sKew Elimination (CAKE) clocking scheme, has been developed and is presented in this paper. In this scheme, clock pulses are driven into a cable and reflected from the high-impedance receiving end. At the driving end, a cake-shaped waveform is seen. With a comparator threshold set at 1/4 of the full pulse amplitude, the width of the comparator's output logic pulse carries the cable delay information. Using a time-to-digital converter (TDC), the cable delay variation due to temperature change can be monitored and compensated for. The philosophy behind the CAKE clocking scheme is to keep the receiving end as simple as possible while implementing the extra circuitry at the transmitting end. Another clocking technique based on the same philosophy is the trapezoidal clocking scheme that we developed in our previous work. Demo tests of both the CAKE clocking and the trapezoidal clocking schemes are presented in this paper.
I. INTRODUCTION
CLOCK distribution is needed for systems with multiple modules requiring synchronized operation. For example, in systems with a large number of digitization channels, the timing skews of the clock signals driving the digitization devices in different modules often must be controlled within a small range. Solutions exist for distributing clocks over large distances such as 100 m to 1 km [1] [2] [3]. In everyday design work on distributing clocks over the "final few meters", however, the schemes used today are essentially open-loop, without monitoring of or compensation for the temperature variation of the cable delays.
In closed-loop schemes for distributing a common timing reference, signals are often allowed to propagate in both directions in a medium in order to monitor the variation of the cable delay and subsequently perform skew compensation. The signal-carrying medium can be an optical fiber or a metal cable, which can be single-ended coaxial or differential. The scheme can be point-to-point or multi-drop. The signals propagating in both directions can either be driven from both ends or be driven from the sending end and reflected from the receiving end, and the signals can be analog ramps or digital pulses.
Clocking schemes with closed-loop skew control [3] [4] [5] have been available for many years, but their complexity and extra circuitry, especially at the clock receiving module, have discouraged their wider application.
In our previous work [6], a simple clocking technique called the trapezoidal clocking scheme was developed based on the addition of an analog ramping voltage and its reflected version. The philosophy behind it is to implement the extra circuits on the driving side while keeping the receiving side simple. Based on the same philosophy, a purely digital technique called the Cable Automatic sKew Elimination (CAKE) clocking scheme has been developed and is described in this paper. Demo tests of both the CAKE clocking and the trapezoidal clocking schemes are presented in this paper.
II. THE CABLE AUTOMATIC SKEW ELIMINATION CLOCKING SCHEME AND ITS DEMO TEST
The goal of the CAKE clocking scheme is to generate isochronous clock pulses at the receiving ends of all channels by adjusting the clock phases at the driving end, based on information seen at the driving end only. The CAKE clocking scheme uses an impedance-matched driver to send clock pulses of width w into a cable with timing delay d while leaving the receiving end at high impedance, as shown in Fig. 1(a). In the tests discussed in this paper, the clock frequency is 5 MHz and the clock pulse width is 50 ns, i.e., a 25% duty cycle.
The clock pulses are reflected from the cable end back to the driving side, where a cake-shaped waveform as shown in Fig. 1(b) is seen as a result of summing the voltages of the transmitted and reflected pulses.
A comparator with its threshold set at 1/4 of the full amplitude is used at the driving end to generate logic pulses that are fed into a time-to-digital converter (TDC). The width of the logic pulses seen by the TDC is w + 2d, which provides continuous monitoring of the cable delay.
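The relation between the measured pulse width and the cable delay can be illustrated with a minimal sketch. It assumes the cake-shaped regime (2d no larger than w), so that the comparator output is a single pulse of width w + 2d; the function names and the sample readings are hypothetical and are not taken from the firmware.

```python
def cable_delay_ns(tdc_width_ns: float, pulse_width_ns: float = 50.0) -> float:
    """One-way cable delay d recovered from a comparator pulse of width w + 2d."""
    if tdc_width_ns < pulse_width_ns:
        raise ValueError("measured width is shorter than the driven pulse width")
    return (tdc_width_ns - pulse_width_ns) / 2.0


def delay_drift_ns(tdc_widths_ns, pulse_width_ns: float = 50.0):
    """Change of d relative to the first reading, e.g. for temperature monitoring."""
    delays = [cable_delay_ns(x, pulse_width_ns) for x in tdc_widths_ns]
    return [d - delays[0] for d in delays]


if __name__ == "__main__":
    # Hypothetical TDC readings in ns for a cable with d close to 10 ns.
    print(cable_delay_ns(70.0))
    print(delay_drift_ns([70.0, 70.2, 69.8]))
```

Because the width changes by 2 ns for every 1 ns change in d, a given TDC resolution translates into half that value in delay resolution.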
From the times of the leading and trailing edges of the cake-shaped waveform, users are able to align the clock pulses at the receiving ends to a desired time value. Consider a clock distribution module driving two cables A and B with time delays d_A and d_B. If the leading edge times at the driver side are t_0A and t_0B, then the leading edge times at the receiving ends are:
$$ t_{1A} = t_{0A} + d_A, \qquad t_{1B} = t_{0B} + d_B $$
It can be shown that if t_0A and t_0B are adjusted so that the mean times of the leading and trailing edges of the two cake-shaped waveforms are aligned, then the leading edge times t_1A and t_1B at the receiving ends will also be aligned:
$$ t_{1A} = t_{1B} $$
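The alignment property follows directly from the edge times seen at the driving end. The short derivation below assumes an ideal lossless cable, a full reflection at the open end, and the cake-shaped regime (2d no larger than w). The symbols m_A and m_B for the mean edge times are introduced here for illustration only.

```latex
% Receiving-end leading edges: t_{1A} = t_{0A} + d_A, t_{1B} = t_{0B} + d_B.
% Comparator output on cable A: leading edge at t_{0A},
% trailing edge at t_{0A} + w + 2 d_A.
\begin{align*}
m_A &= \tfrac{1}{2}\bigl[\, t_{0A} + (t_{0A} + w + 2 d_A) \,\bigr]
     = t_{0A} + d_A + \tfrac{w}{2}
     = t_{1A} + \tfrac{w}{2}, \\
m_B &= t_{0B} + d_B + \tfrac{w}{2}
     = t_{1B} + \tfrac{w}{2}, \\
m_A &= m_B \;\Longrightarrow\; t_{1A} = t_{1B}.
\end{align*}
```

Since both channels share the same pulse width w, the common offset w/2 cancels, and aligning the mean times at the driver is equivalent to aligning the leading edges at the receivers.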
A demo tester is implemented with an Altera Cyclone III FPGA (EP3C16F256C6) as shown in Fig. 2(a). For a differential cable, resistors are used to convert low-impedance CMOS logic levels into LVDS-like pulses with 100 Ohm impedance. When the receiving end is unterminated, a regular 3R circuit will not function properly. We developed a 5R circuit, shown in Fig. 2(b), that provides both matched impedance and the correct receiving threshold. In this case, the two opposite cake-shaped signals cross each other at 1/4 of the full amplitude.
Phase-locked loops (PLLs) inside the FPGA are used to generate periodic output pulses with phases dynamically adjustable at 97 ps/step. The outputs of the FPGA drive differential cables of different lengths via 5R circuits.
The PLL phases are first adjusted so that the leading edges are aligned at the driving points, as shown in the top portion of Fig. 3(a). There is clearly a skew at the receiving ends due to the difference in cable lengths, as shown in the reference traces in Fig. 3(b). Then the PLL phases are adjusted automatically so that the mean times of the leading and trailing edges measured by the TDC blocks are aligned, as shown in the lower portion of Fig. 3(a). This aligns the clock pulses at the receiving ends of both channels, as shown in the lower portion of Fig. 3(b).
It should be pointed out that the automatic de-skew ability shown in this demo test is a secondary benefit of the CAKE clocking scheme. In most applications, it is more desirable to continuously monitor the variations of the clock cable delay due to temperature and other factors and to record the variations for offline analysis. Sometimes, users may not wish the clock distribution system to compensate the cable delay spontaneously.
However, automatic cable delay compensation at power-up, or between data-taking runs in detectors, provides extra convenience for most applications.
The phase adjustment granularity of 97 ps/step is a limit due to the PLL structure inside the FPGA. If a finer granularity is demanded, external PLL integrated circuits can be used.
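The alignment step and the effect of the 97 ps granularity can be sketched as follows. This is an illustration only, not the firmware logic: the function names, the tuple layout of the edge times, and the example delays are assumptions.

```python
STEP_PS = 97.0  # PLL phase step quoted in the text


def mean_time_ps(leading_ps: float, trailing_ps: float) -> float:
    """Mean of the leading and trailing edge times measured by a TDC block."""
    return 0.5 * (leading_ps + trailing_ps)


def phase_steps_to_align(ref_edges_ps, ch_edges_ps, step_ps: float = STEP_PS):
    """Signed number of phase steps to add to a channel so that its mean edge
    time matches the reference channel, plus the residual error in ps."""
    error_ps = mean_time_ps(*ref_edges_ps) - mean_time_ps(*ch_edges_ps)
    steps = round(error_ps / step_ps)
    residual_ps = error_ps - steps * step_ps
    return steps, residual_ps


if __name__ == "__main__":
    w = 50_000.0  # 50 ns pulse width in ps
    # Hypothetical cables: d_A = 5 ns, d_B = 15 ns, both driven at t0 = 0.
    edges_a = (0.0, w + 2 * 5_000.0)
    edges_b = (0.0, w + 2 * 15_000.0)
    print(phase_steps_to_align(edges_a, edges_b))  # (-103, -9.0)
```

In this example channel B must be advanced by 103 steps, leaving a residual of 9 ps. With rounding to the nearest step, the residual never exceeds half a step, about 48.5 ps for the 97 ps step.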
III. LONGER CABLE IN CAKE CLOCKING
The firmware created in this project is designed for the case in which a cake-shaped signal is produced on the sending side, that is, when the round-trip cable delay 2d is smaller than or equal to the clock pulse width w. It provides a simple way to eliminate the distributed clock skew for devices connected by cables a few meters long. With a nominal delay of around 5 ns/m in various types of coaxial cables, the firmware can work for cables under 5 meters in length when the clock pulse width is 50 ns.
It is natural to extend the clocking scheme to applications with longer cables. A necessary condition for this is that at least one reflected pulse edge can be seen for any cable length. For this reason, the duty cycle of the clock signal in this project was set to 25% rather than 50%. If the duty cycle were 50% and the round-trip delay were 2d = T - w, where T is the period of the clock signal, the reflected pulses would exactly fill the gaps between the incoming pulses and the resulting signal on the sending side would be constant. The cable delay information could then not be extracted from the signal in the sending module. Setting the duty cycle to 25% guarantees that at least one leading or trailing edge of the incoming and reflected signals can be identified. Fig. 4 shows the simulation of the signals on the sending side for a clock pulse width equal to 25% of the period. When 2d is greater than the clock pulse width w and smaller than T - w, the signal on the sending side is not cake-shaped, as shown in Fig. 4(b). In this case, however, both the leading and trailing edges of the incoming and reflected signals can still be identified.
The cable delay information can therefore be extracted from the signal on the sending side. Fig. 4(c) shows the waveform when T > 2d > T - w. In this case, the leading edge of the reflected signal and the trailing edge of the incoming signal can be identified from the cake-shaped signal as shown. Therefore, the method of extracting the cable delay information at the sending module still works when the round-trip delay 2d is greater than the pulse width.
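The 5 m figure is the arithmetic of the condition 2d <= w. A minimal sketch, with a hypothetical function name and the nominal 5 ns/m delay from the text as the default:

```python
def max_cable_length_m(pulse_width_ns: float = 50.0,
                       delay_ns_per_m: float = 5.0) -> float:
    """Longest cable for which the round trip 2d still fits inside the pulse width w."""
    max_one_way_delay_ns = pulse_width_ns / 2.0
    return max_one_way_delay_ns / delay_ns_per_m


if __name__ == "__main__":
    print(max_cable_length_m())  # 5.0
```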
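The three regimes can be reproduced with a small idealized model. The sketch below assumes rectangular pulses, a lossless cable, and a full reflection; amplitudes are in units of half the full amplitude, so the levels are 0, 1, and 2. It is not the simulation used for Fig. 4, and the function names are hypothetical.

```python
def forward(t_ns: int, period_ns: int, width_ns: int) -> int:
    """Ideal periodic pulse train launched into the cable (level 0 or 1)."""
    return 1 if (t_ns % period_ns) < width_ns else 0


def sending_side(period_ns: int, width_ns: int, round_trip_ns: int):
    """One period of the steady-state sum of the incoming pulse train and its
    reflection, sampled every 1 ns at the driving end."""
    return [forward(t, period_ns, width_ns)
            + forward(t - round_trip_ns, period_ns, width_ns)
            for t in range(period_ns)]


def count_edges(wave) -> int:
    """Number of level changes in one period, treating the wave as circular."""
    return sum(1 for i in range(len(wave)) if wave[i] != wave[i - 1])


if __name__ == "__main__":
    T = 200  # ns, 5 MHz clock
    for w, two_d in [(50, 20), (50, 100), (50, 170), (50, 150), (100, 100)]:
        wave = sending_side(T, w, two_d)
        print(f"w={w} 2d={two_d}: max level {max(wave)}, edges {count_edges(wave)}")
```

Under this model, a 50% duty cycle with 2d = T - w yields no edges at all in a period, while a 25% duty cycle with 2d = T - w still yields two.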
IV. DEMO TEST OF THE TRAPEZOIDAL CLOCKING SCHEME
In our previous work, we developed the trapezoidal clocking scheme, in which a trapezoidal pulse is fed into a multi-tap cable. If the cable is lossless, the voltage measured at any tap is the sum of the transmitted and reflected signals and has an attractive feature, namely an isochronous zero-crossing. The timing skew due to cable delay is compensated as in analog mean timers [7] [8], but here the compensation occurs naturally in the cable itself. Oscilloscope traces of trapezoidal clock pulses inside a coaxial cable assembly, with 4 ns per segment, are shown in Fig. 5.
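The isochronous zero-crossing can be checked numerically with an idealized model. The sketch below is not taken from the previous work. It assumes a single rising edge that swings linearly from -1 to +1, a ramp long enough to cover the round trip from every tap, and a hypothetical `reflect_gain` parameter as a crude stand-in for cable loss (1.0 for a lossless open end, 0.0 for no reflection).

```python
def ramp(t_ns: float, rise_ns: float) -> float:
    """Rising edge of a trapezoid: -1 before, +1 after, linear in between,
    crossing zero at t = 0."""
    return max(-1.0, min(1.0, 2.0 * t_ns / rise_ns))


def tap_voltage(t_ns, tap_ns, cable_ns, rise_ns, reflect_gain=1.0):
    """Voltage at a tap located tap_ns from the driver on a cable of one-way
    delay cable_ns: forward wave plus the wave reflected from the far end."""
    forward = ramp(t_ns - tap_ns, rise_ns)
    reflected = reflect_gain * ramp(t_ns - (2.0 * cable_ns - tap_ns), rise_ns)
    return forward + reflected


def zero_crossing_ns(tap_ns, cable_ns, rise_ns, reflect_gain=1.0, tol_ns=1e-6):
    """Bisection search for the time at which the tap voltage reaches zero."""
    lo, hi = -rise_ns, 2.0 * cable_ns + rise_ns
    while hi - lo > tol_ns:
        mid = 0.5 * (lo + hi)
        if tap_voltage(mid, tap_ns, cable_ns, rise_ns, reflect_gain) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    cable_ns, rise_ns = 6.0, 16.0  # assumed values; rise time exceeds 2 * cable_ns
    for tap in (0.0, 2.0, 4.0, 6.0):
        open_end = zero_crossing_ns(tap, cable_ns, rise_ns)
        no_reflection = zero_crossing_ns(tap, cable_ns, rise_ns, reflect_gain=0.0)
        print(f"tap {tap:.0f} ns: open end {open_end:.3f} ns, "
              f"no reflection {no_reflection:.3f} ns")
```

With the open end, every tap crosses zero at the one-way cable delay. Without a reflection, the crossing time simply follows the tap position. If the ramp is shorter than the round trip from a tap, the summed waveform has a flat region at zero and the model no longer applies.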
We have built a simple demo tester with daughter cards attached to FPGA evaluation cards as shown in Fig. 6(a) . An integrator in the right-most daughter card translates logic pulses from the FPGA into a trapezoid waveform and feeds the waveform to the other modules.
Timing skews of the comparator outputs on different daughter cards are measured and plotted in Fig. 6(b) . As expected, the timing skews are mostly eliminated when the cable is un-terminated.
As we pointed out in our previous work, cable loss makes the reflected signal smaller than the transmitted signal, which results in residual timing skews when the cable is long. For practical applications, when a few modules are daisy-chained together using cable segments with a total length < 6 ns, clock skews at the inputs of the modules can be controlled within 50 ps. When high-quality cables with low loss are used, even smaller skews can be anticipated.
V. CONCLUSIONS
The CAKE clocking scheme has been validated in our tests. The cable length variations can be monitored with the signals collected at the clock fan-out module. From the cable lengths measured continuously at the sending end, it is possible to adjust the clock phase to compensate for the cable delay variation.
Trapezoidal clock pulses in an open-ended lossless cable produce isochronous zero-crossings at the taps of the cable. This phenomenon can be used for clock synchronization with timing skews cancelled naturally. Variations of trapezoidal clocking are studied for cable loss compensation. Timing skews well below 50 ps have been achieved with typical cables.
The CAKE clocking scheme is purely digital and is not sensitive to signal attenuation in the cable, whereas the trapezoidal clocking scheme requires analog circuits and the cable loss must be accounted for during the design. On the other hand, the trapezoidal clocking scheme supports a daisy-chained topology, i.e., one output port of the clock fan-out module is able to drive several receiving modules, while the CAKE clocking scheme supports point-to-point connections only.
The two clocking schemes fit different applications with different emphases.
Roughly speaking, the trapezoidal clocking is suitable for distributing an isochronous clock within a crate while CAKE scheme is suitable for clocking between several crates and racks.
