# An FPGA Wave Union TDC for Time-of-Flight Applications

# Jinyuan Wu

Abstract: An 18-channel time-of-flight (TOF) grade time-to-digital converter (TDC) has been implemented in a low-cost FPGA device. The TDC has the following unique features. (1) The time recording structure of the TDC is based on the "wave union TDC" we developed in our previous work. A leading edge of the input hit launches a bit pattern, or wave union, into the delay chain/register array structure, which yields two usable measurements. The two measurements effectively sub-divide each other's timing bins, especially the "ultra-wide bins" caused by the FPGA logic array block (LAB) structure, and improve measurement precision in terms of both maximum bin width and RMS resolution. A coarser measurement on the input signal trailing edge is also provided for time-over-threshold (TOT) applications. (2) The TDC supports advanced timing reference distribution schemes that are superior to conventional common start/stop schemes. The TDC has 16 regular measurement channels plus two channels for timing reference. The timing reference is established with multiple measurements rather than a single-shot common start/stop. An advanced scheme, the mean-timing approach, even eliminates the need for high quality timing distribution media. (3) The ASIC-like encapsulation of the FPGA TDC significantly shortens the learning curve for potential users while maintaining flexibility for various applications. Necessary digital post-processing functions, including semi-continuous automatic calibration, data buffering and data link jam prevention logic, are integrated into the firmware to provide a turn-key solution for users.

## Index Terms-Front End Electronics, TDC, FPGA Firmware

## I. INTRODUCTION

Carry chain structures existing in FPGA families can be used for time-to-digital conversion (TDC) purposes [1-7]. A special feature of the FPGA TDC is its large differential nonlinearity (DNL), which appears as a large variation in the apparent width of the TDC bins. The most significant origin of the DNL is the logic array block (LAB) structure. When the input signal in the carry chain passes across the LAB boundaries (and also the half-LAB boundaries in some FPGA families), the extra delays cause periodic "ultra-wide bins".

In our previous work [7], an approach called the "wave union TDC" was developed to sub-divide the ultra-wide bins and to improve measurement resolution. The key part of the wave union TDC is the "wave union launcher", which creates a pulse train or "wave union" with several 0-to-1 or 1-to-0 logic transitions for each input hit. The wave union is fed into the TDC delay chain/register array structure, making multiple measurements.

The author is with Fermi National Accelerator Laboratory, Batavia, IL 60510 USA (phone: 630-840-8911; fax: 630-840-2950; e-mail: jywu168@fnal.gov).



Fig. 1. Block diagram of the 18 channel wave union FPGA TDC

This document describes an 18-channel Wave Union TDC implemented in a low cost Altera Cyclone II FPGA device (EP2C8T144C6). The TDC has the following unique features:

(1) The RMS resolution between leading edges of two input channels is 25 ps, taking advantage of multiple measurements of the wave union. A coarser measurement (without wave union) on the input signal trailing edge is also provided for time-over-threshold (TOT) applications.

(2) The TDC supports advanced timing reference distribution schemes that are superior to conventional common start/stop schemes. The TDC has 16 regular measurement channels plus two channels for timing reference. The timing reference is an average of multiple measurements rather than a single-shot common start/stop. An advanced scheme, the mean-timing approach, even eliminates the need for high quality timing distribution media.

(3) The ASIC-like encapsulation of the FPGA TDC significantly shortens the learning curve for potential users while maintaining flexibility for various applications. Necessary digital post-processing functions, including semi-continuous automatic calibration, data buffering and data link jam prevention logic, are integrated into the firmware to provide a turn-key solution for users.

Manuscript received November 15, 2009. This work was supported in part by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the United States Department of Energy and University of Chicago's Fermilab Strategic Collaborative Initiative.

## II. THE WAVE UNION LAUNCHER

A wave union scheme described in Reference [7], the "wave union launcher A", is used in this TDC firmware. The wave union launcher A belongs to the Finite Step Response (FSR) type. It generates a pulse train with three logic transitions of which two are encoded.

The wave union launcher in each channel of the TDC is implemented in a Logic Array Block with 12 logic elements. It feeds the remaining 48 cells of a 64-cell carry chain/register array. (The first 4 bits in the array are used for diagnostic purposes.) The bit pattern of the array is shown in Fig. 2.

[Figure 2: terminal capture of 16 consecutive 64-bit register array snapshots, one per line, framed by hexadecimal column rulers (FEDCBA9876543210, repeated) and the firmware banner "Wave Union TDC WUTDC09a FPGA jywu168@fnal.gov".]

Fig. 2. Bit pattern of wave union launcher response to input pulses.

Each line in the bit pattern is a snapshot of the register array driven by a 387.5 MHz clock signal, which is synthesized from a 100 MHz external clock by multiplying by a factor of 31/8 using the phase-locked loop (PLL) circuit inside the FPGA. The frequency is chosen for even relative timing coverage of the test signals generated in another part of the FPGA.
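The clock figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the variable names are ours and are not taken from the firmware.

```python
from fractions import Fraction

REF_CLOCK_MHZ = Fraction(100)     # external reference clock
PLL_RATIO = Fraction(31, 8)       # PLL multiplies by 31 and divides by 8

sampling_mhz = REF_CLOCK_MHZ * PLL_RATIO        # register array clock
encoder_mhz = sampling_mhz / 2                  # half-rate clock used by the encoder
period_ps = Fraction(1_000_000) / sampling_mhz  # one sampling period in picoseconds

print(float(sampling_mhz))          # 387.5
print(float(encoder_mhz))           # 193.75
print(round(float(period_ps), 1))   # 2580.6
```

One sampling period of about 2580.6 ps is the time span that the delay chain has to cover for each channel.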

When the input is low, the 64-bit array stays in its initial state, as shown in the first two lines. A bit pattern of 0's and 1's forming the wave union is shown from the 4th column to the 16th column in each line. The remaining 48 bits further down the carry chain, i.e., columns 17 to 64, are held at 1.

Between the second and third clock cycles, the input becomes high and the wave union is launched into the carry chain. The wave union is captured by the clock edge, as shown in the third line. The position of the wave union represents the arrival time of the input signal. The earlier the arrival time, the further right the wave union is captured.

When the wave union is captured, the register array clock is disabled for one or two clock cycles. Therefore, the wave union pattern in the fourth line remains unchanged. This feature guarantees that the wave union pattern stays in the register array for two clock cycles, allowing the encoder in the later stage to operate with a clock of half frequency (193.75 MHz) to simplify the design and reduce power consumption.

When the input stays high, the entire array is all 0's as shown in the fifth and sixth lines.

When the input returns to low, the wave union launcher and the carry chain recover to the initial state. The recovery process is captured as shown in the seventh line. A 0-to-1 transition in the 48-bit carry chain represents the time of the input falling edge. Multiple transitions exist only in the wave union launched by the input rising edge. Only a single measurement is available for the falling edge as in regular non-wave union TDC schemes. Therefore a coarser timing resolution for falling edge is anticipated.

The falling edges also cause the register array to be disabled for one or sometimes two clock cycles. When the register array is enabled again, the initial condition is restored, as shown in the ninth and tenth lines.

Another pulse is captured as shown in the remaining six lines. The two pulses shown in this example are 10 ns wide and separated by 10 ns.

## III. THE ENCODER BLOCK

As mentioned earlier, the register array clock is disabled for one or two clock cycles to allow the encoder in the later stage to operate with a clock of half frequency in this design. If necessary, the encoder can certainly be designed to operate at the full frequency of 387.5 MHz. When the encoder is operating at the same frequency as the register array, even finer double pulse resolution is achievable.

However, in the FPGA TDC, the register array is driven at a frequency as high as possible to shorten the length of the array. It is relatively difficult to drive the encoders at the same high frequency, especially when integrating more channels into a single device, since the encoders usually need more layers of logic.

In this work, the encoder is designed as a pipeline receiving bit patterns at 387.5 MHz and outputting time values at 193.75 MHz. Disabling the register array degrades the high rate performance. Fortunately, the double pulse resolution of the current design meets or exceeds the requirements of many high rate applications.

Another design detail that should be mentioned here is that the encoder must be "bubble proof". In the ideal cases, the 0-1 transitions recorded by the register array are clean thermometer codes, like 000001111. However, "bubbles" at the transition edges like 000010111 may happen due to uneven propagation delays in the FPGA structure. The encoder should be designed to output a reasonable value when the transition edge bubbles occur.
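As an illustration of what "bubble proof" can mean in practice, the sketch below locates the transition by counting zeros instead of searching for the first 1. This is one common approach and is our assumption; the text does not state which encoding logic the firmware actually uses, and `encode_transition` is a hypothetical name.

```python
def encode_transition(bits: str) -> int:
    """Return the 0-to-1 transition position of a thermometer code.

    Counting the zeros gives the same answer as an edge search for a clean
    code and stays stable when a bubble swaps neighbouring bits near the edge.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    return bits.count("0")


clean = "000001111"    # clean thermometer code from the text
bubbly = "000010111"   # bubble example from the text
print(encode_transition(clean), encode_transition(bubbly))  # 5 5
```

A naive first-1 search would report position 4 for the second pattern, one bin away from the clean result.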

For falling input edges, the outputs of the encoder buffer are 6-bit raw time values corresponding to the 0-to-1 transition positions, with possible values up to 63. For rising input edges, the outputs are 7-bit raw time values, each the sum of the positions of the two 1-to-0 transitions, with possible values up to 127. In addition to the raw time value, a data valid signal and a rising/falling edge indicator are also output to the hit buffer and automatic calibration block.

## IV. THE HIT BUFFER & AUTOMATIC CALIBRATION BLOCK

An internal dual port memory block, M4K, containing 4608 bits is allocated to each channel for temporary hit data storage and the automatic calibration look-up tables (LUT). Each memory block is partitioned into three areas of 1152, 2304 and 1152 bits for storing hits, the rising edge look-up table (LUTR) and the falling edge look-up table (LUTF), respectively.

The data storage area is organized as 64 × 18-bit words. These 64 words are further divided into 8 blocks of 8 words each. In each time slice of 2.64 µs, one of the blocks is addressed for writing. During this period, up to 8 hits per channel can be stored in the buffer. The jam prevention logic inhibits further writing to the memory block if more than 8 hits arrive in a channel during the 2.64 µs time slice; this capacity should be sufficient for many high rate applications. Each 18-bit word stores a hit with 7 bits of raw TDC time, a one-bit rising/falling edge indicator and 10 bits of coarse time count. The hit writing operation is done via port A of the dual port memory.
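The 18-bit hit word and the 8-hit limit per time slice can be modeled as follows. The field widths come from the text, but the bit ordering inside the word is not specified there, so the layout below (raw time in the low bits, coarse count in the high bits) is purely an assumption, as are the function names.

```python
RAW_BITS, EDGE_BITS, COARSE_BITS = 7, 1, 10   # 7 + 1 + 10 = 18 bits
MAX_HITS_PER_SLICE = 8                        # one 8-word block per time slice


def pack_hit(raw: int, rising: bool, coarse: int) -> int:
    """Pack one hit into an 18-bit word (hypothetical bit layout)."""
    if not 0 <= raw < (1 << RAW_BITS):
        raise ValueError("raw time must fit in 7 bits")
    if not 0 <= coarse < (1 << COARSE_BITS):
        raise ValueError("coarse count must fit in 10 bits")
    return (coarse << (RAW_BITS + EDGE_BITS)) | (int(rising) << RAW_BITS) | raw


def unpack_hit(word: int):
    """Inverse of pack_hit: return (raw, rising, coarse)."""
    raw = word & ((1 << RAW_BITS) - 1)
    rising = bool((word >> RAW_BITS) & 1)
    coarse = (word >> (RAW_BITS + EDGE_BITS)) & ((1 << COARSE_BITS) - 1)
    return raw, rising, coarse


def store_slice(words):
    """Jam prevention: keep at most 8 hits per channel per time slice."""
    return list(words)[:MAX_HITS_PER_SLICE]


word = pack_hit(raw=98, rising=True, coarse=100)
print(unpack_hit(word))                 # (98, True, 100)
print(len(store_slice([word] * 11)))    # 8
```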

The remaining operations are performed via port B of the memory in periods of 5.28 µs called time frames. In each time frame, which is exactly two time slices long, two blocks storing up to 16 hits in total are addressed and read out in sequence. The raw TDC bits of the hit words are used to address the calibration look-up table LUTR or LUTF, depending on whether the hit represents a rising edge or a falling edge. The output of the look-up table is a 9-bit fine time value. The fine time value is concatenated with the coarse time count to form a 20-bit hit time, which is piped out to the output buffer block channel by channel. After the hit data have been read out, the memory blocks are cleared to 0 in preparation for storing new hit data.

The propagation delay of a delay cell depends on temperature and power supply voltage. In an ASIC TDC, it is possible to compensate for the delay variation using an analog method, i.e., to generate a control voltage from the phase difference between an external crystal oscillator and the internal ring oscillator and to use the control voltage to fine tune the internal cell delays via negative feedback.

In an FPGA TDC, analog compensation is not convenient and digital calibration is preferable. The principle of the automatic calibration functional block we developed in our work is shown in Fig. 3.



Fig. 3. The automatic calibration functional block

After power up or system reset, each TDC input is fed with calibration hits to book the DNL histogram and then generate the calibration lookup table LUT. The timing of these hits should have no correlation with the clock signal driving the TDC, so the hits should be generated from an independent oscillator. It is also possible to use real event hits as calibration hits if the hit rate of the real events is sufficiently high. The calibration block updates the DNL histogram and the calibration LUT automatically and semi-continuously as the real events flow through.

The DNL histogram and the calibration LUT are actually implemented in the same dual-port memory block as mentioned earlier.

For the rising edge calibration table LUTR, the 2304 bits are organized into 2 blocks, 128-word x 9-bit each, one for the DNL histogram and the other for the LUT. The histogram booking and LUT updating processes are controlled by a finite state machine with the following steps:

1. Clearing the memory area.
2. Booking the DNL histogram.
3. Integrating the calibration LUT.

The input from the TDC encoder for rising edge hits is a 7-bit number, i.e., the sum of two 6-bit numbers, and a 128-bin DNL histogram is booked while the hit data are read out. If the total number of hits is known, then the count in each bin can be used as its bin width. For example, if 4095 hits are booked into the histogram and these hits are assumed to be evenly spread over 2500 ps, the period of a 400 MHz clock driving the TDC, then the width of a bin with N counts is N × 2500 ps / 4095 ≈ N × 0.61 ps.
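The bin-width estimate in this example is a code density calculation and can be sketched in a few lines. The function name and the toy four-bin histogram are illustrative assumptions; the firmware books 128 bins in on-chip memory.

```python
def bin_widths_ps(histogram, period_ps):
    """Estimate each bin width from its share of uniformly distributed hits."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram is empty")
    return [count * period_ps / total for count in histogram]


# Toy histogram with 4095 hits in total, spread over a 2500 ps period.
hist = [1000, 2000, 95, 1000]
widths = bin_widths_ps(hist, 2500.0)
print([round(w, 1) for w in widths])   # about 0.61 ps per count
```

The widths always add up to the clock period, so a wide bin necessarily takes time range away from its neighbours.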

Once all hits are booked into the histogram, a sequence controller starts to build the LUT in the FPGA internal memory. The LUT is integrated from the DNL histogram so that it outputs the actual time of the center of the addressed bin. The time value of the first bin is half of the width of the first bin. Then another half bin width of the first bin and the half bin width of the second bin are added to get the center time of the second bin. This sequence is repeated for remaining bins. It is crucial to calculate the calibration values for the centers of the bins. If all the bins had the same width, there would be no difference to calibrate either to the center values or the boundary values of the bins. But when the bin widths are different, calibrating to the center values reduces measurement error significantly.
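The integration to bin centers described above amounts to a running sum of bin widths with a half-bin offset. The sketch below uses floating-point values and a hypothetical function name; the firmware performs the equivalent integration on histogram counts in memory.

```python
def build_center_lut(widths):
    """Map each bin index to the time at the center of that bin."""
    lut = []
    lower_edge = 0.0
    for width in widths:
        lut.append(lower_edge + width / 2)   # center = lower edge + half width
        lower_edge += width                  # advance to the next bin's lower edge
    return lut


print(build_center_lut([10.0, 30.0, 20.0]))  # [5.0, 25.0, 50.0]
```

In this example the second entry is 5 + 5 + 15 = 25, i.e., the first center plus the remaining half of the first bin plus half of the second bin, as described in the text.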

The look up table for the falling edge LUTF is created similarly except the memory allocated is 1152 bits, which are organized into 2 blocks, 64-word x 9-bit each, for DNL histogram and LUT, respectively.
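The memory budget stated in this section can be verified directly: the three areas exactly fill the 4608-bit memory block. The constant names below are ours.

```python
M4K_BITS = 4608

hit_area = 64 * 18        # 64 hit words of 18 bits
lutr_area = 2 * 128 * 9   # rising edge: DNL histogram + LUT, 128 x 9 bits each
lutf_area = 2 * 64 * 9    # falling edge: DNL histogram + LUT, 64 x 9 bits each

total = hit_area + lutr_area + lutf_area
print(hit_area, lutr_area, lutf_area, total)   # 1152 2304 1152 4608
```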

The users may choose to update the look-up tables with the real events continuously or protect the look-up tables created during power up or special calibration period.

## V. THE BURST SUM BLOCK & ADVANCED TIMING SCHEMES

There are additional challenges beyond implementing the time measurement array when multiple channels are packed in an FPGA and multiple FPGA TDC devices are used for a large time-of-flight (TOF) system. A major task for such systems is to distribute the common timing reference. Two extra TDC channels are implemented in addition to the 16 regular channels for establishing the time reference. The two timing reference channels are assigned to two input banks with 8 regular channels in each bank. The effects of temperature variation on the reference channels and the regular channels are expected to largely cancel, given that the reference channels have the same time measurement structure as the regular channels.

## A. The Common Burst Timing Reference

The timing reference channel digitizes rising edges of the input pulses and the times of the pulses are fed into the burst sum block. The time values are summed inside the burst sum block and the result is output as part of the time frame header word.

The inputs of the timing reference channels are bursts of pulses which are automatically recognized by the burst sum block. By setting operating mode pins or internal registers, the users can choose to sum 1, 2, 4 or 8 time values in a burst.

In the special case when the burst is a single pulse, the common timing reference is similar to the traditional common start scheme. In this case, the arrival time of the common timing reference is reported in the data header, and the time intervals between the common start signal and the individual channel hits can be calculated as the differences between them. Note that the common "start" pulse here does not need to arrive earlier than the channel hits, nor are the channels "stopped" after being hit.

The primary motivation for using timing reference channels is to support advanced timing distribution schemes. In conventional common start/common stop schemes, the common timing signal is distributed in a single shot, suffering from circuit jitter and binning errors in the TDC. In our design, the reference time inside the FPGA is an average of multiple (up to 8) measurements. Multiple measurements provide finer timing resolution than a single shot.

In the common burst mode, a burst of 2, 4, or 8 pulses is used as the common timing reference signal. The average of the pulse rising edge times is reported in the time frame data header. With an average of 4 measurements, for example, timing jitter is reduced by half and an additional bit of timing resolution is anticipated.
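The factor of two quoted for 4 measurements follows from the usual 1/√N scaling of the error of a mean, which holds when the individual measurement errors are independent. The sketch below illustrates the burst averaging and this scaling; both function names are hypothetical, and the independence of the errors is an assumption.

```python
import math

ALLOWED_BURST_SIZES = (1, 2, 4, 8)


def burst_mean(times, n):
    """Average the first n rising edge times of a burst (n = 1, 2, 4 or 8)."""
    if n not in ALLOWED_BURST_SIZES:
        raise ValueError("burst size must be 1, 2, 4 or 8")
    if len(times) < n:
        raise ValueError("not enough pulses in the burst")
    return sum(times[:n]) / n


def averaged_jitter(sigma_single, n):
    """RMS jitter of the mean of n independent measurements."""
    return sigma_single / math.sqrt(n)


print(burst_mean([100, 104, 98, 102], 4))   # 101.0
print(averaged_jitter(25.0, 4))             # 12.5
```

Restricting the burst size to powers of two means the division can be done with a simple bit shift in hardware.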

## B. The Mean Timing Scheme of Common Timing Reference

A very attractive timing distribution method is the mean timing scheme. The mean timing scheme is a special case of the common burst mode.

The timing distribution system drives a multi-drop copper twisted pair cable from both ends, as shown in Fig. 4. The left and right end drivers are alternately enabled and drive pulses that travel from the left or the right end. There is no need to synchronize the pulses. The pulses from the left and right drivers can be sent at arbitrary times. The differential signals are received in each TDC module/FPGA and the arrival times are digitized.



Fig. 4. The signal distribution system for the mean timing scheme

The mean timing burst has 8 pulses as shown in Fig. 5. The receivers on each TDC FPGA receive the burst with 4 pulses delayed from left path and 4 from right path. The traces represent pulses seen at different TDC modules. The arrival times at different modules are different, but the mean times of the 8 pulses as indicated with the red dots are the same.



Fig. 5. The pulse burst of the mean timing scheme

The only required condition in this scheme is that the cable segments have the same propagation delays for left-going and right-going pulses. There is no requirement on the actual values of the delays or on their temperature variations, and therefore no requirement to use high quality media. Any moderate grade media such as Cat-5 twisted pair cables or even ribbon cables can serve this purpose. The TDC supports either common burst mode or mean time mode without any changes in the firmware.
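The reason the mean time is the same at every module can be seen with a small numerical model. A module located at delay x from the left end of a cable with total delay T receives left-launched pulses x later and right-launched pulses T - x later, so x cancels in the sum. All numbers and names below are made-up illustrations, not measured values.

```python
def mean_arrival_time(x_ns, cable_ns, left_launch_ns, right_launch_ns):
    """Mean arrival time of a burst at a module x_ns from the left end."""
    from_left = [t + x_ns for t in left_launch_ns]                 # left-going path
    from_right = [t + (cable_ns - x_ns) for t in right_launch_ns]  # right-going path
    arrivals = from_left + from_right
    return sum(arrivals) / len(arrivals)


CABLE_NS = 50.0
LEFT = [0.0, 100.0, 200.0, 300.0]     # 4 pulses driven from the left end
RIGHT = [400.0, 500.0, 600.0, 700.0]  # 4 pulses driven from the right end

for x in (5.0, 20.0, 45.0):           # three modules along the cable
    print(x, mean_arrival_time(x, CABLE_NS, LEFT, RIGHT))   # mean is 375.0 each time
```

The cancellation relies only on equal numbers of pulses from each end and on equal propagation delay in both directions, which is the condition stated above.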

## VI. THE OUTPUT DATA

The valid hits from all channels are packed together for output, and an additional data limit is applied at this stage. Up to 24 hits from all 16 channels during a 5.28 µs time frame are packed together with a header and sent out to the first LVDS output port "cUL[0]" running at 193.75 Mbits/s using DC-balanced 8B/10B coding. If there are 25 to 48 hits in a time frame, the remaining hits are output from the second LVDS port cUL[1]. Similarly, if there are 49 to 72 hits or 73 to 96 hits in a time frame, they are output from the third or the fourth port, dUL3 or dUL4p.
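The staged use of the four output ports can be summarized as filling 24-hit slots port by port. In the sketch below, discarding hits beyond the 96-hit total is our reading of the chip-level output capacity limit, and the function name is hypothetical.

```python
HITS_PER_PORT = 24
NUM_PORTS = 4


def hits_per_port(n_hits):
    """Distribute the hits of one time frame over the four LVDS ports."""
    remaining = min(n_hits, HITS_PER_PORT * NUM_PORTS)   # capacity: 96 hits
    allocation = []
    for _ in range(NUM_PORTS):
        take = min(remaining, HITS_PER_PORT)
        allocation.append(take)
        remaining -= take
    return allocation


print(hits_per_port(30))    # [24, 6, 0, 0]
print(hits_per_port(100))   # [24, 24, 24, 24]
```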

The data output is an un-triggered continuous sequence with no dead-time nominally. All hits are digitized and output as long as the conditions of (1) the double hit separation, (2) the single channel hit limit in each time slice and (3) the output capacity of the entire chip are fulfilled. The scheme of the staged jam prevention is suitable for systems with high instantaneous hit rate but relatively low average rate.

This arrangement permits the users to choose an appropriate output capacity to fit their specific applications. An application with a low hit rate may use only one or two LVDS output ports for simplicity of the system, while a high rate application may use the full output capacity provided by all four ports. A sample output data block monitored by an on-chip 10B/8B decoder and RS232 interface is shown in Fig. 6.

| Word 0   | Word 1   | Word 2   | Word 3   | Word 4   | Word 5   | Word 6   | Word 7   |
|----------|----------|----------|----------|----------|----------|----------|----------|
| 3C3A2D29 | 54444331 | 464E414C |          |          |          |          |          |
| 2473BB53 |          |          |          |          |          |          |          |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |
| 25A19126 |          |          |          |          |          |          |          |
| 06A190C5 | 16A19153 | 26A190DF | 36A19087 | 46A190C1 | 56A1903B | 66A18FEF | 76A18F5B |
| 86A190A5 | 96A19077 | A6A1918B | B6A1911D | C6A1913B | D6A18FF3 | E6A19009 | F6A18F8F |
| 06A1BF2E | 16A1BFEA | 26A1BF3A | 36A1BE9E | 46A1BEF2 | 56A1BE9A | 66A1BE26 | 76A1BD76 |
| 26CF6716 |          |          |          |          |          |          |          |
| 06CF66B7 | 16CF6743 | 26CF66CF | 36CF6669 | 46CF66AB | 56CF662F | 66CF65E7 | 76CF6559 |
| 86CF668D | 96CF665B | A6CF6775 | B6CF6711 | C6CF672D | D6CF65EB | E6CF6601 | F6CF658F |
| 06CF94FA | 16CF95BE | 26CF9506 | 36CF947A | 46CF94BE | 56CF9476 | 66CF9412 | 76CF934A |
| 27FD3D1D |          |          |          |          |          |          |          |
| 06FD3CC5 | 16FD3D41 | 26FD3CCD | 36FD3C69 | 46FD3CB3 | 56FD3C3B | 66FD3BEF | 76FD3B6D |
| 86FD3C99 | 96FD3C63 | A6FD3D75 | B6FD3D1D | C6FD3D2D | D6FD3BF3 | E6FD3C09 | F6FD3B8F |
| 06FD6B2E | 16FD6BEA | 26FD6B3A | 36FD6A9E | 46FD6ADE | 56FD6A82 | 66FD6A26 | 76FD694E |
| 27FD3D1D |          |          |          |          |          |          |          |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |
| 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 | 00000000 |

The terminal capture ends with a hexadecimal column ruler and the firmware banner "Wave Union TDC WUTDC09a FPGA jywu168@fnal.gov".

Fig. 6. The output data

A total of 512 bytes are output in a data block from each LVDS output port.

The data block header contains the first 12 bytes with the following byte definitions:

- The first byte 0x3C is the K28.1 comma code in the 8B/10B encoding table which is used to align the data block for the receiving end.
- The next 3 bytes are the ASCII character string ":-)".
- The subsequent 8 bytes are ASCII character strings "TDC1" and "FNAL" for the readout port cUL[0] or dUL3. For readout port cUL[1] and dUL4, these two strings are "Wave" and "Unio".

The users are allowed to replace these bytes with other information when the data is collected in DAQ system.

Each data block contains 5 time frames, representing a total time period of 5 × 5.28 µs. In each time frame, a 32-bit time frame header is followed by 24 hit data words, also 32 bits each.
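The block size and the link rate are mutually consistent, as the following arithmetic shows: the header plus five time frames add up to 512 bytes, and sending them with 8B/10B coding at 193.75 Mbits/s takes about as long as the five time frames they describe. The constant names are ours.

```python
HEADER_BYTES = 12
FRAMES_PER_BLOCK = 5
WORDS_PER_FRAME = 1 + 24          # one frame header word plus 24 hit words
BYTES_PER_WORD = 4
LINE_RATE_MBPS = 193.75           # line bits per microsecond
TIME_FRAME_US = 5.28              # rounded value quoted in the text

block_bytes = HEADER_BYTES + FRAMES_PER_BLOCK * WORDS_PER_FRAME * BYTES_PER_WORD
line_bits = block_bytes * 10      # 8B/10B: 10 line bits per data byte
transmit_us = line_bits / LINE_RATE_MBPS

print(block_bytes)                               # 512
print(round(transmit_us, 2))                     # 26.43
print(round(FRAMES_PER_BLOCK * TIME_FRAME_US, 2))  # 26.4
```

Because the output keeps pace with the incoming time frames, the stream can run continuously without accumulating a backlog.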

The time frame header words have the following bit definitions:

- Bit 0 to 9: Fine time with full range of 2580 ps.
- Bit 10 to 19: Coarse time with full range of 2.64 µs.
- Bit 20 to 23: Readout time slice.
- Bit 24 to 31: Pulse count.

The time frame header word outputs the average of the leading edge times in a burst with 1, 2, 4 or 8 pulses. The burst summing circuit keeps a counter of the number of pulses detected, and 8 bits of this counter are output. The pulse count is used to align events across multiple TDC chips.
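The header bit fields listed above can be extracted with shifts and masks. The sketch below applies them to the word 0x25A19126, transcribed from the second time frame header in Fig. 6; the function name and dictionary keys are our own.

```python
def decode_frame_header(word):
    """Split a 32-bit time frame header word into its bit fields."""
    return {
        "fine": word & 0x3FF,                # bits 0-9
        "coarse": (word >> 10) & 0x3FF,      # bits 10-19
        "time_slice": (word >> 20) & 0xF,    # bits 20-23
        "pulse_count": (word >> 24) & 0xFF,  # bits 24-31
    }


print(decode_frame_header(0x25A19126))
# {'fine': 294, 'coarse': 100, 'time_slice': 10, 'pulse_count': 37}
```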

The hit data words have the following bit definitions:

- Bit 0, 1: If they are both 0, the data word is not a valid hit. If the hit is a rising edge, Bit 0 = 1 and Bit 1 is the LSB of the fine time. If the hit is a falling edge, Bit 0 = 0 and Bit 1 = 1.
- Bit 2 to 9: Fine time with full range of 2580 ps and Bit 2 is the LSB.
- Bit 10 to 19: Coarse time with full range of 2.64 µs.
- Bit 20 to 27: Readout time slice. Bit 10 to 27 can be viewed as the coarse time with wider range.
- Bit 28 to 31: Channel ID.

The scales of Bit 0 to 23 are identical to those of the same bits in the time frame header word. Bits 21 to 27 of the coarse time are redundant and are provided for run-time checking purposes.
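A decoder for the hit data word follows directly from the bit definitions. The sketch below is checked against two words transcribed from Fig. 6, a rising edge hit (0x06A190C5) and a falling edge hit (0x06A1BF2E) of channel 0; the function name and the returned dictionary are illustrative assumptions.

```python
def decode_hit(word):
    """Decode a 32-bit hit word; return None if it is not a valid hit."""
    if word & 0x3 == 0:
        return None                      # bits 0 and 1 both 0: not a valid hit
    rising = bool(word & 0x1)
    if rising:
        fine = (word >> 1) & 0x1FF       # 9-bit fine time, bit 1 is the LSB
    else:
        fine = (word >> 2) & 0xFF        # 8-bit fine time, bit 2 is the LSB
    return {
        "channel": (word >> 28) & 0xF,   # bits 28-31
        "edge": "rising" if rising else "falling",
        "fine": fine,
        "coarse": (word >> 10) & 0x3FF,      # bits 10-19
        "time_slice": (word >> 20) & 0xFF,   # bits 20-27
    }


print(decode_hit(0x06A190C5))
# {'channel': 0, 'edge': 'rising', 'fine': 98, 'coarse': 100, 'time_slice': 106}
print(decode_hit(0x06A1BF2E))
# {'channel': 0, 'edge': 'falling', 'fine': 203, 'coarse': 111, 'time_slice': 106}
print(decode_hit(0x00000000))   # None
```

Note that the rising edge fine time carries one more bit than the falling edge fine time, so the two values are on scales that differ by a factor of two.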

## VII. THE ASIC-LIKE ENCAPSULATION

A feature of this design is the ASIC-like encapsulation. FPGA TDCs are known to be highly flexible and suitable for many different user projects. However, FPGA TDC design requires care in various aspects beyond typical FPGA digital design practice, which may mean a long learning curve for potential users. Our firmware is designed as if the FPGA were used as an ASIC TDC, providing a turn-key solution for users in a wide range of applications.

## A. Register-Setting-Free Operations

The full feature set of the device is accessed by controlling and monitoring it via an on-chip RS232 serial port. However, the device can also operate stand-alone without any external control support, which shortens the learning curve and eliminates some system overhead at the starting stage. Many common operating configurations can be achieved without setting registers, which allows the users to use the device with minimum effort.

When the chip is powered up, it runs an initialization sequence for about 45 seconds, during which the calibration look-up table is established. At the end of the initialization sequence, the logic levels of several operation mode pins are sampled to set several internal registers, which bring the device into the desired operating mode. The users may tie the pins to ground or leave them unconnected, in which case they are weakly pulled up to a high logic level by resistors inside the chip. These pins set the most essential properties of the operation: outputting the times of both rising and falling edges or only rising edges; outputting raw data or calibrated data; allowing the calibration look-up table to keep updating or protecting the current look-up table; and the number of pulses in the common timing burst.

In this situation, the chip operates stand alone, free of register-setting.

## B. Jumper-Free Operations

On the other hand, with the support of the RS232 port, the users are allowed to set the registers inside the device for a broader range of operation options, including overwriting the configurations sampled from the mode setting pins at the end of the initialization. In this situation, the jumper settings on the board are disregarded, and the device can be considered effectively jumper-free.

## VIII. TEST RESULTS

The 18-channel Wave Union TDC is implemented in a low-cost Altera Cyclone II FPGA device (EP2C8T144C6). The test module is shown in Fig. 7; the EP2C8T144C6 is the small 144-pin chip in the middle of the module.



Fig. 7. The test module

Differential LVDS pulses are input to the FPGA top and bottom I/O banks. Each bank contains 8 regular TDC channels and a timing reference channel. A typical rising edge time difference histogram of two channels is shown in Fig. 8.



Fig. 8. A typical rising edge time difference histogram of two channels

Each bin in the histogram is 20.16 ps wide. The RMS resolution shown above is 24.8 ps.
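For readers who wish to reproduce this kind of figure from their own data, the RMS of a binned time-difference distribution can be computed as sketched below. The function name `histogram_rms_ps` is an assumption, and the counts in the usage line are synthetic, not the data of Fig. 8; only the 20.16 ps bin width comes from the text.

```python
import math

BIN_WIDTH_PS = 20.16  # histogram bin width quoted in the text


def histogram_rms_ps(counts, bin_width_ps=BIN_WIDTH_PS):
    """Standard deviation of a binned distribution, in picoseconds.

    Each entry of `counts` is the number of events in one bin; events are
    treated as sitting at the bin center.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    centers = [(i + 0.5) * bin_width_ps for i in range(len(counts))]
    mean = sum(c * n for c, n in zip(centers, counts)) / total
    variance = sum(n * (c - mean) ** 2 for c, n in zip(centers, counts)) / total
    return math.sqrt(variance)


# Synthetic three-bin example (not measured data).
rms = histogram_rms_ps([1, 2, 1])
```

For the synthetic counts `[1, 2, 1]` the result is the bin width divided by the square root of two, which serves as a quick sanity check of the implementation.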

A typical falling edge time difference histogram of two channels is shown in Fig. 9.



Fig. 9. A typical falling edge time difference histogram of two channels

The RMS resolution shown above is 78.8 ps. For the falling edge, the TDC provides a single 0-to-1 transition measurement without the wave union. Therefore, the time resolution is coarser than that of the rising edges.

A typical pulse width RMS resolution is 50 ps. The pulse width is the difference between the times of the falling edge and the rising edge. As anticipated, its timing resolution lies between the two cases given above.
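A rough cross-check of this ordering can be sketched under two assumptions that are not stated in the text: that the two channels in each time-difference histogram contribute equally and independently, and that the rising and falling edge errors within one channel are uncorrelated. The function names are hypothetical.

```python
import math


def single_channel_ps(diff_rms_ps):
    """Per-channel RMS if two equal, independent channels form the difference."""
    return diff_rms_ps / math.sqrt(2)


def pulse_width_rms_ps(rise_ps, fall_ps):
    """Quadrature sum of uncorrelated rising and falling edge errors."""
    return math.hypot(rise_ps, fall_ps)


rise = single_channel_ps(24.8)          # about 17.5 ps
fall = single_channel_ps(78.8)          # about 55.7 ps
width = pulse_width_rms_ps(rise, fall)  # about 58.4 ps
```

Under these assumptions the estimate lies between the two quoted time-difference resolutions and is of the same order as the measured 50 ps; exact agreement is not expected, since the edge errors in a real channel may be correlated.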

## IX. CONCLUSION

An 18-channel Wave Union TDC implemented in a low cost Altera Cyclone II FPGA device is documented in this paper.

The device is designed with a user-friendly, ASIC-like encapsulation approach. The TDC fits a wide range of applications with convenient features.

The device also supports advanced common timing reference schemes, such as the common burst scheme and the mean timing scheme, along with the traditional common start scheme.

A timing resolution of 25 ps for rising edges meets or exceeds the requirements of most time-of-flight (TOF) applications using scintillation counters instrumented with photomultiplier tubes. The pulse width measurement capability enables the use of time-over-threshold (TOT) information for time walk corrections.

## ACKNOWLEDGEMENT

The author wishes to thank Mike Albrow, Erik Ramberg, Anatoly Ronzhin, Robert DeMaat, Sten Hansen, Rajendran Raja, Holger Meyer of Fermilab, Fukun Tang, Henry Frisch, Jean-Francois Genat, Chien-Min Kao of University of Chicago, Qi An of University of Science and Technology of China, William Moses, Seng Choong, Chinh Vu and Qiyu Peng of Lawrence Berkeley Lab for their helpful input over the years.

## REFERENCES

- [1] A. Amiri, A. Khouas & M. Boukadoum, "On the Timing Uncertainty in Delay-Line-based Time Measurement Applications Targeting FPGAs," in *Circuits and Systems*, 2007, *IEEE International Symposium on*, 27-30 May 2007 Page(s): 3772 - 3775.
- [2] J. Song, Q. An & S. Liu, "A high-resolution time-to-digital converter implemented in field-programmable-gate-arrays," in *IEEE Transactions on Nuclear Science*, 2005, Pages 236 - 241, vol. 53.
- [3] M. Lin, G. Tsai, C. Liu, S. Chu, "FPGA-Based High Area Efficient Time-To-Digital IP Design," in *TENCON 2006. 2006 IEEE Region 10 Conference*, Nov. 2006 Page(s):1 – 4.
- [4] J. Wu, Z. Shi & I. Y. Wang, "Firmware-only implementation of time-to-digital converter (TDC) in field programmable gate array (FPGA)," in *Nuclear Science Symposium Conference Record*, 2003 IEEE, 19-25 Oct. 2003 Page(s):177 - 181 Vol. 1.
- [5] S. S. Junnarkar, et al., "An FPGA-based, 12-channel TDC and digital signal processing module for the RatCAP scanner," in *Nuclear Science Symposium Conference Record*, 2005 IEEE, Volume 2, 23-29 Oct. 2005 Page(s):919 - 923.
- [6] M. D. Fries & J. J. Williams, "High-precision TDC in an FPGA using a 192 MHz quadrature clock," in *Nuclear Science Symposium Conference Record*, 2002 IEEE, 10-16 Nov. 2002 Page(s):580 - 584 vol. 1.
- [7] J. Wu & Z. Shi, "The 10-ps wave union TDC: Improving FPGA TDC resolution beyond its cell delay", in *Nuclear Science Symposium Conference Record*, 2008 IEEE, 19-25 Oct. 2008 Page(s):3440 - 3446.
- [8] Altera Corporation, *Cyclone II Device Handbook*, (2007) available via: http://www.altera.com/