The authors report the development of an on-chip 500 MHz reference clock generator as a part of a clock manager for a field-programmable gate array. The generator is implemented in the form of an all-digital frequency locked loop (ADFLL) in architecture of low complexity and high modularity. For the development of the ADFLL, they propose a new circuit that employs two under-sampled 1-bit ΔΣ frequency-to-digital converters to convert a frequency difference into a proportional distributed pulsewidth. By the combination of the proposed circuit with a conventional phase-and-frequency detector, a frequency comparator is implemented and can indicate its two input frequency conditions, that is, (i) equal to, (ii) lower than or (iii) higher than. The ADFLL which adopts the proposed frequency comparator is implemented in a 90 nm CMOS technology. Consuming 2.64 mW from a 1.2 V supply, the ADFLL shows about 50 µs of locking time at the frequency accuracy of 99.2% while operating at 500 MHz and being driven by a 10 MHz reference clock.
Introduction
This paper reports the development of an on-chip 500 MHz reference clock generator as a part of a clock manager for a field-programmable gate array (FPGA) . To obtain such a high reference frequency, the clock generator functions as a frequency multiplier and is developed based on a coherent indirect frequency synthesiser [1] . In this work, both the FPGA and the clock manager developed are targeted to be implemented in a fully digital CMOS technology where no components such as inductors, metal-insulatar-metal (MIM) capacitors and varactors are available. Therefore the clock generator as a part of the clock manager has to be implemented in the form of either the all-digital phase-locked loop (ADPLL) or all-digital frequency-locked loop (ADFLL); in this paper, both ADPLL and ADFLL are termed all-digital frequency synthesiser (ADFS).
One of the approaches in developing an ADFS is using a direct digitalisation in which the analogue parts of a phase-locked loop (PLL) or a frequency-locked loop (FLL) are replaced by their digital equivalents. Although there have been a lot of computer-aided design tools for designing an ADPLL or an ADFLL based on the direct digitalisation approach, this approach inevitably involves relatively complex arithmetic operations [2] .
In this work, the ADFS is developed mainly by adopting a binary search algorithm (BSA) [3] , which is considered as a simpler approach. In this approach, the fundamental function of the ADFS can be described with the flowchart shown in Fig. 1a . In the figure, the ADFS consists basically of two functional blocks, that is, a frequency comparator (FC) and a control code generator (CCG). The FC is to detect whether the digitally controlled oscillator (DCO) frequency is equal to or different from the target frequency, and the CCG is to generate the control codes to the DCO so that the corresponding frequency change can be made. In addition to those main blocks, a decision making process is added at the bottom of the flowchart to meet the requirement that a loop in a flowchart may not continue endlessly; in the implementation, this block is represented with a power or reset switch.
In the flowchart presented in Fig. 1a , the control code for the DCO is assumed to be directly proportional to the DCO frequency so that the locking process of the ADFS can be described as follows: when the DCO frequency is the same as the target frequency, the control code will stay unchanged; when the DCO frequency is lower (higher) than the target frequency, the control code to the DCO will be increased (decreased) so as to bring the DCO frequency to the target frequency. In Fig. 1a , to distinguish the two decision-making subblocks in the FC, the subblock that detects whether both frequencies are the same is called a frequency estimator (FE) , and the other subblock that detects whether a frequency is lower or higher than the other one is called a frequency detector (FD) .
Although the algorithm represented by the flowchart in Fig. 1a flowchart is shown in Fig. 1b . In the latter figure, the FC is now supplemented with a functional block that will first process the frequency or phase error before the FC concludes whether the DCO frequency is equal to, lower than or higher than the target frequency. In an ADPLL developed based on direct digitalisation of a PLL, the FC is implemented basically by employing a phase-and-frequency detector (PFD), a time-to-digital converter and a digital filter [4] . An ADFLL developed based on a direct digitalisation of an FLL [5] works almost by the same principle and the FC is implemented basically with a 1-bit ΔΣ frequency-to-digital converter (ΔΣFDC), a frequency-error comparator and a digital filter.
The ADFSs in [2, 3, 6] are developed based on the BSA approach. The FE of the ADFLL in [3] is implemented with a control logic circuit that contains both adding and subtracting circuits for measuring the magnitude of the phase and frequency differences. In [6] , the FD of the ADPLL is implemented with a conventional PFD [1, 4] , whereas its FE is implemented with a finite state machine (FSM) circuit that processes further the output of the FD to generate the indication of a locked state. The ADPLL in [2] is probably the ADFS that can implement an FE in the simplest way. However, to enable such a simple FE, its FD has to be implemented in a rather complex way than the FE. Moreover, the FC (FD and FE) in [2] must be accompanied by a high frequency-resolution DCO, for example, LC tank-based DCO, to operate as designed.
Based on the above study, we can deduce that in developing an ADFS, the FD, that is, the circuit to detect whether one frequency is lower or higher than the other one, can actually be implemented with quite a simple circuit. Therefore the additional process in the FC of the flowchart shown in Fig. 1b is strongly related to the complexity occurring in the implementation of the FE rather than the FD. Therefore, we conclude that if the FE, that is, the circuit to detect a frequency similarity, of an ADFS can be implemented in a very simple way, then the whole architecture of the ADFS will also be very simple.
In this paper, we propose a new and simple FE that enables the implementation of a new ADFS in architecture of low complexity and high modularity. The resulting ADFS architecture strongly resembles the fundamental flowchart shown in Fig. 1a . Since the flowchart in Fig. 1a processes only frequency but not phase, the resulting ADFS will eventually be a type of ADFLL. Moreover, from the application standpoint, this ADFLL will be fully functional as a 500 MHz reference clock generator for an FPGA.
The rest of this paper is organised as follows. Section 2 presents the development of the proposed FE as well as the new ADFLL for a DCO with fine frequency-resolution. Section 3 describes the further development of the ADFLL to accommodate a DCO with coarser frequency-resolution. Section 4 presents the measurement results of the implemented ADFLL. Finally, Section 5 concludes this paper.
2 Development of the proposed ADFLL Fig. 2a shows the architecture of the proposed ADFLL which is a straightforward translation of the flowchart shown in Fig. 1a . In Fig. 2a , F REF is the external reference clock frequency which represents the target frequency and F FB is the feedback frequency which basically represents the DCO frequency or the output frequency F OUT , which is programmable. A divider DIV is added into the diagram because the ADFLL will function as a frequency multiplier. The number of the CCG's output bits depends largely on the requirements of the DCO; in this work, it is nine. In the following subsections, we present the circuits adopted to implement the proposed ADFLL, along with the systematic steps in designing the FE suitable for the proposed ADFLL. The design of the FE is the main work in this study. 
Control code generator
According to the schematic shown in Fig. 2a , the 9-bit CCG should be implemented with a circuit that can increase, decrease and hold its output binary code. For this purpose, the CCG is implemented with a conventional JK flip-flop up-down counter (UDC) [7] . The adopted UDC is falling-edge sensitive which changes its output at the falling edges of F REF ; this is indicated by the small circle attached to Port CLK of the CCG shown in Fig. 2a .
Frequency detector
In the proposed ADFLL, the FD detects whether the frequency (not phase) of F FB is lower or higher than that of F REF . The circuit to perform this function is adopted from [8] and its diagram is shown in Fig. 2b . When the frequency of F FB is lower (higher) than that of F REF , the output will be 'true (false)' or 'logic 1 (0)'.
Divider (DIV)
For the proposed ADFLL the divider in [9] , which is a clock swallowing multi-modulus divider (MMD), is adopted since it gives a 50% duty-cycled output. In this work, the implemented divider is designed to provide integer feedback ratios ranging from 16 to 63.
Digitally controlled oscillator
Owing to the limitation of the process library and chip area, the DCO construction is decided to be based on a CMOS digital inverter adopted from [10, 11] . The schematic diagram of the implemented DCO is shown in Fig. 2c . The DCO totally requires nine bits of binary code for its control code, which are denoted as CC[0:8] in the diagram. Six bits, CC[0:5], directly control the DCO for fine frequency tuning. The remaining three bits, CC [6:8] , are first converted, with a binary-to-thermometer decoder, into seven bits of thermometer code, which are denoted as T [0:6] in the diagram. The resulting bits T[0:6] are then used to control the DCO for coarse frequency tuning. The resulting DCO has output frequency directly proportional to the control code.
A true single-phase clock (TSPC) prescaler is added to improve both the frequency resolution and the duty cycle. To accomplish the target specifications of the DCO, that is, a DCO with a centre frequency of 500 MHz and a frequency resolution of 1 MHz, the sizes of the transistors of the inverters are manually adjusted one by one.
Development of the FE
As emphasised in the introduction, there has not been a simple FE that is suitable for the proposed ADFLL. In the absence of the FE, the final output frequency of the proposed ADFLL will be determined by both the FD and F REF which control the CCG. However, the speed of the FD in detecting which input frequency is lower or higher is much lower than F REF ; this speed difference creates an exaggerated time span in which the CCG changes the control code (thus, also F OUT ) in the opposite direction to where the control code should be directed. As a result, the ADFLL without the FE will not be able to lock the target frequency.
By considering the architecture of the proposed ADFLL shown in Fig. 2a , the above unlocking problem should be solved by the FE, whose output controls Port Enable/Hold of the CCG, by decreasing the effective rate of the CCG in generating the control code as the frequency difference decreases. On combining this requirement with the main function of the FE described by the flowchart in Fig. 1a , the required FE should exhibit the following two behaviours: (i) the FE should generate an output whose pulse (logic 1) width decreases as its input frequencies are going closer to each other, and (ii) the FE's output should be completely logic 0 to totally disable the CCG when the FE's input frequencies are exactly the same.
In this work, the required FE is developed by employing a frequency subtracting circuit (FSC) [12] whose output is connected to a clocked level-change detecting circuit so that the pulse or logic 1 only occurs around the rising and falling edges of the FSC output. Since the number of edges is directly proportional to the frequency, the output pulse (logic 1) width of the level-change detecting circuit is also proportional to the frequency difference at the input of the FSC.
In the ADFLL, F FB as the sensed frequency can be either much higher or lower than the reference frequency F REF .
To enable the FE to work properly in both conditions, two identical circuits, each consisting of the FSC and the level-change detecting circuit, are implemented as the FE. In the first circuit, F REF samples F FB , whereas in the second circuit, F REF is sampled by F FB . Fig. 3 shows the schematic diagram of the proposed FE.
The constituting circuit of the FE shown in Fig. 3 does not look uncommon since it is well known as a ΔΣ modulator [13] or a ΔΣFDC [5, 14] . The differences are only in the applications. The input frequencies of the ΔΣFDC in [5, 13, 14] are largely different since one of them functions as a sampling clock to digitise the other, whereas in the proposed FE, the ΔΣFDC circuit works in the under-sampling mode. Moreover, the output signal Diff /Equ of the proposed FE is simply treated as a 1-bit digital signal, whereas the output of the conventional ΔΣFDC is usually treated as serial data that need to be further processed or interpreted. Fig. 4a shows the comparative simulation results of six FEs for the six input frequency differences of the same simulation time while using the same F REF . The decreasing frequency difference is to simulate the condition in the ADFLL when F FB approaches F REF as happened in the locking process. The result proves that when the absolute frequency difference decreases, occurrences of the output pulse (logic 1) are becoming rarer, which means that the total pulsewidth over a certain duration is proportional to the frequency difference. Moreover, based on the property of the FSC [12] , we can deduce that when both input frequencies become exactly the same, the output of the FE will be completely equal to logic 0, which conforms to the required behaviour of the FE.
Another important property of the proposed FE is that when its input frequencies (F REF and F FB ) are becoming very close to each other, every single pulse (logic 1) in Diff /Equ will be almost always as wide as one period of F REF , as partly indicated in the lower row of Fig. 4b . Thus, only one falling edge of F REF will occur when the state of Diff /Equ is logic 1 or the CCG is enabled. As a result, in this condition, the CCG can increase or decrease its output only as much as one LSB. This condition is expected with high probability in the steady state because the CCG will be able to tune the DCO with a frequency step as fine as the DCO's frequency resolution.
Figs. 5a and b plot simulation results showing the behaviours of the proposed ADFLL after adopting the proposed FE in locking the target frequency. The data plotted in Fig. 5a are the control codes observed at the CCG output, CC[0:8]. It indicates that the proposed FE can effectively help the ADFLL to lock the target frequency precisely.
Development of an intermittently tracking ADFLL
In Fig. 5a , we can still observe the alterations of the control code in the steady state. According to the previous discussion, regarding the simulation result of the FE plotted in Fig. 4b , the control code changes only between two This means that in the steady state, the output frequency of the ADFLL, F OUT , moves back and forth between two frequencies that are slightly higher and lower than the target frequency, respectively, and the difference of these two frequencies is equal to the DCO's frequency resolution.
If the implemented DCO has a very fine frequency resolution, for example, similar to an LC tank-based DCO, then the difference of those two frequencies would be very small. However, in this work, the implemented DCO shows only a frequency resolution of about 1 MHz at the working frequency of 500 MHz. The frequency jump because of this relatively poor frequency resolution, as indicated in Fig. 5b , will manifest as considerable deterministic jitters and may deteriorate the performance of the ADFLL. To handle such a problem, we decide to adopt a mechanism that opens the loop of the ADFLL during the steady state and intermittently closes it when the output frequency needs to be corrected [15] .
To adopt this mechanism, the initially-proposed ADFLL needs to be supplemented with the following two circuits, that is, one circuit to open the ADFLL's loop when the ADFLL is in the steady state, which is called a loop-opening circuit (LOC), and the other circuit to re-close the ADFLL's loop when the ADFLL's output frequency needs to be corrected, which is called a loop-closing circuit (LCC).
Loop-opening circuit
According to the previous discussion, when the loop of the ADFLL is opened during the steady state, the ADFLL frequency will be locked to a frequency whose distance to the target frequency is not more than the frequency resolution of the DCO. In this design, the opening of the loop is controlled by the LOC and the key information used to develop the LOC is the output of the FD that changes its state whenever the polarity of its input frequency difference is inverted, that is, from the states of 'F FB > F REF ' to 'F FB < F REF ' or vice versa. The polarity inversion means that the DCO frequency just crossed the target frequency and indicates that the ADFLL has entered the steady state.
The schematic diagram of the LOC is shown in Fig. 6a , and it works as follows: at the beginning of the locking process, the up-counter output is reset to zero; the counter then counts the occurrence of the state change of the FD output, LO/HI; when the number of the occurrences reaches two, the output signal of the LOC, LOCK, will go to logic 1 and disable the CCG so that the output control code of the CCG will stop changing and the loop is considered as being open.
Loop-closing circuit
The LCC function is to re-close the loop by resetting the LOC whenever the output frequency of the ADFLL, F OUT , needs correction. Port DIFF of the LOC as shown in Fig. 6a is dedicated for this purpose. The first step in designing the LCC is defining the range of the frequency deviation, called Err Frq , in which F OUT may still be considered in the locked state. The value of Err Frq , should be higher than the relative frequency resolution of the DCO, that is, the ratio between the DCO's frequency resolution and the working frequency so that, in this design, Err Frq should be higher than 1 MHz/ 500 MHz or 0.2%. If Err Frq is set lower than this value, whenever the LOC opens the ADFLL's loop, the LCC will immediately generate the reset signal DIFF to the LOC and the loop will be closed again. In this design, Err Frq is set to around 0.8% in order to provide some error margin because of the performance of the implemented DCO which is based on a digital inverter and usually has more coarse frequency resolution and larger jitter compared with those of an LC tank-based DCO or a CMOS ring-based DCO [5, 16] .
To implement the value of Err Frq into the LCC, we first define the valid duration in which the frequency deviation is measured. This duration is called the gating time T G and its length is equal to a certain number of periods of F REF defined as N G . Later, the value of Err Frq is converted into a fraction which is equal to ΔN/N E ; where N E is the expected number of the periods of F FB that occurs in one T G , and N E + ΔN is less than N G . Afterwards, in the operation, the LCC will count the number of periods of F FB , simply called N, in one T G . As long as N is between N E + ΔN, the loop is kept open or there is no generation of the reset signal DIFF to the LOC. Fig. 6b shows the complete schematic diagram of the LCC. In the figure, the LCC_TG is the circuit to define the duration of T G . Later, the number of F FB 's cycles in one T G is counted by the LLC_FFB that contains the predefined values of N E − ΔN and N E + ΔN. To make the LCC more immune to the generation of false outputs and also to provide a level of certainty to its outputs, the LCC evaluates four consecutive T G s before generating the reset signal DIFF. The evaluation results of the four T G s are stored in four LCC_LATCHs, whose individual diagram is shown in the inset of Fig. 6b . The truth table of the LCC_LATCH attached to the figure describes the states of the LCC_LATCH output, that is, OLO and OHI, as a function of the number of F FB 's cycles in one T G . The final output of the LCC is annotated as DIFF; its state is logic 0 when the frequency deviation of F OUT is still in the tolerable range, and its state becomes logic 1 when the deviation is beyond the boundaries. The rising edge of this DIFF will reset the LOC through its input port DIFF. Fig. 7a shows the final architecture of the implemented ADFLL after adopting the LOC and LCC. Fig. 7b plots the simulation result to verify the functionalities of the LOC and LCC. In the simulation, the frequency deviation at F OUT is emulated by alternating the feedback ratio, which is represented with the state of the MMD's P[0] plotted in the upper row of Fig. 7b . The alternations of the feedback ratio cause F FB to exhibit frequency changes. Later, these frequency changes are detected by the LCC and subsequently trigger the LOC to re-close the loop for correcting F OUT ; this sequence of events is confirmed with the regions marked with (i) in Fig. 7b . After both the frequencies of F OUT are updated and the ADFLL reaches the new steady state, the LOC disables the CCG so that the control code stays unchanged in the steady state; the effectiveness of the LOC is confirmed with the regions marked with (ii) in Fig. 7b .
Final architecture of the implemented ADFLL

Experimental verifications
The proposed ADFLL has been implemented as a part of an FPGA's clock manager in a 90 nm CMOS technology, consuming an effective area of 200 µm × 100 µm as shown in Fig. 8 . The figure also shows the micrograph of the clock manager (lower left) and the test board (lower right) on which the clock manager chip is attached. The clock manager consists of the ADFLL, a delay locked loop (DLL) and a phase shifter. The whole test board of the clock The first specification we verified in the measurement was the operating frequency of the implemented DCO; it ranged approximately from 496 to 623 MHz which covered the main target frequency of 500 MHz. Fig. 9a shows the frequency spectrum of F OUT when the ADFLL is locked at the target frequency of 500 MHz. Fig. 9a also shows that the reference spur is around − 66 dBc which is relatively decent [17] . Ideally, in the locked stated in which the loop of the ADFLL is open, the DCO should generate a spurless output. The occurrence of the spurs is very likely because of the use of single power supply for the whole clock manager without separating the power supply for the DCO. In the next experiment, the quality of F OUT in the time domain was also verified. A screen capture of this measurement result is presented in Fig. 9b , showing that the rms jitter is around 25 ps, which is a relatively common value for a DCO based on a CMOS digital inverter [11, 18] .
The locking time of the ADFLL was examined through the following experiment. F REF was set to 10 MHz and the feedback ratio was initially set to 51. This setup results in the target frequency of 510 MHz. Later, the feedback ratio was repeatedly alternated between 51 and 50. This alternation made F OUT move back and forth between 510 and 500 MHz or as far as one F REF . Although the feedback ratio was being alternated, the locking time was observed through Port LCC_LOCK of the ADFLL which is the inverted output of the LCC as depicted in Fig. 7a .
A screen capture of this experimental result is presented in Fig. 10 ; the upper row shows the waveform of the signal injected into the MMD's P[0] to change the feedback ratio, whereas the lower row shows the waveform of LCC_LOCK. As indicated in the figure, the change in the feedback ratio triggers the LCC to momentarily change the state of LCC_LOCK. The duration of this momentary change is equal to the locking time of the ADFLL and the measured value was around 50 µs. The monotonous logic 1 in LCC_LOCK indicates that F OUT is already in the range of 99.2% of the target frequency, as specified in the design of the LCC.
In the measurements, the effectiveness of the LOC and LCC were only indirectly observed. The fact that the ADFLL can lock the target frequencies as shown in Figs. 9a, b and Fig. 10 proves that the LOC worked properly. Fig. 10 also shows that the ADFLL can correct its output frequency which implies that the LCC also worked properly. Table 1 presents the performance summary of the implemented ADFLL and four published ADFSs for comparison purpose. From the table, it can be found that the settling time of the proposed ADFLL is considerably short, especially when it is compared with the other ADFLL [5] . Our ADFLL's performance can also be considered superior in terms of the area and reference jitters. Moreover, we can also see that although the rms jitter of the implemented ADFLL is quite high, it is still on par with those of the other ADFSs whose DCOs are based on a CMOS digital inverter [11, 18] . On the other hand, the power consumption of the proposed ADFLL is found to be little more than twice that of the ADFLL in [5] for the same output frequency. The reason is that the implemented DCO, as shown in Fig. 2c , actually works at twice the frequency of F OUT or at 1 GHz, which consequently degrades significantly the score of the figure of merit (FOM) [18] . Besides, the measured power is consumed not only by the ADFLL but also by the whole clock manager, which also contains both a DLL and a phase shifting circuit. 
Conclusion
In this work, we report a novel ADFLL as a 500 MHz reference clock generator for an FPGA. The key circuit in the ADFLL is a frequency-difference to pulsewidth converter (the FE) which consists of two 1-bit ΔΣFDCs that are designated to work in the under-sampling mode. The adoption of the proposed FE has enabled the development of an ADFLL that has low complexity and high modularity, consumes small chip area and exhibits short locking time. Moreover, the adoption of the intermittently-tracking mode has probably contributed in significantly suppressing the reference spurs. 
Acknowledgment
