Abstract-It is well known that timing jitter can degrade the bit error rate (BER) of receivers that recover clock information from the input data. However, timing jitter can also result in an indefinite increase in the settling time of clock recovery circuits at the receivers, particularly in low swing mesochronous systems. Mesochronous clock retiming circuits are required in repeaterless low swing on-chip interconnects in order to sample the low swing data at the center of the eye. This paper discusses the settling time of these circuits. First, a discussion on how timing jitter can result in large increase in the settling time of the clock recovery circuit is presented. Next, the circuit is modeled as a Markov chain with absorbing states. Here, the mean time of absorption of the Markov chain, which represents the mean settling time of the circuit, is determined. The model is validated by using behavioural simulations of the circuit, the results of which match well with the model predictions. The modelling is applied to study the effect of different types of jitter, like data dependent jitter of 1 bit and 2 bits, random jitter and random jitter along with 1 bit data dependent jitter. Finally, a few techniques of reducing the settling time are presented and their efficacy is confirmed with circuit simulations.
I. INTRODUCTION
Timing jitter in the incoming data degrades the bit error rate (BER) of receivers that recover clock from the data itself, and this effect has been studied [1] , [2] . However, the dependence of settling time of clock retiming circuits on the jitter in the incoming data has not been investigated before. This paper discusses the effect of timing jitter on the settling time of mesochronous clock retiming circuits. Mesochronous clock retiming circuits are required in repeaterless low swing on-chip interconnects to sample the data correctly at the receiver [3] , [4] , and also in off-chip links that use a forwarded clock [5] , [6] . In mesochronous receivers, a clock running at the correct frequency is available at the receiver and only the correct phase needs to be recovered. Hence, such clock retiming circuits use delay lines [3] , [4] or phase interpolators [7] to generate the required clock phase. Use of delay based retiming circuits is preferred over systems that use phase locked loops with a voltage controlled oscillator per channel [8] due to their better performance and lower complexity. Delay based retiming circuits are preferred even in systems where the clock frequency is known only nominally and not exactly.
In this paper, we investigate the settling time of mesochronous clock retiming circuits and, in particular, the effect of timing jitter on it. We show how timing jitter can increase the settling time of clock recovery circuits indefinitely. This occurs when the circuit wakes up with its clock in the horizontally closed region of the data eye. Typical systems implemented on-chip can have hundreds of long interconnects [9], [10], and a horizontal eye opening of about 85% is not pessimistic [11], [12]. If the initial clock position is assumed to be uniformly distributed, there is a 15% chance that the circuit wakes up with its initial clock phase in the closed region of the eye, making it important to study and understand this problem. Circuit level simulations show that the system can remain stuck in this region for thousands of cycles.
To analyze the settling time, we model the clock retiming circuit as a Markov chain with absorbing states, where the states are the clock positions. The state transitions correspond to the phase corrections done by the clock recovery circuit. The mean settling time of the circuit is predicted by the model. The model provides useful insights into the dynamics of the system and its predictions fit the data obtained from behavioural simulations. Using these insights, some techniques of reducing the settling time are proposed. The efficacy of these techniques is then confirmed with circuit level simulations.
The paper is organized as follows. Section II discusses the settling time of mesochronous clock retiming circuits and introduces the dependence of settling time on jitter. Section III describes the behaviour of phase detectors and clock retiming circuits in the presence of jitter. Markov chain modeling of the mesochronous synchronizers is presented in Section IV. Quantification of the settling time is discussed in Section V. Section VI discusses techniques of reducing the settling time and the supporting circuit simulation results, and is followed by conclusions in Section VII.
II. SETTLING TIME OF THE CLOCK RETIMING CIRCUIT
In order to analyze the settling time of a mesochronous system, we consider a repeaterless interconnect system as shown in Figure 1. Here, the delay of the interconnect is expressed as (n + λ)T , where n ∈ Z + , λ ∈ [0, 1) and T is the system clock period. φ T x and φ Rx are the phases of the transmitter and receiver clocks respectively. The clock retiming circuit derives the sampling clock φ d , which is positioned at the center of the input data eye, from the receiver clock phase. The settling time of the mesochronous clock retiming circuit depends on the initial phase error between the clock and the data. This initial phase error (∆φ) is a continuous variable taking values in [0, 2π). When the initial phase error is less than π, the circuit achieves lock by decreasing the phase to 0. On the other hand, when ∆φ is greater than π, the circuit achieves lock by increasing the phase difference to 2π, which is the same as 0 by phase wrapping (refer to Fig. 2). The settling time of the clock recovery circuit depends on the gain of the system and the initial phase error. For mesochronous systems only the phase has to be corrected and the clock recovery circuit can be of the first order [13]. Hence, the loop filter is a single capacitor. When a bang-bang phase detector is used, the capacitor voltage (V c ) is quantized to a step size given by
Here, K CP is the gain of the charge pump, expressed in terms of the charge pump current I CP and the clock period T . Hence, the step size of the phase corrections is
where K V C is the gain of the phase modulator. The number of phase correction steps (M ) needed for achieving lock can be written as
The settling time can then be written as
where α is the data activity factor. From (1), (2) and (3), the settling time can be expressed as
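As a rough numerical illustration of this chain of reasoning, the sketch below assumes that the control-voltage step is I CP · T / C (with C the loop filter capacitance), that the phase step is K V C times this voltage step, and that the settling time is M · T / α. These forms, the function name and all numerical values are illustrative assumptions and are not taken from expressions (1) to (4).

```python
import math

def settling_time(delta_phi, i_cp, cap, k_vc, period, alpha):
    """Assumed first-order bang-bang loop: returns (steps, settling time in s)."""
    dv = i_cp * period / cap          # assumed control-voltage step per update (V)
    dphi = k_vc * dv                  # assumed phase step per update (rad)
    # Lock is reached along the shorter way round the phase circle.
    distance = min(delta_phi, 2 * math.pi - delta_phi)
    steps = math.ceil(distance / dphi)
    # Updates only happen on data transitions, hence the division by alpha.
    return steps, steps * period / alpha

# Hypothetical values: 10 uA pump, 1 pF filter, 1 GHz clock, K_VC = 0.628 rad/V
steps, t_s = settling_time(math.pi / 2, 10e-6, 1e-12, 0.628, 1e-9, 0.5)
print(steps, t_s)
```

With these hypothetical values the phase step is about 0.1% of a clock period, which is in the range of step sizes mentioned later for low-jitter controllers.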
A. Extension of the settling time
When the initial phase difference is equal to π, the circuit can settle to two discrete values of the phase difference, which are 0 and 2π (which is the same as 0 by phase wrapping), and both solutions are acceptable. Since the initial phase difference is a continuous variable, the above scenario is one of taking a discrete decision on a continuous input. Theoretically, such decisions can take an infinite amount of time for certain initial conditions [14]. This happens when ∆φ is exactly π radians and the system has no reason to choose one solution over the other. This is akin to metastability in flip-flops. When ∆φ is exactly π radians, the flip-flops in the phase detector become metastable. In these conditions, the phase detector loop enters a state of indecision. However, for sustained indecision of the phase detector loop, the flip-flops in the phase detector must become metastable in every clock cycle. In practice, given the narrow widths of the metastability windows, the fast recovery times and the inherent jitter present in the clocks, sustained loop indecision due to flip-flop metastability is highly unlikely. Hence, metastability of the flip-flops in the phase detector is not discussed in this work.
B. Effect of timing jitter: The window of susceptibility
When the data input has timing jitter, due to Inter Symbol Interference (ISI) in the data and/or random jitter in the clock, and the initial clock phase (φ 0 d ) is in the horizontally closed region of the data eye, the expression for the settling time in (4) is not valid. This horizontally closed region of the eye is referred to as the window of susceptibility (W) henceforth. When the initial clock is in this window W, the output of the phase detector is randomized by the jitter in the data, and hence, the phase error information is lost. This increases the settling time t s indefinitely, till the system escapes this window W. The width of the window (T W ) depends on the amount of timing jitter present in the system and the threshold of the sampling comparators. The effect of offset on T W is illustrated in Figure 4.
For simplicity of analysis, we assume that the samplers have zero offset in the rest of this work. The results can however be easily extended once an appropriate window W is defined after accounting for the offset.
III. WORKING OF PHASE DETECTORS IN THE WINDOW OF SUSCEPTIBILITY
The timing diagram of the Alexander phase detector [15] is shown in Figure 5(a). The phase detector samples the data twice in every bit period and takes a binary decision of shifting the clock to the right or to the left, based on the last three samples. When there is no ISI, the phase detector consistently produces either UP or DN pulses. Thus, for a given data activity factor α, the settling time can be calculated using the expression in (4). Figure 5(b) shows the timing diagram of the Alexander phase detector when the data is corrupted by timing jitter and the initial phase error is close to π radians, i.e. the initial clock position is in the window W. The phase detector's decision is determined by the sample of the data signal taken close to the data transition. Due to jitter, this sample causes the phase detector to generate UP or DN signals depending only on the current data transition time and not on the average data arrival time. Hence, the phase detector produces UP and DN pulses randomly and the clock recovery circuit can remain stuck, with its clock in this region, indefinitely.
Linear phase detectors, like the Hogge phase detector [16], behave similarly to binary phase detectors in the window of susceptibility. The ideal characteristics of the Alexander and Hogge phase detectors are shown in Figure 6. The highlighted region shows the window W, and within this window the variation in the gain of the phase detector is negligible. Hence, the analysis of the system behaviour in the window W is applicable to both these types of phase detectors.
Circuit level simulations show that the system can remain stuck in the window for thousands of cycles. For demonstration purposes, we use the synchronizer for on-chip interconnects proposed in [4]. Fig. 7 shows the block diagram of this synchronizer. In this circuit, a delay locked loop (DLL) is used to generate multiple phases of the clock. A controller picks one of the phases of the DLL and delays it to bring the output clock to the center of the eye using a voltage controlled delay line (VCDL). If the VCDL range is not sufficient to achieve lock (which is detected by the control voltage V c exceeding preset bounds V H and V L ), the controller automatically picks the next adjacent phase and re-attempts to lock. The process repeats till lock is achieved.
Fig. 8(a) shows the eye diagram at the receiver input of a typical low swing interconnect. Here, T W is the width of the window W. Fig. 8(b) shows the control voltage evolution of the clock retiming circuit when the circuit is stuck with its clock in this window W for an extended period of time. The simulation was done in UMC 130 nm CMOS technology with a 10 mm interconnect and a low swing transmitter.
IV. THE MARKOV CHAIN MODEL OF THE CLOCK RECOVERY CIRCUIT
In order to analyze the settling time when the initial clock is in the window W, the circuit is modeled as a Markov chain with absorbing states. The binary phase detector produces UP or DN pulses on every data transition, which are converted into an analog control voltage using a charge pump. The control voltage is used to delay the clock (either using a phase interpolator or a delay line). Since the input is binary, the output phase is quantized. Hence, the clock position can be discretized to the step size (τ ) of the phase detector loop update. In order to keep the jitter in the recovered clock low, controllers typically use step sizes of less than 0.1% of the clock period [7], [13]. The region where the input data eye is closed, i.e. the window W, is of particular interest. A Markov chain model of the system is constructed, in which the states designate the clock positions and the phase corrections performed by the clock retiming circuit form its state transitions. The edges of the window W are modeled as absorbing states. The sources of timing jitter are data dependent (ISI induced) jitter in the data and noise induced random jitter in the clocks of the transmitter and receiver. The analysis of each type of jitter is done independently, followed by an analysis for data dependent and random jitter together. For analyzing the effect of data dependent jitter, an interconnect link with different bandwidths is considered. The interconnect is modeled as a 20 section RC network. For simulating different amounts of ISI, different values of RC time constants are chosen, and the eye diagrams and timing jitter histograms for a few of the considered channel bandwidths are listed as cases 1 through 3 as follows:
Case 1: For benign channels, the horizontal eye opening approaches 100% and the jitter histogram has a narrow distribution, as shown in Figure 9.
Case 2: As the bandwidth of the channel decreases, the jitter histogram splits and produces two distinct peaks, labeled A and B [17]. This shows that the ISI due to the immediate previous bit is dominant (Figure 10).
Case 3: Further reduction in the bandwidth shows that the jitter histogram splits into 4 regions, labeled A, B, C and D, as shown in Figure 11. This shows that the ISI due to the previous two bits is dominant.
For modeling the effect of random jitter, the random jitter is assumed to have a Gaussian distribution. The system is modeled for the cases when the jitter is A. induced by ISI due to 1 previous bit, B. induced by ISI due to 2 previous bits, C. random with a Gaussian distribution and D. induced by random jitter and ISI due to 1 previous bit.
A. Markov chain model for jitter induced by 1 bit ISI
When the ISI due to the immediate previous bit is dominant, the zero crossings of the data are bunched into two narrow distributions as shown in Figure 10. This is approximated by the data signal following one of two distinct traces. Figure 12 illustrates an eye diagram with ISI due to exactly 1 previous bit. Here, t i is the initial sampling instant of the clock, γ is the distance to the right edge of the window and V th is the threshold of the samplers in the phase detector. A and B represent the two distinct zero crossing times of the data signal. For the eye diagram shown in Figure 10, the size of the window W is about 4% of the bit period. This means T W is about 40τ . Assuming that the source outputs 1 and 0 with equal probability, the bit combinations that result in traces with zero crossing at A and B, respectively, are listed in Table I. Here, b 0 is the current bit, corrupted by ISI due to b −1 . A data transition occurs when b 0 ≠ b 1 . Table I lists all 3 bit combinations, which cover all possible data traces.
Here, LT and RT indicate that the clock is shifted to the left and to the right respectively, by a step of size τ . N A indicates no corrective action in that cycle. When all sequences are equally likely, the probabilities P (RT ) = 0.25 = P (LT ) and P (N A) = 0.5. Note that the system escapes the window of susceptibility W when, for the first time, either
Here, n R and n L are the total number of times the clock position has been shifted to the right and to the left, respectively, from start-up.
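The probabilities quoted above can be checked by enumerating the eight equally likely 3 bit patterns (b −1 , b 0 , b 1 ). The sketch below assumes that the clock lies between the two crossing times, that a transition preceded by a bit change (b −1 ≠ b 0 ) crosses at one of them and that a transition preceded by a repeated bit crosses at the other. Which crossing is labeled A or B, and which one maps to LT or RT, is an assumption made purely for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def classify(b_prev, b_cur, b_next):
    """Corrective action for one 3-bit pattern (the LT/RT labels are assumed)."""
    if b_cur == b_next:
        return "NA"                      # no data transition, no update
    # Assumed mapping: a bit that follows a bit change crosses at one trace,
    # a bit that follows a repeated bit crosses at the other trace.
    return "LT" if b_prev != b_cur else "RT"

# All eight patterns are equally likely for an equiprobable binary source.
counts = Counter(classify(*bits) for bits in product((0, 1), repeat=3))
probs = {action: Fraction(n, 8) for action, n in counts.items()}
print(probs)
```

Swapping the assumed LT/RT labels leaves the three probabilities unchanged, which is why the chain for this case is symmetric.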
This system is modeled as a one dimensional Markov chain. The states of the Markov chain are the positions of the sampling clock in the window W. Once the clock escapes this window, the time taken to lock to the center of the eye can be calculated from (4). Thus, the edge positions of the window are modeled as absorbing states. This makes the model a Markov chain with absorbing states [18]. Figure 13 shows the state diagram of the Markov chain for this system. By knowing the state space and the transition probabilities of a Markov chain, one can calculate the mean time to absorption from any initial state [18]. The calculation of the mean time to absorption (and its variance) is presented in Appendix A.
The combined plots of the mean time to absorption for the 1 bit ISI case obtained from behavioural simulations and Markov chain model predictions are shown in Figure 14. The model predictions were computed by solving for the mean absorption time of this Markov chain using a linear equation solver, while the behavioural simulations were done using a 20 section RC interconnect and a VerilogA behavioural description of the clock retiming circuit. As one would expect, the settling time is maximum when the initial sampling phase is at the center of the window W.
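A minimal sketch of such a computation is given below. It assumes an idealized chain with P(LT) = P(RT) = 0.25 and P(NA) = 0.5 at every interior position and absorbing states at positions 0 and N, and it solves the resulting tridiagonal linear system with the Thomas algorithm. The function name and the choice of solver are illustrative and are not claimed to be what was used to produce Figure 14.

```python
def mean_absorption_times(n_steps, p_left=0.25, p_right=0.25):
    """Mean cycles to reach state 0 or n_steps from each interior state.

    Solves t[k] = 1 + p_stay*t[k] + p_left*t[k-1] + p_right*t[k+1]
    with t[0] = t[n_steps] = 0, using the Thomas (tridiagonal) algorithm.
    """
    m = n_steps - 1                      # number of transient states
    move = p_left + p_right
    # Row k reads: -p_left*t[k-1] + move*t[k] - p_right*t[k+1] = 1
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = -p_right / move
    d_prime[0] = 1.0 / move
    for k in range(1, m):                # forward elimination
        denom = move + p_left * c_prime[k - 1]
        c_prime[k] = -p_right / denom
        d_prime[k] = (1.0 + p_left * d_prime[k - 1]) / denom
    t = [0.0] * m
    t[-1] = d_prime[-1]
    for k in range(m - 2, -1, -1):       # back substitution
        t[k] = d_prime[k] - c_prime[k] * t[k + 1]
    return t                             # t[i] is the mean time from state i + 1

times = mean_absorption_times(40)        # 40-step window, as in the text
print(times[19])                         # start at the center of the window
```

For this idealized symmetric chain the recurrence has the closed form 2k(N − k) cycles from position k, so the maximum indeed occurs at the center of the window.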
It is worth noting that the variance in the absorption time is quite high. Figure 15 shows the standard deviation of the absorption time as predicted by the Markov chain model and as observed in the data obtained from behavioural simulations. 
B. Markov chain model for 2 bits ISI
When the ISI due to the previous two bits is significant, the data transition histogram has 4 peaks as shown in Figure 11. This can be approximated by ISI of two bits, which results in 4 distinct data transition times. The second order Markov chain model for this system is shown in Figure 18. To capture the memory of the system, six states are used for each clock position. Four of these states correspond to the previous data transition (T n−1 ) occurring at A, B, C and D. The remaining two states are i) X (1), which indicates no data transition in the previous cycle, and ii) X (2), which indicates no data transition in the previous two cycles.
For example, when T n−1 = A, the source can be in either state S 2 or S 5 (refer to Figure 17), and the next cycle will either not have a data transition, or have a data transition at position B. When the initial clock is in the sub-window W B−C , a data transition at B results in a clock shift to the right by 1δ. Thus, the transitions from the state at T n−1 = A are either to state X (1) at the same clock position or to state T n−1 = B of the next clock position to the right. Similarly, by analyzing the FSM in Figure 17, the rest of the state transitions can be determined. In Figure 18, for brevity, the transitions are shown for only one clock position in the sub-window W B−C . All the state transitions have a probability of 0.5 when the data source outputs 0 and 1 with equal probability.
Similar to the 1 bit ISI case, the mean time to absorption for the 2 bit ISI case was determined using a Markov chain model. Also, behavioural simulations were performed. For the eye diagram shown in Figure 11 , the window size is about the interconnect, whereas the Markov chain model truncates the ISI to 2 bits. Note that the peak of the mean time to absorption, as predicted by the model, is about 1100 cycles when the step size is 3.5τ . This can be as high as 12000 cycles when the step size is 1τ .
C. Markov chain model for Gaussian distributed random jitter
This section discusses the increase in the settling time due to random jitter in the clock. There are two sources of clock jitter in digital systems: inherent jitter in the clock generating oscillator and jitter introduced by the clock distribution network. The former can be modelled as a Gaussian distribution [19] . The latter, however, depends on several factors and can be random with arbitrary distribution [20] . The analysis in this section assumes that the jitter has a Gaussian distribution.
For ease of analysis, it is assumed that the receiver clock is noise free and all the jitter is in the transmitter clock. This assumption is valid as long as the channel response is constant over the spectrum of the noisy clock. The jitter histogram is assumed to be a Gaussian distribution, with a standard deviation of σ ck . The sampling clock position (t ck ), with respect to the mean data transition position, can be represented as mτ , where m is an integer taking values from −∞ to ∞. Hence, with respect to the sampling clock position, the mean of the data transition can be written as −mτ . Unlike the analysis for ISI induced jitter, wherein the window of susceptibility W was bounded, random jitter is unbounded and a window W of finite size cannot be defined. For tractable analysis, a window size of ±3σ ck , i.e. |mτ | < 3σ ck , is used. This covers about 99.7% of the possible data transition positions.
When the clock transition is to the right of the mean data transition position (m > 0), the circuit shifts the clock to the right towards final lock (m is incremented). However, even when m > 0, the random jitter in the data can cause the instantaneous data transition to occur to the right of the sampling clock position. This results in a clock shift in the wrong direction. Figure 20 shows the probability distribution of the clock transition time and the data transition position. The shaded area represents the probability that the data transition occurs to the right of the sampling clock position, when the mean of the data transition is to the left of the sampling clock transition (m > 0). This is the probability of a phase update in the wrong direction. This probability can be calculated as follows.
The probability of an update in the correct direction can then be calculated as P (update : right) = 1−P (update : wrong). Note that these probabilities will be scaled by the probability of transitions. For a source which outputs 1 and 0 with equal probability, the probability of transition is also 1/2. Since the data transitions are independent of the clock jitter, the probabilities can be multiplied. This equation is then used to construct the probability transition matrix of the Markov chain, to predict the mean time to absorption and the standard deviation of the time to absorption. As noted earlier, absorption in this case is defined as the phase difference between the mean of data position and sampling clock position being > 3σ ck . Figure 21 shows the plots of the mean time to absorption and the standard deviation of the time to absorption with the initial clock position as estimated by the Markov chain model, when σ ck = 20τ , i.e. standard deviation of the clock jitter is 2% of the clock period. 
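A small sketch of how these transition probabilities could be evaluated is shown below. It assumes that the probability of a wrong-direction update is the Gaussian tail Q(mτ / σ ck ), i.e. the probability that a zero-mean Gaussian jitter sample exceeds the clock offset, and that it is then scaled by the data transition probability of 1/2. The function name and the expression for the tail are assumptions made for illustration.

```python
import math

def update_probabilities(m, sigma_steps, p_transition=0.5):
    """Per-cycle (P_correct, P_wrong, P_none) at a clock offset of m steps.

    m and sigma_steps are both expressed in units of the step size tau,
    with m > 0 meaning the clock is to the right of the mean transition.
    """
    # Assumed tail probability of a zero-mean Gaussian: Q(m / sigma).
    q = 0.5 * math.erfc(m / (sigma_steps * math.sqrt(2.0)))
    p_wrong = p_transition * q           # transition seen on the wrong side
    p_correct = p_transition * (1.0 - q)
    return p_correct, p_wrong, 1.0 - p_transition

# sigma_ck = 20 tau (2% of the clock period), clock one sigma from the mean
print(update_probabilities(20, 20.0))
```

At m = 0 the two update directions are equally likely, and the wrong-direction probability decays as the clock moves away from the mean data transition, which is what eventually drives the chain towards absorption.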
D. Modeling the effect of ISI and random jitter
The Markov chain model of the system can be easily extended to analyze the combined effect of data dependent and random jitter. An example case of jitter induced by 1 bit ISI and Gaussian distributed random jitter is shown in this section. When the jitter is solely due to 1 bit ISI, the data follows one of two distinct traces as discussed in Section IV-A.
When random jitter is also present, the zero crossing of the data spreads around these two zero crossing positions. Since the random jitter is independent of the data, the probability of zero crossing at any position is equal to the product of the contributions of each of the sources of jitter. For the purpose of analysis, it is assumed that the receiver clock is jitter free, that the transmitter clock has all the random jitter, and that the channel introduces the data dependent jitter. Figure 22 shows the distribution of the data zero crossing positions and the sampling clock position when the data signal has both random jitter and 1 bit ISI induced jitter. The window of susceptibility is now widened to 3σ ck + W A−B + 3σ ck . The effect of the data dependent jitter can be modeled as a shift in the mean of the Gaussian random jitter. Hence, for random jitter along with jitter due to 1 bit ISI, the jitter at any time can be considered to be Gaussian distributed, with a standard deviation of σ ck and mean at either A or B. Consider an example clock position (t ck ) as shown in Figure 22. Choosing A as the reference position for calculations, the probability of a data transition occurring to the left of the clock is given by
where Φ((x − µ)/σ) is the Cumulative Distribution Function (CDF) of a general Gaussian distribution with mean µ and standard deviation σ. The probability of a data transition to the right of the clock can then be calculated as
where P (N T ) is the probability of not having a data transition in that clock cycle.
The Markov chain and its transition probabilities were computed in this manner, and the mean and the standard deviation of the settling time were analyzed. Figure 23 shows the mean absorption time and the standard deviation of the absorption time obtained for data with timing jitter induced by 1 bit ISI in addition to Gaussian distributed random jitter with a standard deviation of 1% of the clock period.
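The sketch below illustrates one way these probabilities could be evaluated. It assumes that, given a data transition, the crossing is centered at A or at B with equal probability, so that the probability of a transition to the left of the clock is an equally weighted sum of two Gaussian CDFs scaled by the transition probability. The weighting, the function names and the numerical values are assumptions; only the relation P(right) = 1 − P(left) − P(NT) is taken from the text.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def crossing_probabilities(t_ck, pos_a, pos_b, sigma, p_nt=0.5):
    """(P_left, P_right) of a data transition relative to the clock at t_ck."""
    p_t = 1.0 - p_nt                     # probability of a data transition
    # Assumed: crossings centered at A and at B are equally likely.
    p_left = p_t * 0.5 * (gaussian_cdf(t_ck, pos_a, sigma)
                          + gaussian_cdf(t_ck, pos_b, sigma))
    p_right = 1.0 - p_left - p_nt
    return p_left, p_right

# Hypothetical numbers in units of tau: A = 0, B = 40, sigma_ck = 10
print(crossing_probabilities(20.0, 0.0, 40.0, 10.0))
```

Midway between A and B the two directions are equally likely, while beyond B + 3σ ck almost every transition falls to the left of the clock and the phase detector output is again consistent.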
V. QUANTIFICATION OF THE SETTLING TIME FOR A GIVEN CONFIDENCE LEVEL
Since the settling time is not deterministic, it is not possible to quantify the settling time absolutely. However, a bound on the settling time can be assured for a given confidence level. The settling time is maximum when the system starts with its initial clock in the center of the window of susceptibility (for a system without any programmed bias). We shall estimate the settling time from this position, as this is the worst case settling time.
Let P (0) be a row matrix representing the Markov chain's initial condition. In this case, this matrix will contain a 1 at the entry corresponding to the center position and zeroes in the rest of the entries. The probability of the Markov chain being in any state after one transition can be calculated by multiplying this row matrix with the probability transition matrix (PTM) of the Markov chain. Similarly, after n transitions, the probability of the Markov chain being in any of the states is given by
P (n) = P (0) · (PTM)^n
In the presence of absorbing states, the probability of absorption after n iterations is simply the sum of all the entries of P (n) that correspond to the absorbing states. Hence, one can iteratively find the number of transitions required for absorption with a given confidence level.
For the Markov chain of the 1 bit ISI case, with a window of 40 steps, the probability of absorption versus the number of transitions is calculated as described above and shown in Figure 24. This is actually the cumulative distribution function of the probability of absorption within a given number of transitions (possibly fewer, but not more). Hence, the probability of absorption in exactly n cycles can be found by taking the derivative (first difference) of the above graph. This is shown in Figure 25. As one would expect, the distribution of the absorption probability is skewed. For a number of transitions less than the distance to the nearest absorbing state, the probability of absorption is always zero.
As seen from Figure 24, for absorption with 99% confidence, the Markov chain needs about 3000 state transitions. The number of transitions needed for absorption with 99% confidence was also analyzed as a function of the size of the window. Figure 26 shows the plot obtained from this analysis. As discussed earlier, the window size (in number of steps) can be controlled by changing the gain of the loop.
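The iterative procedure described in this section can be sketched as follows, again for the idealized 1 bit ISI chain with P(LT) = P(RT) = 0.25 and absorbing edges. Instead of forming the full matrix product, the row vector is propagated one transition at a time, which is equivalent for a tridiagonal chain. The function name and the stopping rule are illustrative assumptions.

```python
def cycles_for_confidence(n_steps, confidence=0.99, p_left=0.25, p_right=0.25):
    """Transitions needed until P(absorbed) >= confidence, starting at the center."""
    p_stay = 1.0 - p_left - p_right
    dist = [0.0] * (n_steps + 1)         # row vector P(0)
    dist[n_steps // 2] = 1.0             # start at the center of the window
    cycles = 0
    while dist[0] + dist[n_steps] < confidence:
        new = [0.0] * (n_steps + 1)
        new[0], new[n_steps] = dist[0], dist[n_steps]   # absorbing edges keep mass
        for k in range(1, n_steps):      # one multiplication by the PTM
            new[k] += p_stay * dist[k]
            new[k - 1] += p_left * dist[k]
            new[k + 1] += p_right * dist[k]
        dist = new
        cycles += 1
    return cycles

n99 = cycles_for_confidence(40)          # 40-step window, 99% confidence
print(n99)
```

For the 40-step window this idealized chain needs a little over 3000 transitions, in line with the figure of about 3000 quoted above, and halving the window reduces the count by roughly a factor of four.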
VI. TECHNIQUES FOR REDUCING THE SETTLING TIME
A. The coarse first synchronizer
The settling time of the circuit depends on the loop gain or the step size of the phase correction circuit. A large step size means that the size of the window of susceptibility (in terms of the number of steps) is small. This results in quick settling, as was discussed in the previous section. However, a large step size results in high jitter in the clock after lock is achieved. The synchronizer proposed in [4] uses a coarse and a fine correction loop for accurate synchronization. In that design, the circuit first tries to achieve lock using the fine tuning loop. If lock cannot be achieved in the limited range of the fine tuning loop, a coarse correction is initiated. Disabling the fine tuning loop during the initial settling period allows a much larger step size, and hence much faster settling. When a 10 phase DLL is used, and even if a horizontal eye opening of only 50% is available, the size of W is only 5 steps. Figure 27 shows the block diagram of the coarse first synchronizer. Here, during the initial period after startup, a switch disconnects the loop filter capacitor and another set of switches converts the charge pump to a combinational circuit [21], i.e. the gate of the PMOS current source is connected to GND and the gate of the NMOS current sink to VDD. This essentially disables the fine tuning loop and ensures a coarse tuning step in every cycle of the divided clock that drives the coarse tuning loop.
The amount of time for which the coarse tuning loop should be run before enabling the fine tuning loop depends on the size of W and on the desired confidence for achieving lock within this period. In this example circuit, the coarse synchronizer used a 10 phase Delay Locked Loop (DLL). Assuming an extreme condition that the horizontal eye opening is only 50%, it means that the window size is 5 steps. From the analysis presented in the previous section, an absorption with 99% confidence needs 32 cycles. This corresponds to 128 ns in absolute time.
The coarse first synchronizer was designed and simulated for a benign channel. The receiver input eye diagram for this test is shown in Figure 28. The settling behaviour of this synchronizer, from an initial clock phase in the center of W, is shown in Figure 29. As seen in Figure 29, the circuit escapes the window W very fast and settles accurately once the fine tuning loop is enabled. The lower waveform is the select signal for switching from coarse first mode to normal mode; the circuit is run in coarse mode for 160 ns. This signal can be generated using a power on reset circuit. The same synchronizer circuit was simulated with an input that had very high ISI. The receiver input eye diagram for this test is shown in Figure 30. With this data input, the clock was initialized to the center of the window W. Even for this input, the circuit escapes the window W within the coarse first operating period. The noise in the control voltage is primarily due to the jitter in the incoming data.
B. Fine first synchronizer with bias to one of the absorbing states
Reduction of settling time using the coarse first technique is specific to the coarse+fine type synchronizer reported in [4]. Another, more general, technique of reducing the settling time is to introduce a deliberate mismatch in the relative strengths of the UP and DN updates in every cycle. Using the 1 bit ISI model for illustration purposes, we discuss the settling time of the synchronizer as follows.
Figure 32(a) shows the mean time to absorption as a function of initial sampling phase, when the step sizes for the left and the right shifts are (i) equal and (ii) mismatched by 10%.
Note that a 10% mismatch in the step size of the left/right shift reduces the mean absorption time by up to 40%. The effect of this mismatch is more pronounced in the standard deviation of the absorption time as a small asymmetry is shown to result in a large reduction in the standard deviation. Figure 32(b) shows the plot of the standard deviation of the absorption time with and without the UP/DN mismatch. Hence, to reduce the settling time of the circuit, one can deliberately introduce a mismatch in the UP/DN strengths of the charge pump in the circuit. Alternately, a training sequence that generates a similar mismatch, by producing ISI biased towards one side of the window W, can reduce the settling time.
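The effect of a step size mismatch can be explored with the sketch below. It places the clock on a grid of τ/10 so that left and right steps of 10 and 11 grid units represent a 10% mismatch, keeps P(LT) = P(RT) = 0.25, and accumulates the mean exit time as the sum of the survival probabilities. The grid resolution, the function name and the 20τ window (chosen here only to keep the run short) are assumptions, so the numbers it produces are not expected to reproduce Figure 32.

```python
def mean_exit_time(window, start, left_step, right_step,
                   p_left=0.25, p_right=0.25, tol=1e-10):
    """Mean cycles until the clock leaves the open interval (0, window).

    Positions and steps are integers on a fine grid (e.g. units of tau/10),
    so that unequal left/right step sizes can be represented exactly.
    """
    p_stay = 1.0 - p_left - p_right
    dist = {start: 1.0}                  # probability mass still inside W
    alive, mean = 1.0, 0.0
    while alive > tol:
        mean += alive                    # E[T] = sum over n of P(T > n)
        new = {}
        for pos, prob in dist.items():
            for step, p in ((0, p_stay), (-left_step, p_left),
                            (right_step, p_right)):
                nxt = pos + step
                if 0 < nxt < window:     # otherwise absorbed at an edge
                    new[nxt] = new.get(nxt, 0.0) + prob * p
        dist = new
        alive = sum(dist.values())
    return mean

matched = mean_exit_time(200, 100, 10, 10)     # equal steps, 20 tau window
mismatched = mean_exit_time(200, 100, 10, 11)  # right step 10% larger
print(matched, mismatched)
```

In this toy setting the mismatched case exits sooner from the center of the window; the size of the reduction depends on the window width and on how the mismatch is modeled.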
1) Reduction in settling time with charge pump mismatch: Figure 33 shows the reduction in the settling time of the circuit with the introduction of a 10% mismatch in the UP and the DN currents of the charge pump. Under identical initial conditions, a simulation with 10% mismatch showed >80% reduction in the time taken to exit the window W. Charge pump mismatch can result in increased jitter once the circuit has locked. Hence, if good jitter performance is desired, the introduced mismatch can be switched off after lock has been achieved.
2) Reduction in settling time with training sequence: A training sequence that biases the system towards either one of the directions can be used to reduce the settling time. One such training sequence is "...0010011100100111...". This sequence has a minimum run length of 1 bit for a logic '1' and 2 bits for a logic '0'. Hence, the ISI for logic '1' is always more than that for logic '0' and the phase detector's output in the window W is biased towards the right. Figure 34 shows the control voltage as the circuit settles, both when the above training sequence is used and when a random equiprobable binary sequence is used. Notice the considerable reduction in the settling time when the deliberately biased training sequence is used. One could also use an alternating 1 and 0 training sequence, which reduces the jitter due to ISI to zero. However, random uncorrelated jitter between the transmitter and receiver clocks will still result in a non-zero size of the window W.
VII. CONCLUSIONS
The effect of jitter on the settling time of mesochronous clock retiming circuits is discussed in this paper. It is shown how ISI induced jitter and random jitter can increase the settling time of clock recovery circuits indefinitely. A model of the system as a Markov chain with absorbing states is developed. Using Markov chain models, the effect of different types of jitter is analyzed. The model predictions of the settling time, in terms of the mean absorption time of the Markov chain, match well with behavioural simulations. Techniques for reducing the settling time are reported, which originate from the insights gained from the model. A coarse first synchronizer, which uses only coarse correction steps initially, is proposed. This architecture achieves quick settling even in the presence of substantial ISI. Another technique of reducing the settling time, by introducing a mismatch between the phase updates in either direction of the clock retiming circuit, is also discussed. This is applicable to phase interpolator based clock retiming circuits as well. This mismatch is achieved either by introducing a mismatch in the charge pump or by using appropriately designed training data. All the suggested fast settling synchronizers are verified with circuit simulations.
APPENDIX A MEAN TIME TO ABSORPTION OF A MARKOV CHAIN
Knowing the probability transition matrix of a Markov chain, the mean time to absorption and its variance can be computed. We will outline an example computation in this Appendix. The state diagram of the Markov chain representation of the clock recovery circuit, for data with 1 bit ISI, is shown in Fig. 13. The states corresponding to −T W /2 and T W /2, which are at the edges of the window of susceptibility, are absorbing states. The probability transition matrix P for this Markov chain can be written as
To calculate the mean time to absorption (and the variance of the time to absorption), P is first written in the canonical form [18]. This is obtained by reordering the entries in P to separately aggregate all the transient and absorbing states respectively. Hence, P can be written as
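For reference, the standard absorbing-chain relations that such a computation relies on can be summarized as follows. The block ordering (transient states first), the symbols Q, R, N and t, and the entries 1/4 and 1/2 (taken from the 1 bit ISI probabilities of Section IV-A) are assumptions about notation, and the arrangement used in [18] may differ.

```latex
% Canonical form: Q holds transient-to-transient, R transient-to-absorbing
P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}, \qquad
Q = \begin{pmatrix}
  \tfrac12 & \tfrac14 &        &          \\
  \tfrac14 & \tfrac12 & \ddots &          \\
           & \ddots   & \ddots & \tfrac14 \\
           &          & \tfrac14 & \tfrac12
\end{pmatrix}

% Fundamental matrix, mean time to absorption, and its variance
N = (I - Q)^{-1}, \qquad
t = N \mathbf{1}, \qquad
\operatorname{Var} = (2N - I)\, t - t \circ t
```

Here the i-th entry of t is the mean number of transitions before absorption when starting from the i-th transient state, and t ∘ t denotes the element-wise square of t.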
