An analog adaptive decision-feedback equalizer (DFE) is described. The DFE cancels intersymbol interference using four feedback taps, and a fifth tap cancels DC offset. The coefficient for each tap is adapted using a small mixed-signal integrator. The DFE dissipates 220 mW at a data rate of 150 Mb/s. The active area is 1.8 mm² in a 1-µm CMOS process.
I. Introduction
As the density in modern disk drives increases, the intersymbol interference (ISI) becomes more severe. As a result, sampling detectors have been used to combat the increased ISI, allowing reliable high-density hard-disk systems to be built [1]. In particular, sampling detectors that use partial-response maximum-likelihood (PRML) detection are widely used. Another sampling detector that can be used is the decision-feedback equalizer (DFE). These equalizers are commonly used in digital communications applications such as digital subscriber lines (DSL) and computer networks.
Recently, read channels that use an adaptive DFE have been reported. One such channel uses a RAM-DFE and does all the required computations in the digital domain [2, 3]. An analog DFE-based read channel was recently presented [4]. In this DFE, the ISI cancellation and the adaptive loops are realized with analog circuits. Conventional (or "linear") mixed-signal DFEs and a mixed-signal RAM-DFE have proven the feasibility and efficiency of mixed-signal implementations of DFE detectors [5]-[7]. These DFEs use digital integrators in the adaptive loops. This paper describes an adaptive analog DFE for the disk-drive read channel. The key difference in this DFE is that each coefficient is adapted using a mixed-signal integrator that is much smaller than the integrator used in two previous mixed-signal DFEs [5, 6].
The paper is organized as follows. Section II presents background on the DFE-based read channel. The mixed-signal integrator is described in Section III. In Section IV, the architecture of the analog DFE and the key circuits are described. Finally, the measured results are summarized in Section V.
II. Background
Binary data is stored as a magnetization pattern on a hard disk. The orientation of the field is determined by the binary data. A read head responds to changes in the recorded magnetic field. A positive (or negative) pulse is generated each time there is a data transition from 0 to 1 (or 1 to 0). The response to a change in the magnetization pattern can be modeled by a Lorentzian function given by the equation
$$h(t) = \frac{1}{1 + \left(2t / PW_{50}\right)^{2}}$$
where PW50 is the width of the pulse measured between the half-amplitude levels [8]. The read signal is a superposition of these Lorentzian pulses. Typical recording densities have a PW50 that is between 2T and 3T, where T is one bit period. The peak of the Lorentzian is referred to as the cursor location. Precursor ISI is due to nonzero samples of the read signal before the cursor. Postcursor ISI is caused by nonzero samples of the read signal after the cursor. There is significant ISI because the pulse is much wider than the bit period T.
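To make the severity of the ISI concrete, the following sketch samples a unit-amplitude Lorentzian pulse at multiples of the bit period for PW50 = 2T. It assumes the common normalized form h(t) = 1/(1 + (2t/PW50)^2); the function and variable names are illustrative and are not taken from the paper.

```python
def lorentzian(t, pw50):
    """Unit-amplitude Lorentzian pulse; equals 0.5 at |t| = pw50 / 2."""
    return 1.0 / (1.0 + (2.0 * t / pw50) ** 2)


T = 1.0            # one bit period (normalized)
PW50 = 2.0 * T     # pulse width at half amplitude

# Bit-rate samples around the cursor (k = 0).
samples = {k: lorentzian(k * T, PW50) for k in range(-3, 4)}

for k, h in samples.items():
    if k == 0:
        kind = "cursor"
    elif k < 0:
        kind = "precursor ISI"
    else:
        kind = "postcursor ISI"
    print(f"k = {k:+d}   h = {h:.3f}   {kind}")
```

With PW50 = 2T, the samples one bit period away from the cursor are half the cursor amplitude, which illustrates why equalization is needed at these densities.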
A DFE-based read channel that can remove ISI is shown in Figure 1. The read signal is first amplified and then sent through a low-pass filter (LPF) to remove out-of-band noise. The forward equalizer is an analog filter that removes precursor ISI. The DFE consists of the feedback equalizer, summer, and slicer. The slicer makes binary decisions that are estimates of the data recorded on the disk. The output of the feedback equalizer is an analog signal, formed as a linear combination of past decisions, that cancels the postcursor ISI remaining after forward equalization. By canceling the ISI in the analog domain, a front-end high-speed ADC is not required and a relatively large digital forward equalizer can be replaced by a smaller analog forward equalizer, which saves power and die area.
This paper describes a prototype analog DFE that includes the circuits shown in the dashed box of Figure 1 and the circuits to adapt each feedback equalizer coefficient. Coefficients c1-c4 cancel postcursor ISI, and coefficient c0 cancels DC offset at the slicer input, which could be from the analog forward equalizer output or the analog circuits in the DFE [9]. After each decision, an error is computed as the difference between the slicer output and input. The error is then quantized to one bit to simplify the adaptation hardware. Each coefficient is adapted using a discrete-time integrator that implements the sign-error least-mean-square (LMS) algorithm [10]. The postcursor ISI is cancelled by adapting coefficients c1-c4 using the update
$$c_k[n+1] = c_k[n] + \mu\,\hat{e}[n]\,\hat{a}[n-k], \qquad k = 1, \ldots, 4$$
where â[n-k] is the decision delayed by k bit periods, ê[n] is the 1-b quantized error, and n is the discrete-time index. Coefficient c0 is adapted using
$$c_0[n+1] = c_0[n] + \mu\,\hat{e}[n]$$
Only the integrator for adapting coefficient c1 is shown in Figure 2. A small step-size factor µ ≈ 1/1000 is needed to keep the mean-squared error (MSE) small. The MSE is defined as
$$\mathrm{MSE} = E\left\{ e^{2}[n] \right\}$$
which is the mean-squared value of the slicer error. A typical design goal is to keep the noise-free MSE below -25 dB. This value is a measure of the uncancelled ISI.
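The adaptation described above can be illustrated with a small behavioral simulation. This is a sketch under stated assumptions, not the authors' model: the error is taken as e = (slicer output) - (slicer input), the feedback is summed with the input, and each coefficient moves by µ times the sign of the error times the corresponding past decision (the offset tap c0 uses the sign of the error alone). The postcursor taps `h_post` and the offset `dc` are hypothetical values chosen so that the eye is open before adaptation.

```python
import math
import random


def sgn(v):
    """One-bit quantizer: +1 for v >= 0, otherwise -1."""
    return 1.0 if v >= 0 else -1.0


def simulate_dfe(n_bits=20000, mu=1e-3, seed=1):
    rng = random.Random(seed)
    h_post = [0.30, 0.15, 0.08, 0.04]   # hypothetical postcursor ISI (assumption)
    dc = 0.03                           # hypothetical DC offset (assumption)
    c = [0.0] * 5                       # c[0]: offset tap, c[1..4]: feedback taps
    a_hist = [1.0] * 4                  # true data a[n-1] .. a[n-4]
    d_hist = [1.0] * 4                  # decisions  â[n-1] .. â[n-4]
    sq_err = []
    for n in range(n_bits):
        a = rng.choice((-1.0, 1.0))
        # DFE input: cursor plus postcursor ISI plus offset (noise-free).
        y = a + sum(h * p for h, p in zip(h_post, a_hist)) + dc
        # Slicer input: DFE input summed with the feedback equalizer output.
        x = y + c[0] + sum(c[k + 1] * d_hist[k] for k in range(4))
        d = sgn(x)                      # binary decision
        e = d - x                       # slicer error
        e_q = sgn(e)                    # error quantized to one bit
        # Sign-error LMS updates.
        c[0] += mu * e_q
        for k in range(4):
            c[k + 1] += mu * e_q * d_hist[k]
        a_hist = [a] + a_hist[:3]
        d_hist = [d] + d_hist[:3]
        if n >= n_bits - 2000:          # measure MSE over the last 2000 bits
            sq_err.append(e * e)
    mse_db = 10.0 * math.log10(sum(sq_err) / len(sq_err))
    return c, h_post, dc, mse_db


c, h_post, dc, mse_db = simulate_dfe()
print("adapted coefficients:", [round(v, 3) for v in c])
print("steady-state noise-free MSE (dB):", round(mse_db, 1))
```

Under these assumptions the feedback coefficients settle near the negatives of the postcursor taps and c0 settles near the negative of the offset, with a residual jitter of a few multiples of µ.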
III. Mixed-Signal Integrator
Discrete-time integrators that can be used to adapt each coefficient are shown in Figure 3 .
In a previous mixed-signal DFE implementation, a 10-b up/down (U/D) counter followed by a 6-b DAC was used to build each discrete-time integrator, as shown in Figure 3(a) [5]. The integer input is a binary signal with value ±1. This signal determines whether the counter should increment or decrement its current value according to the update equations. In the mixed-signal integrator of Figure 3(c), power dissipation and die area are saved by replacing the 6-b counter and DAC with a single analog integrator. However, there is one problem with this mixed-signal integrator: input offset from the analog integrator causes a nonzero input-referred offset at the pre-counter input. Nonzero integrator offsets increase the MSE, which degrades performance [11].
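An improved version of the mixed-signal integrator that has a lower input-referred offset is shown in Figure 4. The analog integrator is implemented as a charge pump [12], which is shown here as two ideal current sources, I1 and I2, that are switched onto the integrating capacitor C by the signals U and D. The change in the capacitor voltage in one bit period is
$$\Delta V = \frac{\Delta t}{C}\left(U I_1 - D I_2\right) = \frac{\bar{I}\,\Delta t}{C}\left[(U - D) + \frac{I_{off}}{2\bar{I}}\,(U + D)\right] \tag{4}$$
where Δt is the time that either switch is closed in a bit period, Ī is the average of I1 and I2, and
$$I_{off} = I_1 - I_2$$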
The term in (4) that includes U-D is the desired input to the analog integrator and can be +1, -1, or 0. The U+D term is the input-referred offset of the charge pump. Ideally, I1 = I2 and Ioff is zero. However, when the currents I1 and I2 are mismatched, the offset term is nonzero and the coefficients are not optimal. This offset affects coefficient ck only when U = Carry or D = Borrow goes high. A change was made to the pre-counter to eliminate frequent sequences of Carry = 1 or Borrow = 1, which can occur if the count value of the pre-counter settles near positive or negative full scale and toggles between the two values due to noise at the input of the pre-counter. In that case the charge-pump offset is integrated frequently because the U and D switches are toggled often. A reset is added to the pre-counter to reduce the accumulation of the offset. When the counter reaches its maximum value (+7) or its minimum value (-7), the pre-counter is reset to its mid-scale value, zero. With this change, the 4-b count value cannot toggle between its maximum and minimum values on consecutive clock cycles. As a result, the charge pump spends most of the time with both switches open: either the U or D switch can be enabled at most once every 7 clock cycles.
To refer the charge-pump offset to the counter input, the average gain of the pre-counter must be determined. For a pre-counter input with a positive DC value, the U-D output of the pre-counter will be +1 on average 1/7 as often as the pre-counter input because the pre-counter must count up 7 times before it overflows and outputs Carry = 1 after it has been reset to mid-scale. Similarly, for a negative DC input, the pre-counter must count down 7 times before it underflows and outputs Borrow = 1 and U-D = -1. As a result, the pre-counter can be viewed as having a DC gain of 1/7.
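The DC gain of 1/7 can be checked with a short behavioral model of the pre-counter with reset. This is a minimal sketch: the function names are hypothetical, and the model only captures the counting, Carry/Borrow, and reset-to-zero behavior described in the text, not the circuit itself.

```python
def precounter_step(count, direction, limit=7):
    """Advance the pre-counter by direction (+1 or -1).

    Returns (new_count, carry, borrow). Reaching +limit or -limit produces
    a Carry or Borrow pulse and resets the count to mid-scale (zero).
    """
    count += direction
    if count >= limit:
        return 0, 1, 0
    if count <= -limit:
        return 0, 0, 1
    return count, 0, 0


def run(inputs, limit=7):
    """Return the U - D output sequence for a sequence of +1/-1 inputs."""
    count, out = 0, []
    for d in inputs:
        count, carry, borrow = precounter_step(count, d, limit)
        out.append(carry - borrow)
    return out


up = run([+1] * 700)                 # constant positive input
down = run([-1] * 700)               # constant negative input
alternating = run([+1, -1] * 350)    # zero-mean input

dc_gain_up = sum(up) / len(up)
pulse_times = [n for n, v in enumerate(up) if v != 0]
min_spacing = min(b - a for a, b in zip(pulse_times, pulse_times[1:]))

print("DC gain for constant +1 input:", dc_gain_up)
print("minimum spacing between U pulses:", min_spacing)
print("pulses for alternating input:", sum(abs(v) for v in alternating))
```

A constant input yields one output pulse per 7 input samples, and a zero-mean alternating input yields no pulses at all, so the charge-pump switches stay open.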
The offset at the pre-counter input is given by the equation
$$\mathrm{offset}_{in} = \frac{I_{off}}{2\bar{I}} \cdot \frac{f_{U,D}}{G_{DC}} \tag{5}$$
Here, Ioff/(2Ī) is the normalized input offset of the charge pump, G_DC = 1/7 is the DC gain of the pre-counter, and f_{U,D} is the frequency of U=1 or D=1. To refer the charge-pump offset to the digital input, it has been divided by the DC gain of the pre-counter and multiplied by the frequency of U=1 or D=1, since the charge-pump offset is only accumulated under those two conditions. The frequency of U=1 or D=1 is defined as the number of occurrences of U=1 or D=1 divided by the total number of samples. In steady state, negative feedback in each adaptive loop causes the DC component at the pre-counter input to have an average value near zero, and the resulting frequency of U=1 or D=1 is about once every 50 samples, based on simulation. From (5), this gives an input-referred offset at the digital input of
$$\mathrm{offset}_{in} \approx \frac{I_{off}}{2\bar{I}} \cdot 7 \cdot \frac{1}{50} \approx \frac{1}{7} \cdot \frac{I_{off}}{2\bar{I}}$$
So, the offset at the pre-counter input is about 1/7 of the normalized charge-pump offset. Therefore, the pre-counter with reset has a DC gain of 1/7 and gives a low input-referred offset for the entire mixed-signal integrator.
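The arithmetic behind the "about 1/7" figure can be written out as follows. This sketch assumes the relation stated in the text: the normalized charge-pump offset is divided by the pre-counter DC gain (1/7) and multiplied by the frequency of U=1 or D=1 (about 1/50 from simulation). The 10% mismatch used in the example is a hypothetical value.

```python
DC_GAIN = 1.0 / 7.0    # DC gain of the pre-counter with reset
F_UD = 1.0 / 50.0      # fraction of samples with U=1 or D=1 (simulated value from the text)


def input_referred_offset(normalized_cp_offset, dc_gain=DC_GAIN, f_ud=F_UD):
    """Refer the normalized charge-pump offset Ioff/(2*Iavg) to the pre-counter input."""
    return normalized_cp_offset * f_ud / dc_gain


scale = input_referred_offset(1.0)       # factor applied to the charge-pump offset
example = input_referred_offset(0.10)    # hypothetical 10% normalized offset

print("scale factor:", round(scale, 3), " (1/7 is", round(1.0 / 7.0, 3), ")")
print("input-referred offset for a 10% charge-pump offset:", round(example, 4))
```

The scale factor evaluates to 7/50 = 0.14, which is close to 1/7 ≈ 0.143.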
The choice of 4 counter bits was determined by simulation. Figure 5 shows the MSE for various B-bit pre-counters. The loop gain is reduced by using more counter bits, and this results in a lower MSE. However, the performance is limited by the number of DFE coefficients when a large number of counter bits is used. Based on these results, a good choice for this DFE is a 4-b pre-counter since there is little performance gain from using more bits.
IV. Circuit Design
A block diagram of the analog DFE prototype is shown in Figure 6. All analog signals (thin lines) are fully differential. The input from the forward equalizer is a current, which is summed with the output of the feedback equalizer by simply tying these lines together. Current-output multipliers form the product of the coefficient ck and the delayed binary data â[n-k]. They are implemented using switched transconductors [5]. The resulting current is converted to a voltage by a transresistance (or i2v) stage. A 2-b flash ADC generates the decision â and the 1-b error ê. The ADC consists of 3 pre-amplifiers, 3 comparators, and some decoding logic. The pre-amplifiers and comparators are similar to the ones used in [7]. Only the adaptive loop for coefficient c1 is shown in the figure; however, there are five mixed-signal integrators on the prototype to adapt the coefficients c0-c4.
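One way to see how three comparators can produce both the decision and the 1-b error is sketched below. The threshold placement (at the two nominal slicer levels and at zero) and the decoding rule are assumptions made for illustration; the text does not give the decoding logic. The slicer input is normalized so that the desired levels are ±1, and the error is taken as the slicer output minus the slicer input.

```python
def flash_adc_2b(x, level=1.0):
    """Return (decision, error_sign), each +1 or -1, for slicer input x.

    Assumed thresholds: -level, 0, +level (one comparator each).
    """
    above_low = x > -level
    above_mid = x > 0.0
    above_high = x > level
    decision = 1 if above_mid else -1
    # The error sign is the sign of (decision - x), so only the comparator
    # at the decided level matters.
    if decision == 1:
        error_sign = -1 if above_high else 1
    else:
        error_sign = -1 if above_low else 1
    return decision, error_sign


test_inputs = [-1.3, -0.7, -0.1, 0.2, 0.9, 1.4]
for x in test_inputs:
    d, e = flash_adc_2b(x)
    print(f"x = {x:+.1f}   decision = {d:+d}   error sign = {e:+d}")
```

With this decoding, the middle comparator alone sets the decision, and the decision selects which outer comparator supplies the error sign.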
A simplified schematic of the charge pump is shown in Figure 7.
A die photograph of the analog DFE is shown in Figure 9. The IC was fabricated in a 1-µm single-poly double-metal CMOS process. The core area is 1.1 mm x 1.6 mm. The entire chip including pads is 2.3 mm x 3.1 mm.
Figure 10 shows the test setup. The test signal was generated on a workstation by taking random binary data, differencing it, and then convolving the result with a Lorentzian with PW50 = 2T. The resulting waveform models a read signal. The precursor ISI is removed by a 5-tap FIR forward equalizer. The result is loaded into an arbitrary waveform generator. Its output is summed with band-limited white Gaussian noise and then fed into the analog DFE. The decisions from the DFE are compared with the correct decisions, which are the known data used to generate the test signal. A logic analyzer counts the bit errors. Twenty-five test chips were fabricated and all were fully functional. The following data is from one of those chips.
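The first steps of the test-signal generation (random data, differencing, convolution with a Lorentzian) can be sketched as follows. This is an illustration only: it assumes a unit-amplitude Lorentzian of the form 1/(1 + (2t/PW50)^2) sampled once per bit period, truncates the pulse to a finite span, and omits the 5-tap forward equalizer and the added noise because their parameters are not given in the text.

```python
import random


def lorentzian(t, pw50):
    """Unit-amplitude Lorentzian pulse; equals 0.5 at |t| = pw50 / 2."""
    return 1.0 / (1.0 + (2.0 * t / pw50) ** 2)


def read_signal(bits, pw50=2.0, span=20):
    """Bit-rate samples of a modeled read signal for a list of 0/1 bits (T = 1)."""
    # Differencing: +1 for a 0-to-1 transition, -1 for 1-to-0, 0 otherwise.
    trans = [0] + [bits[n] - bits[n - 1] for n in range(1, len(bits))]
    signal = []
    for n in range(len(bits)):
        lo, hi = max(0, n - span), min(len(bits), n + span + 1)
        # Superposition of Lorentzian pulses, one per transition.
        signal.append(sum(trans[m] * lorentzian(n - m, pw50) for m in range(lo, hi)))
    return signal


# A single 0-to-1 transition reproduces the isolated pulse.
step = read_signal([0] * 10 + [1] * 10)

# Random data, as in the test setup.
rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in range(200)]
r = read_signal(bits)
print("first samples of the modeled read signal:", [round(v, 3) for v in r[:8]])
```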
V. Measured Results
Shown in Figure 11 is the equalized slicer input at 10 Mb/s after the coefficients have adapted, without noise added to the DFE input. Here, the output of the transresistance stage is directly routed off-chip through a MOS switch, so the DFE had to be operated at a low clock speed to view this signal. This output path is disabled during normal operation. In this plot, the slicer input has been normalized to ±1. The actual desired slicer input voltage is ±800 mV. The measured MSE is -27.8 dB, which meets the typical goal of keeping the MSE below -25 dB.
Shown in Figure 12 is the measured bit-error rate (BER) performance of the analog DFE during steady-state operation. These test results are for a Lorentzian channel with PW50 = 2T. The graph shows the measured bit-error rate versus SNR at the DFE input. The solid line plots the simulated BER of an ideal slicer with no ISI at its input, which is a lower bound for a DFE. However, this lower bound does not take into account error propagation, which degrades performance by about 0.25 dB at an SNR of 15 dB. The dashed line with circles plots the measured BER at 100 Mb/s. Here, the DFE is performing within 0.3 dB of the lower bound for SNR < 15 dB. The dotted line with triangles plots the measured BER at 150 Mb/s. Here, the DFE is performing within 0.9 dB of the lower bound for SNR < 15 dB.
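For reference, the BER of an ideal binary slicer with no ISI in additive Gaussian noise can be computed in closed form. The sketch below assumes that SNR is defined as the squared slicer level divided by the noise variance; the SNR definition used for Figure 12 is not stated in the text, so this curve may be shifted relative to the one in the figure.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ideal_slicer_ber(snr_db):
    """BER of an ideal slicer for +/-1 data, assuming SNR = 1 / sigma^2."""
    level_over_sigma = 10.0 ** (snr_db / 20.0)
    return q_function(level_over_sigma)


for snr_db in (10, 12, 14, 15):
    print(f"SNR = {snr_db} dB   BER = {ideal_slicer_ber(snr_db):.2e}")
```

Because the curve is steep in this region, a fraction of a dB of loss corresponds to a noticeable change in BER, which is why the comparison with the lower bound is stated in dB.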
The two plots in Figure 13 show the measured differential coefficient voltages during steady-state operation. Shown in Figure 13(a) are the coefficient voltages when the 4-b pre-counters are bypassed. Here, the integrator consists only of the charge pump. Shown in Figure 13(b) are the coefficient voltages in steady state when the entire mixed-signal integrator is used. Here, the measured variance is reduced by about a factor of 7 when compared to the previous case. This is due to the lower effective gain when using the pre-counter.
The measured plots in Figure 14 show the coefficients during adaptation. With the 4-b pre-counters in use, the adaptation process takes approximately 1200 bit periods, which is about a factor of 7 increase in convergence time when compared to Figure 14(a). In a disk-drive system, a short convergence time is desired and can be achieved by bypassing the 4-b pre-counters initially to give fast convergence, then switching in the 4-b pre-counters to reduce the coefficient variance in steady-state operation.
Table 1 provides a performance summary for this analog DFE. It also compares the performance of the analog DFE with a similar mixed-signal DFE [5] and a mixed-signal RAM-DFE [7]. Comparison with the DFE in the second column is complicated by the fact that it was fabricated in a different, 2-µm CMOS process. Applying a simple scale factor of 1/4 to the area of that DFE (1/4 = (1 µm / 2 µm)²), the analog DFE described in this paper would be about 1/3 the size of a scaled implementation of the previous mixed-signal DFE in [5]. A RAM-DFE differs significantly from a "linear" DFE in that the RAM-DFE can cancel nonlinear ISI and has only one integrator for adaptation. However, our analog DFE and the mixed-signal RAM-DFE were fabricated in the same 1-µm CMOS process. The analog DFE occupies less than half the die area of the mixed-signal RAM-DFE. It also achieves a higher data rate and dissipates less power. The BER performance of our DFE is comparable to that of the other DFEs when operating at 100 Mb/s. This comparison shows that our DFE, which uses mixed-signal integrators, achieves performance that compares favorably with other mixed-signal DFEs but uses less IC area.
VI. Conclusion
An analog DFE that uses a small mixed-signal integrator to adapt each coefficient has been presented. This analog DFE can be implemented in a digital CMOS process since a linear capacitor is not required. When this DFE is used with an analog forward equalizer, a high-speed ADC and digital equalizers are not needed, saving area and power. The use of the mixed-signal integrators reduces the die area as well as power dissipation when compared to previous work [2]-[7]. The performance of the analog DFE, in terms of BER, is comparable to previous DFE implementations. This DFE may be of interest in disk drives and in other high-speed digital communications applications.
Figure 1. Block diagram of a DFE-based read channel.
