I. INTRODUCTION
CONTINUOUS-TIME filters have been widely used in various high-speed applications, such as high data-rate hard-disk read channels and wireline and wireless communications. The 3 dB cutoff frequency of the low-pass filter must increase with the speed requirement of the application, yet the performance constraints, such as low power dissipation, small area, high linearity, and flat group delay, are not relaxed at all. A topology that offers simplicity, modularity, an open-loop configuration, and electronic tunability is therefore the natural choice for high-frequency filter design.
The operational transconductance amplifier (OTA) is the main building block in the filter topology [1]-[3]. The key function of the OTA is to convert the input voltage into an output current while maintaining both accuracy and linearity. However, the nonidealities of the OTA dominate the filter performance: the parasitic capacitors shift the 3 dB cutoff frequency of low-pass filters, the finite output impedance affects the quality factor, and the voltage-to-current conversion affects the filter linearity. Many highly linear circuits have been proposed [4], [5], but they are difficult to use when high speed is required.
In this paper, a 1 GHz fourth-order equiripple linear-phase low-pass filter, based on a high-performance OTA, with an automatic tuning circuit is presented. The filter targets fast read-channel storage systems. Section II develops the high-speed OTA based on Nauta's inverter structure [6]. Owing to the pseudo-differential structure, a suitable common-mode control system must be included. In Section III, the proposed filter is designed by cascading two biquadratic filters. A modified automatic tuning circuit, designed to suppress the effects of process corner variation and temperature, is also discussed in Section III. The measurement results are shown in Section IV. Finally, conclusions are presented in Section V.
II. OPERATIONAL TRANSCONDUCTANCE AMPLIFIER
In this section, the proposed OTA circuit is discussed. Fig. 1 shows a simplified diagram of the proposed OTA, drawn with inverter symbols. The OTA circuit can be divided into four parts. In this diagram, the voltage-to-current conversion is performed by inverters I1 and I2. The common-mode feedforward (CMFF) and common-mode feedback (CMFB) controls are each implemented by further inverters, and gain enhancement is achieved by an additional pair of inverters. Detailed concepts are analyzed in the following subsections.
A. The Voltage-to-Current Conversion
The circuit diagram of the class-AB OTA is shown in Fig. 2. The voltage-to-current conversion circuit is composed of transistors M1 to M4, where the device parameter of M1 equals that of M3 and the device parameter of M2 equals that of M4. These transistors operate in the saturation region, and the signals applied to the gate terminals consist of the input common-mode voltage and the input differential voltage. Using the square-law equation for saturated transistors, the voltage-to-current conversion can be analyzed, and the transconductance is given by (1) in terms of the supply voltage, the threshold voltages of the nMOS and pMOS transistors, and the transistor device parameters. Thus, the transconductance depends on the device parameters, the supply voltage, and the threshold voltages. Note that when large device sizes are used, the OTA remains linear even when the nMOS and pMOS devices are not matched. For the circuit to operate at very high frequency, the absence of internal nodes is essential to avoid the effects of parasitic capacitance. The simple voltage-to-current conversion composed of transistors M1-M4 has no internal nodes. Thus, the only parasitic capacitance in the signal path results from the transistor channel, and the resulting pole is located in the tens-of-GHz range. In addition, a large transconductance should be designed because the 3 dB cutoff frequency of the low-pass topology is proportional to the transconductance. With the small feature sizes of nano-scale CMOS technology, the drain current of a single MOS transistor can be approximated by (2), which includes the mobility reduction coefficient. Assuming matched nMOS and pMOS parameters, a single common parameter can be defined.
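The square-law argument above can be checked numerically. The following is a minimal sketch, assuming two ideal inverters driven by the common-mode voltage plus and minus half the differential voltage; the function names, the device values, and the closed-form expression in `predicted_gm` are illustrative assumptions derived from the ideal square law and are not claimed to reproduce the form of (1).

```python
# Sketch: differential output current of two ideal square-law CMOS inverters.
# All names and numeric values are illustrative assumptions, not data from the paper.

def inverter_output_current(v_in, vdd, beta_n, beta_p, vtn, vtp_abs):
    """Current delivered to the load by one inverter (pMOS current minus nMOS current).

    Both devices are assumed saturated and to follow I = (beta / 2) * Vov**2.
    """
    i_n = 0.5 * beta_n * (v_in - vtn) ** 2
    i_p = 0.5 * beta_p * (vdd - v_in - vtp_abs) ** 2
    return i_p - i_n


def differential_current(v_id, v_cm, **device):
    """Difference of the two inverter output currents for a differential input v_id."""
    i_plus = inverter_output_current(v_cm + v_id / 2.0, **device)
    i_minus = inverter_output_current(v_cm - v_id / 2.0, **device)
    return i_plus - i_minus


def predicted_gm(v_cm, vdd, beta_n, beta_p, vtn, vtp_abs):
    """Magnitude of the transconductance implied by the ideal square law."""
    return beta_n * (v_cm - vtn) + beta_p * (vdd - v_cm - vtp_abs)


# Hypothetical device set with deliberately unequal beta_n and beta_p.
DEVICE = dict(vdd=1.5, beta_n=4e-3, beta_p=3e-3, vtn=0.45, vtp_abs=0.45)
V_CM = 0.75

if __name__ == "__main__":
    gm = predicted_gm(V_CM, **DEVICE)
    for v_id in (0.05, 0.1, 0.2):
        i_d = differential_current(v_id, V_CM, **DEVICE)
        # The inverter is inverting, so i_d is expected to equal -gm * v_id.
        print(v_id, i_d, -gm * v_id)
```

In this idealized model the quadratic terms cancel in the difference, so the differential current is linear in the differential voltage even with unequal device parameters. With equal device parameters the expression in `predicted_gm` no longer depends on the common-mode voltage.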
From a Taylor series expansion, the third-order harmonic distortion term is the dominant distortion component of the OTA, and HD3 is given by (3). Thus, the linearity can be improved by a larger overdrive voltage, which requires a higher supply voltage or a smaller threshold voltage. Usually, half the supply voltage is chosen as the input common-mode voltage when the nMOS and pMOS devices are matched. In this situation, the output voltage is close to half the supply voltage, and thus no output common-mode current appears.
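The dependence of HD3 on the overdrive voltage can be illustrated numerically. The sketch below assumes a commonly used mobility-reduction model, I = (beta/2) Vov^2 / (1 + theta Vov), applied to one pseudo-differential transistor pair; the model, the parameter values, and the small-signal expression in `hd3_small_signal` are assumptions worked out for this illustration and are not claimed to match (2) or (3).

```python
import math

# Sketch: HD3 of a pseudo-differential pair with mobility reduction.
# Model and values are illustrative assumptions, not taken from the paper.


def drain_current(v_ov, beta, theta):
    """Square-law current with a first-order mobility reduction term."""
    return 0.5 * beta * v_ov ** 2 / (1.0 + theta * v_ov)


def harmonic_amplitude(samples, h):
    """Sine-component amplitude of harmonic h over one period of samples."""
    n = len(samples)
    return (2.0 / n) * sum(
        s * math.sin(2.0 * math.pi * h * k / n) for k, s in enumerate(samples)
    )


def hd3(v_ov, amp, beta=4e-3, theta=0.5, n=256):
    """Ratio of third-harmonic to fundamental amplitude of the differential current."""
    samples = []
    for k in range(n):
        v_id = amp * math.sin(2.0 * math.pi * k / n)
        i_plus = drain_current(v_ov + v_id / 2.0, beta, theta)
        i_minus = drain_current(v_ov - v_id / 2.0, beta, theta)
        samples.append(i_plus - i_minus)
    return abs(harmonic_amplitude(samples, 3) / harmonic_amplitude(samples, 1))


def hd3_small_signal(v_ov, amp, theta=0.5):
    """Third-order Taylor estimate of HD3 for the assumed current model."""
    u = amp / 2.0
    x = theta * v_ov
    return theta * u ** 2 / (4.0 * v_ov * (1.0 + x) ** 2 * (2.0 + x))


if __name__ == "__main__":
    for v_ov in (0.3, 0.4, 0.5):
        print(v_ov, hd3(v_ov, 0.2), hd3_small_signal(v_ov, 0.2))
```

Under these assumptions the computed HD3 falls as the overdrive voltage increases, which agrees with the trend stated in the text.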
Device mismatch causes second-order effects in the voltage-to-current conversion and thus degrades the linearity performance. In simulation, distortion below -45 dB can easily be achieved with 1% device mismatch. Furthermore, a common-centroid layout is used in the OTA to maintain the performance.
Because of the required large transconductance, the thermal noise, which dominates the noise performance of high-speed circuits, is reduced as well. The PSRR of the circuit depends on the gain of the OTA and is obtained from a signal transfer response. In this circuit, the power supply rejection is 35 dB at low frequency. In addition, two-tone signals can be used to obtain the IM2 performance. In our simulation, a value below -55 dB can easily be achieved.
B. Common-Mode Control System
The OTA behaves as a pseudo-differential structure and thus a CMFF circuit should be used to restrict the effect caused by the variation of the input common-mode signal. Transistors M9 and M14 have one half of the device parameter of transistor M1, and transistors M10 and M13 have one half of the device parameter of transistor M2. The input common-mode signal can be obtained by using transistors M9, M10, M13, and M14. Then, the quantity of the input common-mode variation would be cancelled out at output nodes through transistors M11 and M12.
Owing to the cascaded structure of the topology, the output nodes of one OTA are the input nodes of the following OTA. The output common-mode voltage should be fixed to the value of the input common-mode voltage so that the linearity of the designed filter is preserved. In our circuit, the output common-mode voltage is maintained by an adaptive CMFB circuit, which includes transistors MFB1 and MFB2. No additional common-mode sensing circuit is required because the output common-mode information appears at the next OTA stage. Thus, the output common-mode voltage is detected through the voltage Vcnext, which is the Vc node of the next OTA stage. The signal produced by the CMFB circuit is then combined with the CMFF circuit to adjust the output common-mode voltage accordingly.
In this control system, the CMFF topology forwards the input signal and performs signal cancellation in the current domain. It does not amplify the signal, so no stability issue arises. On the other hand, stability is important for the CMFB circuit. To check the stability of the CMFB loop, the feedback loop is broken and the phase margin is examined. The open-loop gain is determined by the transconductance of MFB1 and the output impedance of the OTA. The dominant pole is set by the loading capacitance, which is about 0.9 pF in our design. The non-dominant pole is determined by the capacitance seen from node Vc, which is small compared with the loading capacitance, and the transconductances of transistors M11 and M12. For the open-loop simulation of the common-mode system, a large inductor is placed in the loop to open the small-signal path. In this system, the Vc node introduces the non-dominant pole, and a phase margin of 53° is obtained with a loading capacitance of 0.9 pF.
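The phase-margin check described above can be sketched for a generic two-pole loop. The model A(s) = A0 / ((1 + s/p1)(1 + s/p2)), the function name, and all numeric values below are assumptions chosen for illustration; they are not the simulated values of the CMFB loop.

```python
import math

# Sketch: phase margin of a generic two-pole open-loop gain.
# The two-pole model and every numeric value are illustrative assumptions.


def phase_margin_two_pole(dc_gain, p1, p2):
    """Phase margin in degrees of A(s) = dc_gain / ((1 + s/p1) * (1 + s/p2)).

    p1 and p2 are pole frequencies in rad/s; dc_gain must be larger than 1.
    """
    a = 1.0 / p1 ** 2
    b = 1.0 / p2 ** 2
    # |A(jw)| = 1  <=>  (1 + a*w2) * (1 + b*w2) = dc_gain**2, with w2 = w**2.
    disc = (a + b) ** 2 + 4.0 * a * b * (dc_gain ** 2 - 1.0)
    w2 = (-(a + b) + math.sqrt(disc)) / (2.0 * a * b)
    w_unity = math.sqrt(w2)
    lag = math.degrees(math.atan(w_unity / p1)) + math.degrees(math.atan(w_unity / p2))
    return 180.0 - lag


if __name__ == "__main__":
    # Hypothetical numbers: only the 0.9 pF load comes from the text.
    r_out = 10e3          # assumed OTA output resistance, ohms
    c_load = 0.9e-12      # loading capacitance, farads
    gm_fb = 2e-3          # assumed transconductance of the feedback device, siemens
    p_dominant = 1.0 / (r_out * c_load)
    p_nondominant = 20.0 * p_dominant  # assumed location of the pole at node Vc
    print(phase_margin_two_pole(gm_fb * r_out, p_dominant, p_nondominant))
```

A useful rule of thumb that this model reproduces is that placing the non-dominant pole near the loop gain-bandwidth product yields a phase margin slightly above 50°.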
C. Gain Enhancement and Transconductance Tuning Scheme
In this topology, the gain performance should be taken into account as well. The gain-enhancement stage is constrained by the requirement that the high-speed design have no internal nodes. Moreover, the small feature size, chosen for small parasitic capacitance, also degrades the gain. The negative-resistance circuit for gain enhancement, composed of transistors M15 to M18, is shown in Fig. 3(a). With the addition of the circuit in Fig. 3(a), a dc gain larger than 35 dB is achieved in the fabricated circuit. In addition, the channel-length modulation effect, which is a distortion source in the proposed circuit, can be minimized.
Since the required value of the negative resistance depends on the inverter output resistance, the device mismatch issue may degrade the gain of the OTA and also the pass-band gain of the filter. In addition to careful circuit layout, a trimmed forward bias voltage at the bulk of the positive feedback devices can further maintain the gain performance. Besides, the power supply rejection is proportional to the gain of the OTA, and the gain enhancement circuit provides a higher rejection performance.
Without internal nodes, the only remaining transconductance tuning nodes are the supply voltage and the bulks. In [6], the transconductance was tuned by adjusting the supply voltage. However, this method not only degrades the linearity when a fixed common-mode voltage is applied from the previous stage of the system, but also increases the complexity of the regulator because of the class-AB operation. In addition, the voltage supplied by the regulator must operate within a specific range for transconductance tuning. Since the ability to provide a large current and low noise must be maintained at the same time, a high-performance regulator is required. Thus, in this paper, the transconductance is tuned by adjusting the bulk voltages of both the pMOS and nMOS transistors in a deep-N-well CMOS process. Fig. 3(b) shows the bulk tuning circuitry. When the tuning control voltage is changed, the nMOS and pMOS bulk voltages are adjusted in opposite directions accordingly. The forward bias decreases the magnitudes of the threshold voltages, so the transconductance of the proposed OTA depends on the tuning control voltage. The forward-bias scheme has several advantages. First, the speed can be enhanced while the increase in power consumption is smaller than that caused by raising the supply voltage. Second, the variation of the threshold voltage becomes smaller because forward bias shrinks the depletion layers of the MOS transistors [7]. Therefore, the short-channel effect can be reduced. Finally, the overdrive voltage becomes larger under this condition, and the linearity of the OTA can be further improved.
However, latch-up and leakage current can become problems, and thus the forward bias must be limited to 0.5 V in the deep-N-well process [8]. Accordingly, the device aspect ratios of transistors MT7 and MT8 are designed according to the constraints in (4) and (5), which involve the low-field mobilities of the nMOS and pMOS transistors. Note that transistors MT7 and MT8 operate in the weak-inversion region when a smaller forward bias voltage is applied.
In simulation, the threshold voltage becomes larger at the slow corner, and thus the filter operates at a lower cutoff frequency. Since the tuning circuit tends to increase the speed of the filter, the forward bias is increased. Conversely, it is decreased at the fast corner. Therefore, a large threshold voltage implies a larger gate overdrive voltage for transistors MT7 and MT8, and a small threshold voltage implies a smaller gate overdrive voltage. To design this circuit, the device parameters should first meet the requirements of (4) and (5). Then, the corner conditions and temperature variations should be simulated. The gate-to-source voltages of transistors MT7 and MT8 must vary within a limited threshold range when the tuning control voltage is adjusted from ground to the supply voltage. To satisfy this condition, a large aspect ratio is chosen in this design.
In addition, the value of the negative resistance for gain enhancement could be tuned separately by applying another bulk tuning circuitry, and thus the tuning can be also achieved.
By applying the forward bias to the bulk terminals, a transconductance tuning range of 25% is obtained in this design. In this topology, the cutoff frequency of the filter is proportional to the unity-gain frequency of the OTA, and thus the filter tuning range has the same value. Process corner variation from the slow-slow to the fast-fast case and temperature variation between 20 °C and 100 °C are included in simulation. Simulation results show that the resulting deviation of the filter cutoff frequency can be covered by the tuning scheme. The transconductance in (1) implies that supply variation directly affects the filter cutoff frequency. From simulation, the tuning scheme can cover a 13% supply variation, and the supply can easily be kept within this range by a simple low-dropout regulator.
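The mechanism behind the bulk tuning can be illustrated with the textbook body-effect expression. The sketch below is a rough first-order picture only: the body-effect model, the parameter values (`vt0`, `gamma`, `two_phi_f`), and the assumption that the transconductance scales with the supply voltage minus both threshold magnitudes for matched devices are all assumptions made here, and the printed numbers are not expected to reproduce the 25% range reported above.

```python
import math

# Sketch: threshold shift under forward bulk bias and its effect on gm.
# Textbook body-effect model with hypothetical parameters; not the paper's data.


def threshold_voltage(v_fb, vt0=0.45, gamma=0.3, two_phi_f=0.8):
    """Threshold magnitude for a forward bulk-source bias v_fb (volts, v_fb >= 0).

    Uses Vt = Vt0 + gamma * (sqrt(2*phi_F - v_fb) - sqrt(2*phi_F)).
    """
    return vt0 + gamma * (math.sqrt(two_phi_f - v_fb) - math.sqrt(two_phi_f))


def relative_gm(v_fb, vdd=1.5):
    """gm relative to zero bias, assuming gm is proportional to VDD - Vtn - |Vtp|.

    Both thresholds are assumed to shift by the same amount (matched devices).
    """
    vt = threshold_voltage(v_fb)
    vt_zero = threshold_voltage(0.0)
    return (vdd - 2.0 * vt) / (vdd - 2.0 * vt_zero)


if __name__ == "__main__":
    # Sweep the forward bias up to the 0.5 V limit mentioned in the text.
    for v_fb in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(v_fb, threshold_voltage(v_fb), relative_gm(v_fb))
```

In this model the threshold magnitude falls and the relative transconductance rises monotonically as the forward bias increases toward the 0.5 V limit.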
D. Comparison With Nauta's Structure
In Nauta's structure [6], four inverters are used to maintain input common-mode rejection and enhance the output resistance. The output voltage is self-biased and depends on supply and process variation. In our circuit, four inverters are used in the CMFF circuit to maintain input common-mode rejection, and two additional inverters form the positive feedback loop. Note that these two inverters, which occupy only a small area, are required to compensate for the output conductance. The input common-mode rejection and the output conductance can therefore be controlled separately. The gain can easily be maintained as the device length is scaled down, at the cost of an additional 3% current consumption. The CMFB circuit is composed of one inverter. It maintains the output common-mode voltage, and thus the linearity can be further enhanced. Moreover, when the tuning circuit is taken into consideration, the proposed circuit consumes less power and area, because Nauta's structure requires a high-performance regulator whereas the inverter under forward bias in this work consumes less power.
III. FILTER ARCHITECTURE AND AUTOMATIC TUNING CIRCUIT
The architecture of the fourth-order equiripple linear-phase filter is shown in Fig. 4. The filter is designed as a cascade of two biquadratic filters. The LC ladder structure is chosen because of its low sensitivity. A constant group delay should be maintained to avoid detection problems in the frequency band where the spectral components of the signal are located. Therefore, an equiripple prototype is used for the filter. In this structure, each OTA has its own CMFF circuit, while the number of CMFB circuits is reduced because OTAs that share the same output nodes share one CMFB circuit. To check the stability of the biquadratic section, a current pulse is applied at the output nodes, and the 1% settling time is found to be less than 0.5 ns. The parasitic capacitors shift the cutoff frequency, and the effect becomes prominent in gigahertz applications. Therefore, the integration capacitance must be designed by taking the transistor gate capacitance, junction capacitance, and additional MIM capacitance into consideration.
Since the transconductance and the capacitance change with process and temperature variations, an automatic tuning scheme should be used to maintain the time constant in this filter design. Indirect tuning, which has lower complexity and smaller area than direct tuning, is used here. Fig. 5 shows the proposed master-slave tuning strategy, which is composed of OTAs, squarers, and a comparator. In the figure, one OTA is a replica of the OTA in the proposed filter, and the same load condition should be applied to it. A reference signal is applied at the reference frequency given by (6). With gmi defined as the transconductance of OTAi, the integrator output voltage is given by (7), where the unity-gain frequency of the integrator is given by (8). The scheme then uses magnitude detection, and the error current signal is generated from the difference of the two magnitudes in (9) and (10).
Finally, a subsequent low-pass filter removes the high-frequency components. When OTA1 and OTA2 have a given transconductance ratio, the frequency of the reference signal can be relaxed by the same ratio.
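The magnitude-locked tuning idea can be illustrated with a behavioral sketch. It assumes an ideal gm-C integrator whose gain magnitude at the reference frequency is gm / (2 pi f_ref C), squarers that compare squared magnitudes, and a loop filter abstracted as a small multiplicative correction per iteration. The parameter `target_gain` is a hypothetical stand-in for the effect of the OTA1/OTA2 transconductance ratio; how that ratio enters the actual circuit, and the forms of (6)-(8), are not modeled here. All numeric values are illustrative.

```python
import math

# Sketch: discrete-time behavioral model of a magnitude-locked gm tuning loop.
# Loop structure, names, and values are illustrative assumptions.


def integrator_gain(gm, cap, f_ref):
    """Gain magnitude of an ideal gm-C integrator at the reference frequency."""
    return gm / (2.0 * math.pi * f_ref * cap)


def tune_gm(f_ref, cap, gm_init, target_gain=1.0, mu=0.2, steps=200):
    """Adjust gm until the integrator gain magnitude at f_ref equals target_gain."""
    gm = gm_init
    for _ in range(steps):
        # Squarers: normalized difference of the two squared magnitudes.
        error = 1.0 - (integrator_gain(gm, cap, f_ref) / target_gain) ** 2
        # Loop filter abstracted as a small multiplicative update of gm.
        gm *= 1.0 + mu * error
    return gm


F_REF = 250e6   # hypothetical reference frequency, hertz
CAP = 1e-12     # hypothetical integration capacitance, farads
RATIO = 4.0     # hypothetical stand-in for the transconductance ratio

if __name__ == "__main__":
    gm_locked = tune_gm(F_REF, CAP, gm_init=1e-3, target_gain=RATIO)
    f_unity = gm_locked / (2.0 * math.pi * CAP)
    print(gm_locked, f_unity)
```

In this model the loop settles where the integrator unity-gain frequency equals `RATIO` times the reference frequency, which illustrates how a reference well below the target frequency could still set the time constant.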
When the speed of the filter increases to the gigahertz range, deviation of the filter cutoff frequency occurs owing to limited circuit precision. However, since the proposed tuning circuit can operate at a lower speed for the same target cutoff frequency, a high-precision squarer and accurate tuning operation can easily be obtained. Therefore, the reference frequency can be selected flexibly, rather than being restricted to the filter cutoff frequency as in [9]. In addition, a low-frequency reference signal can be used to relax the high-speed requirement of the tuning circuitry for our high-speed filter, and process corner variation does not greatly affect the transconductance ratio. The control signal for the correct transconductance is also applied to the slave integrator, which matches the proposed master filter.
IV. MEASUREMENT RESULTS
The filter was fabricated in the TSMC 0.18-μm CMOS process and measured with a 1.5-V supply voltage. On-chip input and output buffers are used for the high-frequency measurement. Fig. 6 shows the measured magnitude response of the proposed filter. The cutoff frequency is set to 1 GHz by the automatic tuning circuit. The transfer curve is accurate when the operating frequency is less than 2 GHz. At higher frequencies, the PCB, package, and bonding wires affect the filter transfer function, and thus the curve deviates. Fig. 7 shows the measured group delay characteristics. The deviation of the group delay is within 200 ps up to 1.5 fc. Fig. 8 shows an IM3 of -43 dB at the filter cutoff frequency with 4 dBm two-tone signals at 0.995 and 1.005 GHz. The measured CMRR of the filter is 32 dB, and a dynamic range of 39 dB is measured at -43 dB IM3. The measured noise spectral density is -147 dBm/Hz at the filter cutoff frequency. The noise comes from the combination of the filter circuit, output buffer, and PCB. Integrating the input-referred noise over the filter operating frequency range gives a signal-to-noise ratio of about 40 dB at an input signal level corresponding to -40 dB IM3. The filter and the automatic tuning circuit together dissipate 175 mW. The chip micrograph, with an active area of 1 mm², is shown in Fig. 9.
To compare the proposed filter with previous work, the figure of merit (FOM) defined in [10] is used. The FOM, which takes the boosting factor, the filter speed, the technology feature size, and the power per pole into consideration, is given by (11). The boosting factor is assumed to be 1 for a structure without boosting and 1.5 if the reported filter has boosting. The filter results are compared with previous realizations in Table I, and the FOM shows the high-performance operation of the proposed filter.
V. CONCLUSION
In this paper, a high-speed OTA based on the inverter structure is realized. The combined CMFF and CMFB circuit ensures input/output common-mode stability. The gain is maintained by adding an equivalent negative-resistance circuit at the output nodes. The forward-bias scheme not only solves the problem of transconductance tuning but also improves the circuit linearity. This is the first time this tuning scheme has been presented for filter signal processing. The OTA is used to design a 1 GHz fourth-order equiripple linear-phase filter. An automatic tuning circuit that relaxes the need for high-speed operation of the squarers and comparators is introduced. The theoretical properties of the proposed filter are experimentally verified.
