Multiplier and divider circuits are usually required in analog signal processing and in parallel-computing neural or fuzzy systems. In particular, this paper focuses on the hardware implementation of fuzzy controllers, where the divider circuit is usually the bottleneck. Multiplier/divider circuits can be implemented with a combination of A/D and D/A converters. An efficient design based on current-mode data converters is presented herein. Continuous-time algorithmic converters are chosen to reduce the control circuitry and to obtain a modular design based on a cascade of bit cells. Several circuit structures to implement these cells are presented and discussed. The selected structure offers a better speed/power trade-off than others previously reported in the literature while maintaining a low area occupation. The resulting multiplier/divider circuit operates at low voltage, provides the division result in both analog and digital formats, and is suitable for applications of low or middle resolution (up to 9 bits), such as fuzzy controllers. The analysis is illustrated with Hspice simulations and experimental results from a CMOS multiplier/divider prototype with 5-bit resolution. Experimental results from a CMOS current-mode fuzzy controller chip that contains the proposed design are also included.
chips described in [1] [2]. These structures can be realized with CMOS technology by using MOS transistors working in the subthreshold region. However, the low current levels make them operate slowly [10]. Low operating speed is also the main limitation of multiplier/dividers based on the time-division technique [11]. The MOS translinear principle is another approach. In this case, the main limitation is a low resolution because the performance of the resulting circuits is very sensitive to deviations from the simple square-law model of the transistors in saturation, caused by channel-length modulation, mobility reduction or mismatching [3] [12] [13]. Another widely employed technique is to invert the behavior of a multiplier by using local feedback or by inserting it in the feedback path of an inverting amplifier. Performance of these multiplier/dividers depends primarily on the performance of the multiplier and the amplifier employed. Precision is mainly limited by offsets associated with the input and output variables [3] [14] [15]. The analog fuzzy chips reported in [4] [5] [6] [7] contain this type of multiplier/divider.
An alternative solution is the successive-approximation technique that employs two accumulators, as shown in Figure 1 [16]. The master accumulator approximates the value of the input signal, $x_{num}$, after $N$ steps:
$$x_{num} \cong x_{den} \sum_{i=1}^{N} \frac{b_i}{2^i} \qquad (1)$$
where $x_{den}$ is a reference signal and $b_i$ is a digital output whose value is "1" ("0") if the output of the master accumulator at the i-th step is smaller (bigger) than $x_{num}$.
The slave accumulator takes the set $\{b_i\}$ and another reference signal, $x_p$, as inputs and provides the following output:
$$x_{out} = x_p \sum_{i=1}^{N} \frac{b_i}{2^i} \qquad (2)$$
Solving equations (1) and (2), it follows that:
$$x_{out} \cong x_p \cdot \frac{x_{num}}{x_{den}} \qquad (3)$$
with a quantization error of $\pm x_{den}/2^{N+1}$.
Hence, the output $\{b_i\}$ is the digital code of the division $x_{num}/x_{den}$ while $x_{out}$ is the discrete analog signal that represents the multiplication/division $x_p \cdot x_{num}/x_{den}$.
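The two-accumulator scheme described above can be sketched behaviorally in a few lines of Python. This is only an illustrative model under stated assumptions, not the circuit of Figure 1: the function name `successive_approximation` is hypothetical, and a tie between the master accumulator and $x_{num}$ is assumed to produce a "1".

```python
def successive_approximation(x_num, x_den, x_p, n_bits):
    """Behavioral model of the master/slave accumulators (illustrative only)."""
    master = 0.0  # running approximation of x_num, built from x_den / 2**i
    slave = 0.0   # running value of x_out, built from x_p / 2**i
    bits = []
    for i in range(1, n_bits + 1):
        trial = master + x_den / 2**i
        # Assumption: a tie (trial == x_num) is resolved as bit "1".
        bit = 1 if trial <= x_num else 0
        if bit:
            master = trial
            slave += x_p / 2**i
        bits.append(bit)
    return bits, slave


bits, x_out = successive_approximation(x_num=5.3, x_den=8.0, x_p=16.0, n_bits=5)
print(bits, x_out)  # the exact quotient would give 16 * 5.3 / 8 = 10.6
```

Both accumulators are updated in the same loop iteration here, so the digital code and the analog value are available after a single pass of N steps.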
This technique has been employed in [17] to implement a voltage-mode multiplier/divider circuit, using A/D and D/A switched-capacitor (SC) cyclic converters as master and slave accumulators, respectively. In that proposal, the analog output is provided after two stages. The first of them consists of N steps in which the N bits of the digital output are obtained. The digital code is converted to analog format during the next stage, which again consists of N steps. SC successive-approximation multiplier/dividers have been employed in the fuzzy controllers described in [8] [9].
The CMOS multiplier/divider circuit proposed in this paper implements the successive-approximation technique with current-mode algorithmic converters that occupy a smaller area and can work at lower supply voltages than their voltage-mode counterparts. The A/D and D/A converters employed are coupled in the sense that they begin conversion with the same bit (the most significant bit, MSB), so that the analog and digital outputs are provided after only one stage.
Besides, the response time can be smaller than N times the duration of the slowest operation because continuous-time data converters are employed. Several circuit structures to design these current-mode continuous-time algorithmic converters are proposed in this paper, following the work in [20] [21] [22] [23] [24]. They are discussed and compared in terms of static and dynamic features in Section II. The most efficient of these structures enables realization of small and fast multiplier/divider circuits suitable for applications of low or middle resolution (below 9 bits, which is higher than the resolution obtained by many of the proposals discussed above). Performance of the proposed multiplier/divider circuit is illustrated with Hspice simulations and has been verified with a 2.4-µm CMOS prototype with 5-bit resolution. These results are included in Section III. Section IV illustrates with experimental results the application of the proposed design to implement the divider in a current-mode CMOS fuzzy controller chip. Finally, Section V provides some conclusions of the work.
II.-Circuit design.
Among the different designs of current-mode converters, algorithmic types are widely used because they are very simple and modular. If switched-current techniques are employed, an iterative cyclic or a pipeline design can be implemented. In the first case, a single cell performs the conversion of all the N bits sequentially, so that the response time is N clock cycles and one conversion is completed every N clock cycles. In the second case, an extreme realization is to employ a cascade of N bit cells that provide a bit every clock cycle, so that the response time is also N clock cycles but a new result is delivered every clock cycle [23] [24]. If continuous-time techniques are employed, the data converter also consists of a cascade of N bit cells, but no control circuitry is required to govern the signal transmission from one cell to another [20]. Besides, the response time can be smaller than N times the duration of the slowest operation. A disadvantage is that the converters are sensitive to mismatching among the cells, so that resolution is typically limited to 9 bits [25].
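The timing difference between the two switched-current options can be made concrete with a small cycle count. The sketch below only restates the figures given in the paragraph above (N cycles per result for the cyclic cell; N cycles of latency and then one result per cycle for the pipeline). The function names are hypothetical, and the continuous-time case is not modeled because it has no clock.

```python
def cyclic_cycles(n_bits, n_samples):
    """Clock cycles needed by a single cyclic cell to convert n_samples inputs."""
    # One cell resolves all n_bits of a sample before accepting the next one.
    return n_bits * n_samples


def pipeline_cycles(n_bits, n_samples):
    """Clock cycles needed by a cascade of n_bits cells operated as a pipeline."""
    # The first result appears after n_bits cycles; each later one after 1 more.
    return n_bits + (n_samples - 1)


print(cyclic_cycles(5, 100), pipeline_cycles(5, 100))
```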
Considering applications of low or middle resolution like IC realizations of fuzzy controllers, we have selected continuous-time algorithmic converters to reduce the control circuitry and the response time. The resulting multiplier/divider circuit has the modular structure shown in Figure 2.
Each bit cell contains an A/D and a D/A bit cell that provide, respectively, a bit of the digital output, $\{b_i\}$, and a contribution to the analog output, $I_{out}$. As described in Section I, the output $\{b_i\}$ is the digital code of the current-mode division $I_{num}/I_{den}$ while $I_{out}$ is the discrete current that represents the multiplication/division $I_p \cdot I_{num}/I_{den}$.
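The modular structure can be mirrored in software as a cascade of identical cell objects, each returning one bit, the current passed to the next cell, and its contribution to the analog output. This is a behavioral sketch only: the class and function names are hypothetical, currents are plain numbers in µA, and the per-cell rule (subtract $I_{den}/2^i$ when the bit is "1") is assumed from the dividing conversion discussed in this section.

```python
from dataclasses import dataclass


@dataclass
class BitCell:
    index: int  # 1 corresponds to the MSB cell

    def convert(self, i_in, i_den, i_p):
        """Return (bit, current for the next cell, contribution to I_out)."""
        ref = i_den / 2**self.index
        bit = 1 if i_in >= ref else 0            # A/D part of the cell
        i_next = i_in - bit * ref                # current handed to the next cell
        i_contrib = bit * i_p / 2**self.index    # D/A part of the cell
        return bit, i_next, i_contrib


def multiplier_divider(i_num, i_den, i_p, n_bits):
    """Cascade n_bits cells; returns the digital code and the analog output."""
    bits, i_out, current = [], 0.0, i_num
    for cell in (BitCell(i) for i in range(1, n_bits + 1)):
        bit, current, contrib = cell.convert(current, i_den, i_p)
        bits.append(bit)
        i_out += contrib
    return bits, i_out


# Illustrative values in µA: 10/16 = 0.625, so I_out should be 0.625 * 16 = 10.
bits, i_out = multiplier_divider(i_num=10.0, i_den=16.0, i_p=16.0, n_bits=5)
print(bits, i_out)
```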
II.a Design considerations for the A/D bit cells.
Most reported algorithmic A/D converters [18] [19] [20] [21] [22] implement the multiplying algorithmic conversion. However, the dividing algorithmic conversion (illustrated in Figure 3) is more suitable for a divider circuit in order to have $I_{den}$ as the full-scale current of the A/D converter (master accumulator) and $I_{den}/2^N$ as the quantization step (refer to Equation (1)). In this case, the input range for $I_{num}$ is $I_{num} \le I_{den}$.
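To illustrate the difference between the two conversion styles, the sketch below implements a textbook multiplying conversion (the residue is doubled and compared with a fixed reference) next to a dividing conversion (the residue is kept and the reference is halved at each cell). Both function names are hypothetical and the exact flow chart of Figure 3 is assumed rather than reproduced. Exact rational arithmetic is used so that the comparison of the two codes is not affected by rounding.

```python
from fractions import Fraction


def dividing_adc(i_num, i_den, n_bits):
    """Dividing conversion: the reference I_den / 2**i shrinks cell by cell."""
    bits, current = [], i_num
    for i in range(1, n_bits + 1):
        ref = i_den / 2**i
        bit = 1 if current >= ref else 0
        current -= bit * ref
        bits.append(bit)
    return bits


def multiplying_adc(i_num, i_ref, n_bits):
    """Multiplying conversion: the residue is doubled, the reference is fixed."""
    bits, current = [], i_num
    for _ in range(n_bits):
        current *= 2
        bit = 1 if current >= i_ref else 0
        current -= bit * i_ref
        bits.append(bit)
    return bits


x = Fraction(5, 8)
print(dividing_adc(x, Fraction(1), 5), multiplying_adc(x, Fraction(1), 5))
```

In this model both styles produce the same code. The dividing form never amplifies the residue, and its smallest reference is $I_{den}/2^N$, the quantization step mentioned above.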
Using a current-mode approach, the operations required by an algorithmic A/D conversion are performed by current mirrors and current comparators. Regarding accuracy, the limit is imposed by mismatching in the current mirrors due to systematic and random errors. Random errors are caused mainly by threshold voltage variations between ideally identical transistors, and can be reduced by employing large devices (with a typical limit of 9 bits) [25] . Systematic errors may decrease this potential resolution. On one hand, they are caused by a difference between the input and output voltages of the current mirrors, and on the other, by their finite output resistance. Regarding operation speed, this is limited by the transient response of the current mirrors and by the settling time of the current comparators.
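The effect of a mirror gain error on the digital code can be seen with a simple behavioral model. The sketch below is an assumption-laden illustration, not a model of the actual cells: the function name is hypothetical, the mismatch is lumped into a single gain factor on each replicated reference $I_{den}/2^i$, and the 10% error is deliberately exaggerated to make the code change visible at 5 bits.

```python
def dividing_adc_with_mismatch(i_num, i_den, gains):
    """Dividing conversion where cell i sees gains[i-1] * I_den / 2**i."""
    bits, current = [], i_num
    for i, gain in enumerate(gains, start=1):
        ref = gain * i_den / 2**i  # replicated reference with a gain error
        bit = 1 if current >= ref else 0
        current -= bit * ref
        bits.append(bit)
    return bits


# Input slightly above half of full scale (I_den normalized to 1).
ideal = dividing_adc_with_mismatch(0.52, 1.0, [1.0] * 5)
skewed = dividing_adc_with_mismatch(0.52, 1.0, [1.1, 1.0, 1.0, 1.0, 1.0])
print(ideal, skewed)
```

With the error placed in the MSB cell, the code for the same input moves from 10000 to 01111, which shows why the matching of the most significant cells dominates the achievable resolution in this kind of model.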
In the following, we propose and discuss three structures of A/D bit cells that combine different circuit techniques to improve resolution and speed. The starting point is the dividing version of the original multiplying-type A/D bit cell proposed in [20]. This cell, which will be named type_0 cell, is illustrated in Figure 4. Its resolution is limited mainly by the systematic errors that appear at the subtraction operation, $R(I_i) - R(I_{den}/2^i)$, where $R(\cdot)$ represents the replication operation implemented by a current mirror. To reduce this problem, which is caused by the high voltage swing at node x, the proposal in [20] is to use active current mirrors, at the cost of more area and power consumption. On the other hand, the speed of this type_0 cell is limited mainly by the response time of the current comparator, which is based on a cascade of inverters like that described in [26].
Type 1 cells:
The first modification we propose to improve the performance of the type_0 cell is the type_1 cell, which is illustrated in Figure 5. It implements the flow chart of Figure 3, so that its output current $I_{oi}$ is given by:
$$I_{oi} = R(I_i) - b_i \, R(I_{den}/2^i) \qquad (4)$$
where the bit $b_i$ is obtained from the comparison of $R(I_i)$ with $R(I_{den}/2^i)$.
The most important systematic errors, which appear at the subtraction operation, $R(I_i) - R(I_{den}/2^i)$, are reduced with the regulated output stage of the mirrors that replicate $I_i$ and $I_{den}/2^i$. This output stage is also exploited as a fast current comparator like that proposed in [27] [28].
Since the input to this comparator is capacitive like in the type_0 cells, the offset is virtually zero and resolution is not degraded. Hence, an A/D converter based on these cells can occupy less area than another based on type_0 cells for a given resolution and range of operation.
Another advantage of type_1 cells is that their speed is higher since the comparator employed is faster. The price is higher power consumption, because one of the inverters in the current comparator acts as an amplifier and, consequently, the quiescent current through it is not null. As usual, the higher the quiescent current, the higher the speed. The fast response of the comparator and the capacitive coupling between nodes z and x may cause glitches in the digital output.
Let us analyze this problem in the bit cell shown in Figure 5. When $b_i$ is "1", the pmos transistor $M_p$ that acts as a switch is off, the voltage at node x is high ($M_3$ is off) and the voltage at node y is low. When $b_i$ changes to "0" ($R(I_i) < R(I_{den}/2^i)$), the pmos switch begins to conduct and the voltage at the high-impedance node x decreases rapidly, following the voltage at the low-impedance node y. The drain-to-gate capacitance, $C_{gd}$, of transistor $M_3$ and the gate-to-source capacitance, $C_{gs}$, of transistors $M_1$, $M_2$, and $M_3$ cause a capacitive coupling between nodes z and x so that:
The voltage decrease induced at node z can make $R(I_i)$ momentarily exceed $R(I_{den}/2^i)$. Hence, $b_i$ can return to "1" if the comparator is too fast. In this sense, the type_0 cells can be seen as a conservative design because they employ slow comparators.
Glitches have to be reduced if high-speed operation is required. A solution is to include the technique to dividing-algorithmic converters. The resulting flow chart and, consequently, the resulting bit cells, which will be named type_2 cells, are simpler than those obtained for multiplying converters, as can be seen in Figure 7. The schematic of this type_2 cell, which also employs a fast current comparator, is shown in Figure 8. Its output current $I_{oi}$ is given by:
where the bit $b_i$ is obtained by comparing $R(I_i)$ with $R(I_{den}/2^i) + R(I_b)$.
These cells have to be designed more carefully than type_1 cells to achieve the same resolution. The reason is that the bits $b_i$ are obtained by comparing currents that undergo more replications and, hence, are susceptible to more errors.
The most significant glitches of these cells appear when the nmos transistor that acts as a switch ($M_n$ in Figure 8) begins to conduct. However, they are less severe than in the unimproved type_1 cells because the difference between the voltages at nodes x and y when the switch is open is not so high. Consequently, the operation speed of these cells can be similar (slightly inferior) to that of type_1 cells.
Type 3 cells:
Another way to increase the speed of the current mirrors in the bit cells is to always ensure a nonzero current through them by modifying the conversion algorithm. The flow chart illustrated in Figure 9a is the proposal in [22] for multiplying A/D converters. Applying this idea to a dividing A/D converter, we have obtained a simpler flow chart, which is shown in Figure 9b. It ensures that the output current $I_{oi}$ of each cell is always greater than or equal to $I_{den} - I_{den}/2^i$. Figure 10 illustrates the type_3 cell that implements this algorithm. The expression of $I_{oi}$ is the following:
where $b_i$ results from comparing $R(I_i)$ with $R(I_{den}) - R(I_{den}/2^i)$.
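The two properties stated for the type_3 algorithm (the comparison threshold $I_{den} - I_{den}/2^i$ and the lower bound on $I_{oi}$) can be checked with a behavioral sketch. The update rule used below, $I_{oi} = I_i + (1 - b_i)\,I_{den}/2^i$, is an assumption: it is one rule consistent with those two properties, not necessarily the expression used by the authors. The function name is hypothetical, and exact rational arithmetic is used to avoid rounding at the thresholds.

```python
from fractions import Fraction


def type3_adc(i_num, i_den, n_bits):
    """Dividing conversion that keeps every cell current away from zero."""
    bits, outputs, current = [], [], i_num
    for i in range(1, n_bits + 1):
        step = i_den / 2**i
        # Threshold stated in the text: I_den - I_den / 2**i.
        bit = 1 if current >= i_den - step else 0
        # Assumed update: add the step when the bit is "0" instead of
        # subtracting it when the bit is "1".
        current = current + (1 - bit) * step
        bits.append(bit)
        outputs.append(current)
    return bits, outputs


bits, outputs = type3_adc(Fraction(10), Fraction(16), 5)
print(bits, [float(o) for o in outputs])
```

Under this assumption the code is identical to that of the plain dividing conversion, while every cell output stays at or above $I_{den} - I_{den}/2^i$, so the mirrors never have to recover from a zero current.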
The additional transistor drawn with grey lines in Figure 10 is included to reduce glitches, which can be severe in these cells because the currents involved are high.
the digital word that represents the quotient $I_{num}/I_{den}$ is calculated. When S is equal to "1" (H is "0"), the bits $\{b_i\}$ that control the D/A part are updated at the same time. The bits $\{b_i\}$ do not change when H is "1", thus maintaining both the analog ($I_{out}$) and the digital ($\{b_i\}$) outputs while performing a new division.
The longest delay of the D/A converter appears when the digital code changes from 00...0 to 11...1 because all the nodes a (in Figure 12) have to be charged starting from $V_{ss}$. The typical solution to reduce this delay is to replace the single switches by differential switches that always offer a conducting path for the current [29] [30] [31] [32] [33]. In our case, we have introduced the transistor $M_a$ shown with grey lines in Figure 12. It reduces the voltage swing at node a because, when $b_i$ is "0", the voltage at a is not $V_{ss}$ but a higher value close to a reference voltage, $V_{ref}$ (which is $V_{dd}$ in Figure 13). Figure 14 shows the Hspice transient simulation of a 5-bit D/A converter when the digital code changes from 00000 to 11111 (the reference current $I_p$ is 16 µA). The grey and black lines correspond, respectively, to converters that use simple and differential switches.
When $b_i$ increases from "0" to "1", the additional pmos transistor ($M_a$ in Figure 12
The control output is obtained from an average in which each consequent value, $c_i$, is weighted by the activation degree, $h_i$, of its corresponding rule:
When implementing these fuzzy controllers as VLSI circuits, the signals that represent the numerator and denominator of the above expression are usually currents because currents can be added by simply connecting wires. The proposed multiplier/divider circuit is very well suited to implement equation (8) 
The current $I_o$ is the control output in analog format. Hence, the proposed multiplier/divider circuit enables IC realizations of fuzzy controllers that can interact directly with analog actuators and subsequent digital processing systems.
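As a hedged illustration of how a divider of this kind fits the weighted average described above, the sketch below forms the numerator $\sum_i h_i c_i$ and the denominator $\sum_i h_i$, quantizes their ratio with an N-bit dividing conversion, and scales the code by a reference current. All names are hypothetical, the consequents are assumed to be normalized to [0, 1] so that the numerator never exceeds the denominator, and currents are plain numbers in µA. How the chip actually forms these signals is not modeled.

```python
def quantized_ratio(i_num, i_den, n_bits):
    """N-bit code of i_num / i_den obtained with a dividing conversion."""
    code, current = 0, i_num
    for i in range(1, n_bits + 1):
        ref = i_den / 2**i
        if current >= ref:
            current -= ref
            code += 2 ** (n_bits - i)
    return code


def defuzzify(activations, consequents, i_p, n_bits=5):
    """Weighted average of the consequents; returns (digital code, analog output)."""
    i_num = sum(h * c for h, c in zip(activations, consequents))  # sum of h_i * c_i
    i_den = sum(activations)                                      # sum of h_i
    code = quantized_ratio(i_num, i_den, n_bits)
    return code, i_p * code / 2**n_bits


# Two active rules: (0.25 * 2 + 0.75 * 6) / (2 + 6) = 0.625.
code, i_o = defuzzify(activations=[2.0, 6.0], consequents=[0.25, 0.75], i_p=16.0)
print(code, i_o)
```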
The 5-bit A/D-D/A circuit described in Section III was included as a divider (hence, as the output block) in a two-input one-output fuzzy controller that was integrated in a 2.4-µm CMOS technology [34]. This fuzzy controller chip accepts inputs represented by voltages or currents and provides the control output in analog and digital format. Figure 19a shows the control output in analog format provided by the D/A part of the divider when sweeping one input variable and changing the other input as shown in Figure 19b.
The response time was shown to be less than 2 µs, which means an inference speed above 0.5 MFLIPS (Mega Fuzzy Logic Inferences Per Second). This is a high inference speed considering that the multiplier/divider circuit was not optimized for speed, as commented in Section III.
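The conversion from response time to inference speed is a one-line calculation, shown here as a quick check. It assumes one inference is completed per response time, with no pipelining. The function name is hypothetical.

```python
def inference_speed_mflips(response_time_ns):
    """Inference speed in MFLIPS for a given response time in nanoseconds."""
    # 1 / (t_ns * 1e-9 s) inferences per second, divided by 1e6 to get MFLIPS.
    return 1e3 / response_time_ns


print(inference_speed_mflips(2000))  # 2 µs response time
```

A response time below 2 µs therefore corresponds to a speed above the value printed for 2000 ns.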
V.-Conclusions.
A current-mode multiplier/divider based on the successive-approximation technique has been proposed and its static and dynamic behavior has been analyzed. It employs continuous-time algorithmic data converters whose bit cells offer a better speed/power trade-off than others previously reported while maintaining a low area occupation. Simulation and experimental results from a 2.4-µm CMOS prototype of 5-bit resolution verify these features: small silicon area (0.077 mm²), capability of working at a low supply voltage (3 V), and high speed (response time of a few hundred nanoseconds for a power consumption below one milliwatt). The proposal is suitable for applications of low or middle resolution (up to 9 bits) and offers the flexibility of providing the division value in both digital and analog formats. In particular, it has been applied to implement the output divider block of a fuzzy controller chip that can interact directly with analog actuators and digital processing environments.
