Abstract: Mismatch-shaping digital-to-analog converters (DACs) have become widely used in high-performance delta-sigma data converters because they facilitate delta-sigma modulators with multibit quantization. Relative to single-bit quantization, multibit quantization significantly relaxes the analog circuit performance necessary to achieve a given level of data converter precision, but significant digital logic is required to perform the mismatch shaping. In modern very-large-scale integration (VLSI) processes optimized for digital circuitry, this tends to be a good tradeoff in terms of both area and power consumption. It is nonetheless desirable to minimize the digital complexity as much as possible. Moreover, in delta-sigma analog-to-digital converters (ADCs), the mismatch-shaping logic is in the feedback path of the delta-sigma modulator, so it is essential to maintain a sufficiently small propagation delay through the mismatch-shaping logic. This paper presents and analyzes several variations of the switching blocks within a tree-structured mismatch-shaping DAC that result in the most hardware-efficient first-order and second-order mismatch-shaping DAC implementations yet known to the authors. The variations presented allow designers to trade off complexity for propagation-delay reduction so as to tailor designs to specific applications.
I. INTRODUCTION

In delta-sigma data converters, both analog-to-digital converters (ADCs) and digital-to-analog converters (DACs), coarse quantization is used in conjunction with quantization-noise shaping and filtering to achieve high-precision data conversion. In both cases, coarse DACs are required. Unlike the error introduced by the coarse quantization, the error introduced by at least one of the coarse DACs in a data converter is not attenuated inside the data converter's signal band. In switched-capacitor implementations, most of the DAC error arises from static capacitor mismatches, which give rise to step-size mismatches in the multibit DAC. The resulting step-size mismatches are memoryless functions of the DAC input, so the DAC can be viewed as an ideal DAC followed by a memoryless nonlinear function. The nonlinearity tends to fold out-of-band quantization noise into the signal band, thereby limiting the overall accuracy of the data converter.
To avoid this problem, many data converters employ 1-bit quantization. With 1-bit quantization, the coarse DAC is implemented by a 1-bit DAC. Since a 1-bit DAC only generates two levels, it only has one step, and so it is inherently linear. However, with 1-bit quantization in the modulator, quantization-noise shaping must be limited to maintain the modulator's stability. Additionally, the power of the quantization noise in the 1-bit modulator exceeds that of its input, so data converters with 1-bit quantization are extremely sensitive to any nonlinearity or timing error, such as op-amp slewing or clock jitter, which can fold this quantization noise into the signal band.
To avoid these problems, multibit mismatch-shaping DACs have been developed [1]-[52]. In these DACs, digital logic is used to scramble the DAC capacitor or current-source connections in such a fashion that the error introduced by the device mismatches, referred to as DAC noise, is suppressed within the data converter's signal band. For low-pass mismatch-shaping DACs, the DAC noise is suppressed near dc so that its power spectral density (PSD) is shaped like the magnitude response of a first-order or, in some cases, second-order high-pass filter. The five main classes of mismatch-shaping DACs are individual-level averaging (ILA) [11], [12], vector feedback [13]-[16], data-weighted averaging (DWA) [17]-[31], butterfly shuffler [32]-[37], and tree structured [38]-[44]. The criteria used to compare these DACs include complexity, propagation delay, spurious-tone avoidance, and the order, or degree, of DAC noise suppression.
In [40], a tree-structured mismatch-shaping DAC is introduced that has led to the most efficient implementations of dithered first-order and second-order mismatch-shaping DACs known to the authors [43], [44]. Moreover, the first-order tree-structured DAC is the only one for which dither is known to completely eliminate spurious tones in its DAC noise [46]. This paper furthers the development of this DAC by presenting new implementations of its digital logic that are more hardware efficient and have less propagation delay than those presented in [40]. The digital logic is first partitioned into functional blocks, one of which determines the shape of the DAC noise's PSD and another of which is responsible for the digital logic's propagation delay. The hardware for the digital logic is presented through interchangeable variations of these functional blocks so that the DAC can be tailored to meet varying specifications for signal-band DAC noise power, propagation delay, and complexity. Efficient first-order and second-order mismatch-shaping logic blocks are presented, and the resulting DAC noise from each is analyzed to show that it has the desired spectral shape. Additionally, medium-speed and high-speed implementations of the DAC are presented that offer a tradeoff between propagation delay and complexity.
This paper is divided into six sections. Section II reviews the tree-structured DAC and presents the functional partitioning of its digital logic. Additionally, this section presents an example application of a 5-bit second-order ADC modulator that is used throughout the paper to illustrate the DAC performance and complexity. Section III presents and analyzes the first-order and second-order mismatch-shaping logic, while Section IV presents the medium-speed and high-speed implementations of the DAC. Section V presents a hardware comparison between the different tree-structured DAC implementations and other mismatch-shaping DACs presented in the literature.
II. THE TREE-STRUCTURED DAC
A. The Modulator Application
The 5-bit ADC modulator presented in [43] is shown in Fig. 1. It consists of two delayed switched-capacitor integrators, a 33-level flash ADC, and two 33-level DACs. As shown in Figs. 1 and 2, each 33-level DAC consists of a bank of 32 DAC elements and a shared digital encoder whose 32 outputs are 1-bit sequences. Each DAC element can be viewed as a 1-bit DAC whose analog output is a charge packet applied to the summing node of an integrator. A DAC element is said to be "selected high" when its input is high; otherwise, it is said to be "selected low." For convenience, the output of the ADC is interpreted as an integer between 0 and 32. For each ADC output sample, the digital encoder chooses which of the DAC elements to select high and which to select low. In other words, if each encoder output is interpreted numerically as one when high and zero when low, the DAC encoder ensures that the sum of its 32 outputs equals the ADC output. Mismatches among the capacitor values of the DAC elements cause the output of each multibit DAC to be a nonlinear function of its input. The resulting nonlinear error is represented, without approximation, as an additive noise source referred to as DAC noise. As shown in Fig. 1, an output from one of the DACs is added to the modulator's input. Thus, the modulator does not attenuate any of the signal-band noise power from this DAC. However, the digital encoder can select the DAC elements such that most of the DAC noise power resides outside of the signal band.
To demonstrate the improvements that are realized by mismatch shaping, the DAC presented in [44] was tested with and without mismatch shaping. The input for each test was a 1.5 kHz, dB (relative to full scale) sinusoid. With mismatch shaping, the resultant signal-to-noise-and-distortion ratio (SINAD) was 100 dB, whereas without mismatch shaping, the resultant SINAD was 64 dB. In general, the tradeoff for the improved performance is the additional hardware and propagation delay incurred by the digital encoder. However, the propagation delay of the digital encoder only affects the design of high-speed data converters. Examples of commercially available data converters that employ mismatch-shaping DACs to a similar advantage are presented in [47] - [52] . 
B. The Tree-Structured Digital Encoder
The architecture for a 33-level, tree-structured digital encoder is shown in Fig. 3. The nodes of this digital encoder are called switching blocks. Each switching block is labeled $S_{k,r}$, where $k$ and $r$ represent the switching block's layer number and position within the layer, respectively. Each switching block has a single input, which is denoted $x_{k,r}[n]$, and two outputs. If the $r$th digital encoder output sequence is also denoted $x_{0,r}[n]$, then the switching blocks are interconnected such that the top output of $S_{k,r}$ is $x_{k-1,2r-1}[n]$ and the bottom output is $x_{k-1,2r}[n]$. The switching sequence $s_{k,r}[n]$ is defined as the difference between the top and bottom output sequences of $S_{k,r}$:
$$s_{k,r}[n] = x_{k-1,2r-1}[n] - x_{k-1,2r}[n]. \qquad (1)$$
Fig. 4 illustrates the input and output sequences of $S_{k,r}$ along with the relationship between its switching sequence and output sequences.
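As shown in [40], the DAC noise is a linear combination of the switching sequences. In general, for a DAC of the type shown in Fig. 3 with $2^b$ DAC elements, the output $y[n]$ can be written in terms of the DAC input $x[n]$ as
$$y[n] = \alpha x[n] + \beta + e[n], \qquad (2)$$
where the DAC noise is
$$e[n] = \sum_{k=1}^{b} \sum_{r=1}^{2^{b-k}} \Delta_{k,r}\, s_{k,r}[n], \qquad (3)$$
$s_{k,r}[n]$ is the switching sequence of the $r$th switching block in layer $k$, and $\alpha$, $\beta$, and $\Delta_{k,r}$ are constants that are functions of the inevitable, static errors that result from process variations during VLSI circuit fabrication.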
Therefore, if the switching sequences are all uncorrelated and share the same characteristics in their PSDs (e.g., first-order high-pass shaping), the DAC noise also possesses these characteristics. The problem of shaping the PSD of the DAC noise reduces to the problem of creating switching sequences with the desired spectral shaping. Unfortunately, this problem is complicated by the constraints on the switching sequence described next.
C. Constraints on the Switching Sequence
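The switching sequence is generated within the switching block to obtain the desired spectral properties of the DAC noise. However, the switching sequences must be constrained to satisfy restrictions inherent to the digital encoder. As previously described, each of the digital encoder's outputs is limited to the set $\{0, 1\}$, and their sum must equal the DAC input. It is shown in [40] that these conditions are met if each switching block satisfies the following two-part Number Conservation Rule: the two outputs of each switching block must be in the range $\{0, 1, \ldots, 2^{k-1}\}$, where $k$ is the layer number, and their sum must equal the input to the switching block:
$$x_{k-1,2r-1}[n] + x_{k-1,2r}[n] = x_{k,r}[n], \qquad (4)$$
where $x_{k,r}[n]$ is the input of the $r$th switching block in layer $k$, and $x_{k-1,2r-1}[n]$ and $x_{k-1,2r}[n]$ are its top and bottom outputs. From (1) and (4), with $s_{k,r}[n]$ denoting the switching sequence (top output minus bottom output), the input/output relationships of the switching block are
$$x_{k-1,2r-1}[n] = \tfrac{1}{2}\left(x_{k,r}[n] + s_{k,r}[n]\right) \quad \text{and} \quad x_{k-1,2r}[n] = \tfrac{1}{2}\left(x_{k,r}[n] - s_{k,r}[n]\right). \qquad (5)$$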
The above expressions are implemented by the block diagram shown in Fig. 5. It can be shown that the number conservation rule is satisfied by each switching block if its switching sequence $s_{k,r}[n]$ and input $x_{k,r}[n]$ satisfy
$$s_{k,r}[n] = \begin{cases} 0, & \text{if } x_{k,r}[n] \text{ is even} \\ \pm 1, & \text{if } x_{k,r}[n] \text{ is odd.} \end{cases} \qquad (6)$$
This is more restrictive than necessary; however, it significantly simplifies the switching block's hardware. In Fig. 5, this restriction is reflected by the switching sequence generator's dependence on the input sequence $x_{k,r}[n]$.
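The arithmetic of (4)-(6) can be illustrated with a short behavioral sketch. It is not the hardware described in this paper: the function names are hypothetical, and the sign of the switching value is chosen at random rather than by sequencing logic, so the sketch demonstrates only number conservation, not mismatch shaping.

```python
import random

def switching_block(x, rng):
    """Split integer x into (top, bottom) with top + bottom == x.

    The switching value s is 0 when x is even and +1 or -1 when x is odd,
    so (x + s) / 2 and (x - s) / 2 are always integers.
    """
    s = 0 if x % 2 == 0 else rng.choice((1, -1))  # random sign: an assumption
    return (x + s) // 2, (x - s) // 2

def tree_encode(x, b, rng):
    """Map x in 0..2**b to 2**b one-bit outputs through b layers of blocks."""
    layer = [x]
    for _ in range(b):
        next_layer = []
        for value in layer:
            top, bottom = switching_block(value, rng)
            next_layer.extend((top, bottom))
        layer = next_layer
    return layer

rng = random.Random(0)
bits = tree_encode(19, 5, rng)
print(sum(bits), len(bits))  # prints: 19 32
```

Whatever signs are chosen, the 32 outputs are each zero or one and sum to the encoder input, which is the property the Number Conservation Rule guarantees.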
D. Implementation of the Switching Block
The switching sequence is a ternary sequence, and so it can be represented as two single-bit sequences. It follows from (6) that the magnitude of the switching sequence is entirely determined by the input to the switching block, so the switching block can only control the sign of the switching sequence. To separate the magnitude and sign of the switching sequence, let and represent as if if if (7) where if is odd if is even (8) The sequence represents the sign of . It is chosen by the switching block to ensure the switching sequence is appropriately shaped as described in Section III. The sequence is referred to as the parity sequence and represents the magnitude of . Fig. 6 displays a convenient functional partitioning of the switching block. The parity logic determines the parity of the switching block's input and generates the parity sequence . The sequencing logic produces the sign sequence and is responsible for the spectral shaping of the switching sequence. The combination of the sequencing logic and parity logic constitute the switching sequence generator shown in Fig. 5 . Given and the binary representation of (i.e., and ), the role of the splitting network is to perform the arithmetic operations shown in Fig. 5 that generate the switching block's two output sequences.
III. LOW-PASS SEQUENCING LOGIC
A. High-Pass Switching Sequences
In low-pass mismatch-shaping DACs, the signal band is near dc, so the mismatch-shaping logic is designed such that most of the DAC noise power resides at higher frequencies. In other words, in a low-pass mismatch-shaping DAC, the PSD of the DAC noise resembles the magnitude response of a discrete-time high-pass filter. Sequences of this type are called high-pass sequences. Thus, the sequencing logic blocks in a low-pass tree-structured DAC create high-pass switching sequences.
To meaningfully characterize the spectral properties of the high-pass switching sequences, a quantitative definition of an $L$th-order high-pass switching sequence is required. In a delta-sigma modulator with a quantization-noise transfer function that contains zeros only at dc, the order of the modulator corresponds to the number of dc zeros. Let quantization noise denote the component of the modulator output arising from the errors induced by quantization. In an $L$th-order low-pass delta-sigma modulator, the quantization noise is commonly called $L$th-order high-pass noise. A key property of this high-pass noise is that it can be processed by $L$ cascaded accumulators such that the values in the accumulators remain bounded. The dc poles from the accumulators "cancel" the dc zeros in the noise transfer function. However, if one more accumulator were cascaded, its output would become unbounded regardless of the accumulators' initial values.
In contrast to the quantization noise, the switching sequence, as a result of its constraints in (6) , cannot be generated by filtering a causal, bounded sequence by a system with dc zeros. Therefore, the concept of the switching sequence's order is vague without a more applicable definition. By defining the high-pass order of a switching sequence using the accumulator property described above, a transfer function is associated with this sequence, and the desired properties of its PSD are implied.
Definition: Let $t^{(L)}[n]$ be the "$L$th-sum sequence" of a switching sequence $s[n]$:
$$t^{(L)}[n] = \underbrace{\sum_{n_L=0}^{n} \sum_{n_{L-1}=0}^{n_L} \cdots \sum_{n_1=0}^{n_2}}_{L \text{ summations}} s[n_1]. \qquad (9)$$
The sequence $s[n]$ is an $L$th-order high-pass switching sequence if its $L$th-sum sequence is a bounded sequence (i.e., there exists a number $B$ such that $|t^{(L)}[n]| < B$ for all $n$) and its $(L+1)$st-sum sequence is an unbounded sequence. If $s[n]$ is an $L$th-order high-pass switching sequence, then it can be shown that the slope of its PSD is $20L$ dB/decade near dc provided the PSD of $t^{(L)}[n]$ is continuous and nonzero in a neighborhood of dc. This definition provides a means to create switching sequences that are $L$th-order high-pass shaped and conform to (6).
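The accumulator-based definition can be checked numerically on a finite record. The sketch below assumes that every running sum starts at time zero; `sum_sequence` is a hypothetical helper name, and a finite record can only suggest, not prove, boundedness.

```python
from itertools import accumulate

def sum_sequence(s, order):
    """Apply `order` cascaded running sums (accumulators) to the sequence s."""
    out = list(s)
    for _ in range(order):
        out = list(accumulate(out))
    return out

# The alternating sequence +1, -1, +1, -1, ...
alternating = [(-1) ** n for n in range(1000)]
first_sum = sum_sequence(alternating, 1)   # toggles between 1 and 0
second_sum = sum_sequence(alternating, 2)  # grows without bound as n grows
```

For the alternating sequence, the first sum stays within magnitude one while the second sum keeps growing, which is the behavior the definition associates with a first-order high-pass sequence.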
B. First-Order Low-Pass Sequencing Logic
To produce a switching sequence that is a first-order high-pass switching sequence, the switching block ensures that the partial sum of its switching sequence $s_{k,r}[n]$, namely $\sum_{m=0}^{n} s_{k,r}[m]$, is a bounded sequence. Suppose the input to the switching block is always odd and thus, from (6), $s_{k,r}[n] = \pm 1$ for all $n$. One method for ensuring that the partial sum is a bounded sequence is to choose $s_{k,r}[n]$ to be the alternating sequence $+1, -1, +1, -1, \ldots$. With this switching sequence, the resulting partial sum sequence is bounded in magnitude by 1,
$$\left| \sum_{m=0}^{n} s_{k,r}[m] \right| \le 1, \qquad (10)$$
and the resulting switching sequence is a single tone at half the sample rate. For many applications, it is desirable to have DAC noise and, thus, switching sequences that do not contain any tones.
One way to eliminate tones in this scenario and yet obtain a first-order high-pass switching sequence is to construct $s_{k,r}[n]$ by randomly choosing between the following two types of symbols: "$+1, -1$" and "$-1, +1$". When $n$ is even (i.e., $n = 0, 2, 4, \ldots$), one of the two symbols is chosen randomly by a fair coin toss, and the chosen symbol is placed in the switching sequence. With this construction, the switching sequence can be written as $s_{k,r}[2m] = \pm 1$ and $s_{k,r}[2m+1] = -s_{k,r}[2m]$. The alternating property, i.e., $s_{k,r}[2m+1] = -s_{k,r}[2m]$, ensures that the partial sum sequence satisfies (10), while the random symbol type selection prevents $s_{k,r}[n]$ from containing any periodicities. Therefore, the resulting switching sequence is a first-order high-pass switching sequence that does not contain tones.
This method of using symbols to construct the switching sequence can be generalized to include even inputs to the switching block. When the switching block's input is even, it follows from (6) that the switching block has no choice but to force the switching sequence to be zero. To include potential zero runs in the switching sequence, the two symbols described above are generalized to be
$$\text{"}+1, 0, \ldots, 0, -1, 0, \ldots, 0\text{"} \quad \text{and} \quad \text{"}-1, 0, \ldots, 0, +1, 0, \ldots, 0\text{"}. \qquad (11)$$
Each symbol begins in the switching sequence with a nonzero value that corresponds to an odd switching block input. The only other nonzero element within a symbol has the alternate sign of the first element. For a switching sequence composed of these symbols, this alternating property ensures that its partial sum satisfies (10), which implies that the resulting switching sequence is a first-order high-pass switching sequence. Additionally, by randomly choosing between the two symbol types, the resulting switching sequence cannot contain tones.
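The symbol-based construction can be sketched behaviorally as follows. This is an illustrative model under stated assumptions, not the flip-flop circuit of Fig. 7: the function name is hypothetical, the parity sequence is passed in as a list of zeros and ones (one meaning an odd input), and Python's pseudorandom generator stands in for the dither sequence.

```python
import random

def first_order_switching_sequence(parity, rng):
    """Generate s[n] in {-1, 0, +1} from a parity sequence (1 = odd input).

    Nonzero values occur only where parity is 1. They come in pairs of
    opposite sign (head, then tail); the head's sign is picked at random.
    """
    s = []
    in_head = False   # True after a head value, until its tail value arrives
    head_sign = 1
    for o in parity:
        if o == 0:
            s.append(0)                       # even input forces s[n] = 0
        elif not in_head:
            head_sign = rng.choice((1, -1))   # dither selects the symbol type
            s.append(head_sign)
            in_head = True
        else:
            s.append(-head_sign)              # tail cancels the head
            in_head = False
    return s
```

Because every head is cancelled by the next nonzero value, the running sum of the output never leaves the set of values minus one, zero, and one, regardless of how long the zero runs are.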
As an example, consider the following segment of the input sequence to the switching block where the segment starts with the value "1" and ends with the value "2". The parity sequence for this input is
The parity sequence dictates the magnitude of the switching sequence ; therefore, the zeros in the parity sequence correspond to zeros in the switching sequence. Given this parity sequence, the symbols " " and " " are used to construct the switching sequence The choice of the symbol " " over " " and "
" over " " is arbitrary as any combination of these symbols ensure that . In this example, the resulting partial sum sequence is Additionally, a pseudorandom sequence is used to select between the two symbol types and is generated by logic that is not shown in the figure.
Each symbol type from (11) must be further decomposed into two "halves" to describe how the sequencing logic in Fig. 7 generates the desired switching sequence. The first half of the symbol-i.e., the first " "-is called the head of the symbol, and the second half is called the tail. The four states of the D flip-flops correspond to the two symbol types in and the two segments, head and tail, of the symbol. The bit in the leftmost flip-flop represents the value of . Since when is an element of a symbol's head, and when is an element of a symbol's tail, the bit in the leftmost flip-flop tracks whether is an element of the head or the tail of a symbol. The rightmost flip-flop contains the sequence that dictates the symbol types. The symbol types are chosen randomly according to the pseudorandom sequence so that there are no tones in the switching sequence. This pseudorandom sequence is called the dither sequence, and a switching block that uses a dither sequence to select its symbol types is called a dithered switching block. Ideally, the dither sequence is a sequence of bits that are uniformly distributed and independent. In this implementation, each switching block in a given layer shares the same dither sequence.
Undithered switching blocks may also be utilized to reduce hardware complexity and potentially decrease signal-band DAC noise power. In an undithered switching block, the same symbol type is used throughout the switching sequence, and the sequencing logic can be reduced to a single D flip-flop with enable. The resulting switching sequence can contain tones that lower the noise floor of its PSD relative to the dithered case. This reduced noise floor can give rise to less signal-band DAC noise power. However, the resulting spurious tones in the DAC noise can be prohibitive for a given application. To optimize this tradeoff, some combination of dithered and undithered switching blocks may be employed. Fig. 8 displays the PSDs of the DAC noise and quantization noise from behavioral simulations of the 5-bit modulator that was introduced in Section II. The units of the PSDs are dB relative to , where is the step size of the ADC. The capacitor errors in the DAC banks were modeled as independent Gaussian random variables with standard deviations of 1% of their nominal value. This is not equivalent to "1% matching" which implies that adjacent capacitors in a given IC are matched within 1%. The input to the modulator was a 1 dB (relative to full-scale), 1 kHz sinusoid. To illustrate the effects of dither, a dither sequence was applied to selected switching blocks in the simulated modulator. The noise PSDs in Fig. 8 illustrate how the dither sequences either eliminate or reduce spurious tones in the DAC noise depending on which switching blocks are dithered.
The total hardware required for the sequencing logic in a $(2^b + 1)$-level digital encoder depends on how many switching blocks are dithered. When all switching blocks are dithered, $2(2^b - 1)$ D flip-flops with enables, $2^b - 1$ 2:1 multiplexers, and $b$ pseudorandom sequences are required. When none of the switching blocks are dithered, $2^b - 1$ D flip-flops with enables are required. For the implementation of the modulator in [43], the 2:1 multiplexer in the sequencing logic is realized by three NAND gates, and the pseudorandom sequences are constructed using a pseudorandom number generator with 28 D flip-flops and 7 XOR gates. The total hardware required for the sequencing logic (not including the pseudorandom number generator) in the digital encoder presented in [43] is 62 D flip-flops and 93 NAND gates.
C. Second-Order Low-Pass Sequencing Logic
The first-order low-pass sequencing logic generates a first-order high-pass switching sequence regardless of the values in the switching block's input sequence. However, the restrictions on given by (6) prevent an analogous claim for the second-order low-pass sequencing logic. For to be a second-order high-pass switching sequence, the switching block attempts to bound the magnitude of its double sum sequence by a constant for all
Because the parity of dictates when is zero, the sequence can be made arbitrarily large by applying the appropriate . For example, suppose , and for all . Given , then and for all . However, if is odd with some regularity (as is the case when the DAC is used in a modulator), a switching sequence can be constructed whose double sum is a bounded sequence, thereby giving rise to second-order high-pass shaped DAC noise.
One method for creating such a switching sequence is to again use symbols of the form in (11), but with the symbol type chosen to minimize the magnitude of the double sum sequence, . In this case, the magnitude of is bounded by one, so the switching sequence is at least a first-order high-pass switching sequence. At any time within a symbol's head, , and it follows that
Thus, increments or decrements by one at each sample time within a symbol's head. However, at any time within a symbol's tail, and . It follows that the symbol's type and the length of its head determine the values in : if a symbol starts at time and its head's length is samples, it can be shown using induction that (13) where the sign of is determined by the symbol type. To minimize , the sign of in the above expression should be the opposite of the sign of . To construct such a switching sequence, each switching block ideally calculates with which it selects between the two symbol types. However, when implemented with finite register sizes, the switching block can only estimate . This estimate, which is denoted , has a maximum and minimum which are determined by the number of states in a finite-state machine. Therefore, the estimate (15) where is called the saturation error. The behavior of the saturation error determines whether the switching sequence is a second-order high-pass switching sequence. Since is constrained to the set , it follows that is also constrained to this set. Let . For to be nonzero, there must be a run of at least zeros in the parity sequence . Thus, must be even for consecutive samples to cause any saturation error. If the switching block's input is odd at least once within every -length segment, the saturation error is always zero. From (15) , it follows that
The sequence is the partial sum of ; thus, it follows that (16) Because is a bounded sequence, is a bounded sequence if and only if the partial sum of is a bounded sequence. Therefore, is a second-order high-pass switching sequence if and only if the partial sum of is a bounded sequence.
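The selection rule described above (oppose the sign of the double sum at the start of each symbol, and fall back on dither when it is zero) can be sketched behaviorally. The names below are hypothetical, the optional `limit` argument is an assumed stand-in for the saturating finite-state estimate, and the code is not the logic of Fig. 9.

```python
import random

def second_order_switching_sequence(parity, rng, limit=None):
    """Generate s[n] from a parity sequence (1 = odd input).

    t1 is the running sum of s (always -1, 0, or +1); est is the running
    sum of t1, optionally clamped to [-limit, +limit] to mimic saturation.
    """
    s = []
    t1 = 0
    est = 0
    for o in parity:
        if o == 0:
            value = 0                         # even input forces s[n] = 0
        elif t1 != 0:
            value = -t1                       # tail: cancel the head
        elif est != 0:
            value = -1 if est > 0 else 1      # head: oppose the double sum
        else:
            value = rng.choice((1, -1))       # tie: let dither decide
        s.append(value)
        t1 += value
        est += t1
        if limit is not None:
            est = max(-limit, min(limit, est))
    return s
```

With an input that is odd at every sample, each symbol moves the double sum by exactly one step toward or through zero, so the double sum stays within magnitude one in this model. Long runs of even inputs let the double sum drift, which is where the saturation error enters.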
Fig. 9. The second-order, low-pass sequencing logic with dither.
The second-order low-pass sequencing logic is shown in Fig. 9. The 3-state accumulator produces the partial sum of the switching sequence, and the second accumulator produces the estimate of the double sum. Therefore, the sign of the value in the second accumulator is used to choose the symbol types. However, when the second accumulator's value, and hence the estimate, is zero, the dither sequence is used to choose the symbol type randomly as a means of reducing the spurious tones in the switching sequence. The 3-state accumulator tracks the intrasymbol information for the switching sequence:
when is an element of the symbol's head, and when is an element of the symbol's tail. When is in the head of a symbol, the sign of the 3-state accumulator's value is the sign of the tail's first element. The following is a more detailed description of each element in Fig. 9 (20) where is a pseudorandom sequence that approximates a sequence of bits that are uniformly distributed and independent. Fig. 10 displays DAC noise PSDs from behavioral simulations of the 5-bit modulator presented in Section II with the second-order low-pass sequencing logic. Except for the sequencing logic, all other characteristics of these simulations were the same as those for the first-order low-pass case described previously. Various -state accumulators were implemented with counters of different sizes. For smaller values of , the saturation error contributes more power to the DAC noise. In the limit when "no counter" is used (i.e., when and for all ), the sequencing logic reduces to the first-order low-pass sequencing logic. When the -state accumulator is implemented with a 4-bit counter, the power of the signal-band DAC noise decreases relative to the "no counter" noise, but the saturation error prevents the DAC noise from being second-order high-pass shaped. However, with the -state accumulator realized by an 8-bit counter, the DAC noise in Fig. 10 has the spectral shape of a second-order high-pass sequence.
The additional hardware required to implement the second-order sequencing logic relative to the first-order sequencing logic includes the decision logic, which can be implemented by two 2:1 multiplexers and an inverter, and the -state accumulator. If and the -state accumulator is implemented with a -bit up/down counter, then is the MSB of the counter and " " can be realized by an OR gate with a fan-in of bits. The second-order sequencing logic for the implementation of this switching block in [44] uses a 4-bit counter and requires 25 total gates and flip-flops.
IV. SPLITTING NETWORK AND PARITY LOGIC
In an ADC modulator as in Fig. 1 , the delay of the feedback DAC must be small enough so that its output is available well before the next modulator input is clocked in. Therefore, the delay introduced by the switching blocks can limit the maximum sample rate of the ADC modulator. The sequencing logic blocks presented in Section III do not contribute to the switching block's propagation delay because their outputs can be set before their next input is available. However, the splitting network and parity logic do cause propagation delay.
If the input to the switching block were a binary encoded number, the splitting network could be implemented with binary adders as shown in Fig. 5, and the parity logic would require no hardware as the input's parity bit would be its LSB. However, the propagation delay introduced by the adders could be significant. In this section, splitting networks are presented that avoid using conventional adders by employing alternative coding schemes for the switching blocks' input and output sequences. Without conventional adders, these splitting networks tend to introduce less propagation delay. The two splitting networks in this section constitute the medium-speed and high-speed switching blocks that offer a tradeoff between complexity and propagation delay. Additionally, efficient implementations of the parity logic blocks are presented for both switching block types.
A. Medium-Speed Switching Block
Fig. 11 displays the medium-speed switching block. The parity logic consists of an XOR gate and the splitting network consists of two 2:1 multiplexers. In this section, the sequence "$x_{k,r}[n]$" represents both the input of switching block $S_{k,r}$ and its numerical value; the appropriate representation can be determined by its context. The switching block employs "extra-LSB encoding" of its input and output sequences. Motivated by [39] and detailed in [42], the extra-LSB code of $x_{k,r}[n]$ consists of $k + 1$ bits that are denoted $x^{(0)}_{k,r}[n], x^{(1)}_{k,r}[n], \ldots, x^{(k)}_{k,r}[n]$, each of which takes on a value of one or zero. The numerical value of $x_{k,r}[n]$ is interpreted as
$$x_{k,r}[n] = x^{(0)}_{k,r}[n] + \sum_{i=1}^{k} 2^{i-1} x^{(i)}_{k,r}[n]. \qquad (21)$$
Thus, the extra-LSB code contains two LSBs, $x^{(0)}_{k,r}[n]$ and $x^{(1)}_{k,r}[n]$, both with unity weighting. A conventional unsigned binary encoded number can be converted to an extra-LSB encoded number by appending the 0th bit and setting it low.
With this coding technique, the arithmetic performed by the splitting network only modifies the two LSBs of $x_{k,r}[n]$. As described in Section II, the switching sequence $s_{k,r}[n]$ is nonzero only when $x_{k,r}[n]$ is odd. It follows from (21) that whenever $x_{k,r}[n]$ is odd, one of its LSBs is one and the other is zero. Thus, the splitting network adds $\pm 1$ to $x_{k,r}[n]$ when only one of its LSBs is high, which implies that the carry bit can never propagate beyond the two LSBs. When $x_{k,r}[n]$ is odd, the splitting network adds one to $x_{k,r}[n]$ by setting both of its LSBs high, or subtracts one from $x_{k,r}[n]$ by setting both of its LSBs low. Since the sequences $x_{k,r}[n] + s_{k,r}[n]$ and $x_{k,r}[n] - s_{k,r}[n]$ are always even valued, both LSBs are equal in each of these sequences. The splitting network performs the divide-by-two operation by right shifting the MSBs of $x_{k,r}[n]$ and using one of the LSBs of $x_{k,r}[n] \pm s_{k,r}[n]$ as the second LSB of each output. The two LSBs of $x_{k,r}[n]$ determine its parity. The value of $x_{k,r}[n]$ is odd only when one of its LSBs is one and the other is zero; otherwise, it is even. Therefore, the parity logic implements
$$o_{k,r}[n] = x^{(0)}_{k,r}[n] \oplus x^{(1)}_{k,r}[n], \qquad (22)$$
where $o_{k,r}[n]$ is the parity sequence (one when $x_{k,r}[n]$ is odd and zero when it is even) and $\oplus$ represents the XOR operation.
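The extra-LSB arithmetic described above can be modeled in a few lines. The sketch assumes a code stored as a list whose first two entries are the two unity-weight LSBs and whose remaining entries have weights 2, 4, 8, and so on; the function names and the polarity of `sign_bit` (one meaning "add one to the top output") are illustrative assumptions, not taken from Fig. 11.

```python
def extra_lsb_value(bits):
    """Value of an extra-LSB code: bits[0] + sum of 2**(i-1) * bits[i], i >= 1."""
    return bits[0] + sum(bit << (i - 1) for i, bit in enumerate(bits) if i >= 1)

def split_extra_lsb(bits, sign_bit):
    """Return (top, bottom, parity) for one medium-speed switching block.

    parity is the XOR of the two LSBs. When the input is odd, the top output
    takes sign_bit as its LSB and the bottom output takes the complement,
    which models setting both input LSBs high or low before halving. When
    the input is even, both LSBs are already equal and are passed through.
    The remaining bits are shifted down by one position (divide by two).
    """
    parity = bits[0] ^ bits[1]
    top_lsb = sign_bit if parity else bits[0]         # first 2:1 multiplexer
    bottom_lsb = (1 - sign_bit) if parity else bits[0]  # second 2:1 multiplexer
    msbs = bits[2:]
    return [top_lsb] + msbs, [bottom_lsb] + msbs, parity
```

For example, the code `[1, 0, 1]` has value 3; with `sign_bit = 1` the outputs are `[1, 1]` (value 2) and `[0, 1]` (value 1), and no carry propagates past the LSBs.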
The hardware in each switching block is independent of its location in the digital encoder; therefore, the $(2^b + 1)$-level digital encoder requires $2(2^b - 1)$ 2:1 multiplexers for its splitting networks and $2^b - 1$ XOR gates for its parity logic. The efficiency of this implementation increases as the number of bits is increased because the complexity of each switching block does not depend on the bit width (i.e., number of bits) of its input. The medium-speed switching block is used in the 33-level digital encoder presented in [43], wherein the two multiplexers of each splitting network are realized by five NAND gates. The splitting networks and parity logic blocks for this implementation require a total of 186 logic gates. For the delta-sigma ADC shown in Fig. 1, additional hardware is required to convert the thermometer coded output of the flash ADC to an extra-LSB code.
The delay performance for the digital encoder is determined by the digital encoder's critical path, which is defined as the longest path that an input bit must traverse in a given clock period to set an output bit. Within the medium-speed switching block, the longest path from its input to its outputs consists of an XOR gate and a 2:1 multiplexer. Therefore, the critical path of the $(2^b + 1)$-level digital encoder consists of $b$ XOR gates and $b$ 2:1 multiplexers. HSPICE 0.5-µm CMOS simulations of the 33-level digital encoder presented in [43] showed that this digital encoder has a delay of approximately 5.7 ns. This does not include the propagation delay of the thermometer-to-binary conversion performed in the delta-sigma ADC's digital common-mode rejection flash ADC.
B. The High-Speed Switching Block
Fig. 12 displays an example high-speed switching block whose splitting network consists entirely of switches implemented by CMOS transmission gates. In this architecture, the parity logic does not physically reside within the switching block. The parity sequences are generated by an XOR tree as shown in Fig. 13. The high-speed switching block employs thermometer encoding of its input and output sequences. The input sequence of a switching block in layer $k$ is thermometer encoded if it has $2^k$ bits that are assigned as follows:
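Thus, with thermometer encoding,
$$x_{k,r}[n] = \sum_{i=1}^{2^k} x_{k,r}^{(i)}[n]. \tag{24}$$
With thermometer encoding, the splitting network performs the desired arithmetic by routing the odd-indexed bits of $x_{k,r}[n]$ to one output and the even-indexed bits of $x_{k,r}[n]$ to the other output, or vice versa, depending upon $s_{k,r}[n]$. It can be shown that the numerical values of the sequences that comprise the even-indexed bits and odd-indexed bits of $x_{k,r}[n]$ are
$$\left\lfloor \frac{x_{k,r}[n]}{2} \right\rfloor \tag{25}$$
and
$$\left\lceil \frac{x_{k,r}[n]}{2} \right\rceil \tag{26}$$
respectively. Because $s_{k,r}[n]$ is limited to the set $\{0, \pm 1\}$, it follows that
$$\frac{1}{2}\left(x_{k,r}[n] + s_{k,r}[n]\right) = \begin{cases} \left\lceil x_{k,r}[n]/2 \right\rceil, & \text{if } s_{k,r}[n] = 0 \text{ or } 1 \\ \left\lfloor x_{k,r}[n]/2 \right\rfloor, & \text{if } s_{k,r}[n] = -1 \end{cases} \tag{27}$$
and
$$\frac{1}{2}\left(x_{k,r}[n] - s_{k,r}[n]\right) = \begin{cases} \left\lfloor x_{k,r}[n]/2 \right\rfloor, & \text{if } s_{k,r}[n] = 0 \text{ or } 1 \\ \left\lceil x_{k,r}[n]/2 \right\rceil, & \text{if } s_{k,r}[n] = -1. \end{cases} \tag{28}$$
Therefore, by routing the input's even-indexed and odd-indexed bits to separate outputs based on $s_{k,r}[n]$, the splitting network realizes the arithmetic in (27) and (28). Moreover, by preserving the order of these bits, the splitting network ensures that its outputs are thermometer encoded.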
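The routing performed by the high-speed splitting network can be illustrated with a small behavioral model. This is a minimal sketch rather than the circuit of Fig. 12: bits are indexed from 1, the hypothetical flag `swap` stands in for the routing decision derived from the switching sequence, and the function names are assumptions made for illustration.

```python
def to_thermometer(value, width):
    """Return a thermometer code: `value` ones followed by zeros."""
    if not 0 <= value <= width:
        raise ValueError("value must lie in [0, width]")
    return [1] * value + [0] * (width - value)


def split_thermometer(bits, swap):
    """Route odd- and even-indexed bits (1-based) to the two outputs.

    With swap=False the odd-indexed bits go to the first output;
    with swap=True the two outputs are exchanged.
    """
    odd_bits = bits[0::2]   # 1st, 3rd, 5th, ... bits
    even_bits = bits[1::2]  # 2nd, 4th, 6th, ... bits
    return (even_bits, odd_bits) if swap else (odd_bits, even_bits)


# An input of 5 on an 8-bit thermometer bus splits into 3 and 2.
print(split_thermometer(to_thermometer(5, 8), swap=False))
# ([1, 1, 1, 0], [1, 1, 0, 0])
```

Because the bit order is preserved, both outputs remain thermometer codes, and their values always sum to the input value and differ by at most one.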
Since the splitting network does not rely on the parity sequence to route the bits of its input, the current sample of the parity sequence can be determined after the outputs of the digital encoder are set. The parity logic block in this section exploits this flexibility to minimize its hardware. The number of gates required to directly determine the parity of a thermometer-encoded number is proportional to its bit width. However, using the XOR tree shown in Fig. 13, each parity logic block accounts for only one XOR gate.
The XOR tree is a consequence of the functional relationship between the outputs of a switching block and its input. From (4), the values of the output sequences of a switching block must add to the value of the input. Thus, the parity of $x_{k,r}[n]$ can be determined from the parities of $x_{k-1,2r-1}[n]$ and $x_{k-1,2r}[n]$:
$$o_{k,r}[n] = o_{k-1,2r-1}[n] \oplus o_{k-1,2r}[n]. \tag{29}$$
The outputs of each switching block in layer one are 1-bit sequences. This implies that $o_{0,r}[n] = x_{0,r}[n]$. By recursively implementing (29), the XOR tree generates the parity bits for each switching block.
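The recursion behind the XOR tree can be sketched in a few lines. The model below is an illustration under stated assumptions, not the circuit of Fig. 13: it takes the 1-bit encoder outputs as a list whose length is a power of two, and the function name `xor_tree` is hypothetical. Each XOR in the list comprehension corresponds to the single XOR gate attributed to one switching block.

```python
def xor_tree(output_bits):
    """Compute the parity bit of every switching block from the 1-bit outputs.

    Returns a list of layers: layers[0] holds the parities of the layer-one
    blocks, and layers[-1] holds the parity of the top-level block.
    """
    n = len(output_bits)
    if n < 2 or n & (n - 1):
        raise ValueError("number of output bits must be a power of two >= 2")
    layers = []
    current = list(output_bits)
    while len(current) > 1:
        # Parity of a block's input = XOR of the parities of its two outputs.
        current = [current[i] ^ current[i + 1] for i in range(0, len(current), 2)]
        layers.append(current)
    return layers


print(xor_tree([1, 0, 1, 1, 0, 0, 1, 0]))
# [[1, 0, 0, 1], [1, 1], [0]]
```

For 32 output bits, this sketch evaluates 16 + 8 + 4 + 2 + 1 = 31 XOR operations, one per switching block.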
The hardware counts in the medium-speed and high-speed switching blocks differ only in their splitting networks. With the high-speed switching block, the number of transmission gates in the splitting network depends on the bit width of the switching block's input. However, the number of transmission gates per layer is independent of the layer number: each bit of a switching block's input is processed by two transmission gates (one on and one off), and the total number of input bits is constant for each layer. Thus, with the high-speed switching block, the $(2^b+1)$-level digital encoder requires $b\,2^{b+1}$ transmission gates for its splitting networks and $2^b-1$ XOR gates for its parity logic. A 33-level implementation of this digital encoder for the ΔΣ ADC shown in Fig. 1 requires 320 transmission gates for its splitting networks and 31 XOR gates for its parity logic. If the input to the digital encoder were a binary-encoded number, as in the case of a ΔΣ DAC, a binary-to-thermometer encoder would also be required to implement this digital encoder.
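The counts quoted for the 33-level encoder follow directly from the per-layer argument above. The sketch below is illustrative only: it assumes $b$ layers, $2^b$ thermometer-coded input bits per layer, two transmission gates per input bit, and one XOR gate per switching block. The function name is hypothetical.

```python
def high_speed_hardware(bits):
    """Return (transmission_gates, xor_gates) for a high-speed encoder."""
    input_bits_per_layer = 2 ** bits      # constant across all layers
    gates_per_layer = 2 * input_bits_per_layer  # one on, one off per bit
    transmission_gates = bits * gates_per_layer
    xor_gates = 2 ** bits - 1             # one XOR per switching block
    return transmission_gates, xor_gates


print(high_speed_hardware(5))  # (320, 31)
```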
The high-speed switching block tends to have less propagation delay than the medium-speed switching block because the parity logic in the high-speed switching block does not contribute to its delay. As previously mentioned, the sequencing logic does not require the current sample of the parity sequence to produce the switching sequence. Therefore, the switching sequence can be calculated and used to set the transmission gates before the input sample is clocked into the digital encoder. Additionally, the XOR tree processes the output bits of the digital encoder and does not contribute to the digital encoder's critical path. Therefore, the critical path of the $(2^b+1)$-level digital encoder, which is experienced by each of its input bits, consists of $b$ preset transmission gates. HSPICE 0.5-μm CMOS simulations of a 33-level digital encoder with high-speed switching blocks showed that this digital encoder has a delay of approximately 1.1 ns, which is approximately a fivefold improvement over a 33-level digital encoder with medium-speed switching blocks. This delay does not include the propagation delay of a binary-to-thermometer encoder that would be required in a ΔΣ DAC. An implementation that uses the high-speed switching blocks for their minimal delay is presented in [45].
V. HARDWARE COMPARISON FOR VARIOUS MISMATCH-SHAPING DACS
To compare the hardware complexity of the tree-structured mismatch-shaping DAC encoders presented here with that of other implementations, Tables I and II give estimated hardware requirements for mismatch-shaping DAC encoders appropriate for use in ΔΣ data converters. When possible, the DAC encoder hardware is estimated for an implementation in the 5-bit ΔΣ ADC shown in Fig. 1. In both tables, the abbreviations "INV," "MUX," "XOR," and "XNOR" stand for inverter, 2:1 multiplexer, exclusive-or, and exclusive-nor, respectively. The abbreviation "T-gate" denotes a two-transistor CMOS transmission gate, and the abbreviation "T/B encoder" denotes a thermometer-to-binary encoder. A D flip-flop, denoted "DFF," is assumed to have true and complemented outputs available; the D flip-flop with enable, shown in Fig. 7, is implemented using a D flip-flop and a 2:1 multiplexer.
The mismatch-shaping DAC encoders shown in Table I provide no hardware to eliminate or reduce spurious tones, and the hardware differences among them are not as pronounced. However, when extra hardware is utilized to combat harmonic distortion, Table II shows that the Bidirectional DWA (BiDWA) and tree-structured DAC encoders contain the least hardware. The BiDWA DAC encoder requires minimal hardware because it depends entirely on the randomness of its input to reduce tones in its resulting DAC noise. Any dc input to a $(2^b+1)$-level BiDWA DAC, besides the trivial inputs of 0 and $2^b$, causes its DAC noise to be tonal. On the other hand, the dithered tree-structured DAC has been mathematically proven to produce no tones in its DAC noise [46]. In the Butterfly Shuffler architecture, it is assumed that the logic driving the swapper cells is implemented as the sequencing logic for the tree-structured DAC and that only one random bit is used for each column in the swapper cell matrix. For the second-order tree-structured DAC, the hardware difference becomes more pronounced: the 5-bit implementation presented in [44] requires only 988 gates, whereas the 3-bit second-order architecture presented in [15] requires 3500 gates.
VI. CONCLUSION
This paper has presented various implementations of the tree-structured mismatch-shaping DAC. First-order and second-order low-pass sequencing logic have been presented that provide a tradeoff between DAC-noise power and hardware complexity. High-speed and medium-speed implementations of the splitting network and parity logic have been presented that offer a tradeoff between the digital encoder's propagation delay and hardware complexity. By appropriately choosing among the medium-speed or high-speed, first-order dithered or nondithered, and second-order implementations, the tree-structured DAC can be optimized for hardware complexity, propagation delay, signal-band DAC-noise power, and DAC-noise harmonic distortion.
