It is well known that constant-time addition, in which the execution delay is independent of operand lengths, is feasible only if the output is expressed in a redundant representation. There are many ways of introducing redundancy, and the specifics of the redundant format employed can have a major impact on the performance of constant-time addition and digit set conversion. This paper presents a comprehensive analysis of constant-time addition and simultaneous format conversion. We consider fully as well as partially redundant representations, where not all digit positions are redundant. The number of redundant digits and their positions can be arbitrary, yielding many possible redundant representations. Format conversion refers to changing the number and/or position of redundant digits in a representation. It is shown that such a format conversion is feasible during (i.e., simultaneous with) constant-time addition, even if all three operands (the two inputs and the single output) are represented in distinct redundant formats. We exploit "equal-weight grouping" (EWG), wherein bits having the same weight are grouped together, to achieve constant-time addition and possible simultaneous format conversion. The analysis and data show that EWG leads to efficient implementations. We compare VLSI implementations of various constant-time addition cells and demonstrate that the conventional 4:2 compressor is the most efficient way to execute constant-time addition. We show interesting connections to prior results and indicate possible directions for further extensions.
Introduction
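A positional radix-β number system represents an n-digit value V as a string of digits, $(d_{n-1}, d_{n-2}, \ldots, d_0)$, where each digit $d_i$ is drawn from a digit set $D_i$. A number system is redundant if some value has more than one representation. In other words, a given number system is redundant if there exists an n-digit value V which satisfies

$$V = \sum_{i=0}^{n-1} d_i \beta^i = \sum_{i=0}^{n-1} d'_i \beta^i$$

and there is at least one position j where $d_j \neq d'_j$. This implies that for some digit position k, the cardinality of the digit set $D_k$ satisfies $|D_k| > \beta$. We call such a position a redundant digit position.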
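The definition of redundancy can be illustrated with a small enumeration. The following Python sketch is only an illustration (the helper name `representations` is hypothetical, not from the paper): it lists every 3-digit radix-2 string that evaluates to a given value, once over the conventional binary digit set and once over the signed-digit set {-1, 0, 1}.

```python
from itertools import product

def representations(value, n, digit_set, radix=2):
    """List every n-digit string (most significant digit first) that evaluates to value."""
    found = []
    for digits in product(sorted(digit_set), repeat=n):
        total = sum(d * radix ** (n - 1 - pos) for pos, d in enumerate(digits))
        if total == value:
            found.append(digits)
    return found

# Conventional binary digits: cardinality 2 equals the radix, so no redundancy.
binary = representations(3, 3, {0, 1})
# Signed-digit set: cardinality 3 exceeds the radix, so some values have several strings.
signed = representations(3, 3, {-1, 0, 1})
print(binary)
print(signed)
```

The binary digit set yields a single representation of 3, while the signed-digit set yields several, which is exactly the situation the definition describes.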
Addition can be thought of as an instance of the digit set conversion problem [2, 3, 4, 5]. In this context, the sum of the operand digits $x_i + y_i$ at position i must be converted into an output digit $z_i$ such that

$$\beta \cdot c_i + z_i = x_i + y_i + c_{i-1} \qquad (1)$$

where $c_{i-1}$ is the carry-in to position i, $c_i$ is the carry-out, and both are members of a carry set C. An alternate treatment of addition based on digit set operations can be found in [1], which provides a framework for designing adders based on contiguous sets.
Constant-Time Addition
Constant-time addition is possible at a redundant digit position i if the value of $c_i$ can be determined by considering only a fixed number of previous input digits, making it independent of $c_{i-1}$. The number of previous digits required constitutes a right context, or look-back [2, 3, 4], and is henceforth denoted by L.
The operation of constant-time addition at redundant digit positions can be explained conceptually as the two-step process described below [6] .
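Step 1: Based on its fixed right context, every redundant digit position generates an intermediate sum, $\sigma_i$, and an intermediate carry-out, $c_i$, where

$$\beta \cdot c_i + \sigma_i = x_i + y_i \qquad (2)$$

In other words, (2) expresses the sum of the operand digits $\theta_i = x_i + y_i$ as the pair $(c_i, \sigma_i)$.
Step 2: The final sum $z_i$ is formed by $z_i = \sigma_i + c_{i-1}$, where $\sigma_i$ and $c_i$ are chosen in Step 1 so that $z_i$ is always a valid output digit and no new carry is generated.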
If there are non-redundant digit positions in the result Z, carries must ripple through them [7, 8] and they are determined by (1).
As described in [2, 3, 4] , the carry-out of a digit position can depend on the input operands and the output digit set at that position, as well as the operands and output digit sets at all digit positions that fall within fixed-length right and left contexts. If a left context is actually used, the carry into some position can be dependent on the input operands at that position.
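The two-step process can be sketched in Python for the conventional radix-2 signed-digit set {-1, 0, 1}. This is a minimal sketch of the classical scheme with a look-back of one digit and without equal-weight grouping, so it is not the method developed in this paper; the function names and the particular choice of $(c_i, \sigma_i)$ for $\theta_i = \pm 1$ are assumptions made for illustration.

```python
from itertools import product

def sd_add(x, y):
    """Add two signed-digit numbers (digit lists, least significant digit first).

    Step 1 picks (carry, sigma) with 2*carry + sigma == theta using one digit of
    look-back; Step 2 forms z_i = sigma_i + c_{i-1} with no further carry.
    """
    n = len(x)
    carries, sigmas = [], []
    for i in range(n):
        theta = x[i] + y[i]
        # Look-back: if both previous digits are non-negative, the carry-in is >= 0;
        # otherwise the carry-in is <= 0.
        prev_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if theta == 2:
            c, s = 1, 0
        elif theta == 1:
            c, s = (1, -1) if prev_nonneg else (0, 1)
        elif theta == 0:
            c, s = 0, 0
        elif theta == -1:
            c, s = (0, -1) if prev_nonneg else (-1, 1)
        else:  # theta == -2
            c, s = -1, 0
        carries.append(c)
        sigmas.append(s)
    z = [sigmas[i] + (carries[i - 1] if i > 0 else 0) for i in range(n)]
    z.append(carries[-1])  # most significant output digit
    return z

def sd_value(digits):
    return sum(d * 2 ** i for i, d in enumerate(digits))

def check_all(n):
    """Exhaustively verify value preservation and digit range for n-digit operands."""
    for x in product((-1, 0, 1), repeat=n):
        for y in product((-1, 0, 1), repeat=n):
            z = sd_add(list(x), list(y))
            if sd_value(z) != sd_value(x) + sd_value(y):
                return False
            if any(d not in (-1, 0, 1) for d in z):
                return False
    return True
```

Each carry depends only on the digits at positions i and i-1, so the delay of the sketch does not grow with the operand length when all positions are evaluated in parallel.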
Radix-2 Redundant Representations
Since radix-2 representations are the most commonly used, this paper concentrates only on those representations based on underlying radix-2 digit sets. As specific examples we consider redundant digit sets that are variants of the well-known signed-digit (SD) and carry-save (CS) representations. These digit sets are defined as

$$D(\mathrm{SD}) = \{-1, 0, 1\}, \quad D(\mathrm{SD3}^-) = \{-2, -1, 0, 1\}, \quad D(\mathrm{SD3}^+) = \{-1, 0, 1, 2\} \qquad (3)$$

$$D(\mathrm{CS2}) = \{0, 1, 2\}, \quad D(\mathrm{CS3}) = \{0, 1, 2, 3\} \qquad (4)$$

We consider two types of number systems based on each of these digit sets: a fully redundant system and a partially redundant system. A fully redundant number system is one in which all digit positions of a number are redundant, and the characteristics of such systems are well known [9, 10, 11, 12]. Implementations of adders for fully redundant representations have also been widely investigated; a sampling can be found in [8, 13, 14, 15, 16, 17, 18].
In a partially redundant number system only some digit positions are redundant [7, 8] . In this paper redundant digit positions are indicated by rectangles or squares (see Figure 1 ) and digits in these positions can assume any of the values from one of the sets listed in (3)-(4) above. It is possible to use different redundant digit sets (from the above list) at different positions, but for the sake of simplicity, we restrict ourselves to representations where all redundant digit positions have the same digit set. Format conversion therefore refers to changing the number of redundant digits and their positions in a representation, while retaining the same digit set at each redundant position.
Non-redundant digits are represented by circles and require only a single bit to encode the possible values which they can assume, namely {0, 1}. Some possible partially redundant formats are illustrated in Figure 1. As the figure shows, redundant digits can be placed at arbitrary positions.

[Figure 1. Legend: rectangles/squares represent redundant digits; circles represent non-redundant digits (bits). Operands X and Y are added to produce Z. Annotations: "Look-back in the general case: carry-out $c_q$ depends on all digits in this range." and "Carry $c_q$ ripples to the (q+1)-th redundant digit."]

Note that a redundant binary digit needs at least two bits to represent it. In fact, all of the redundant digit sets listed above need exactly two bits to represent their digit values. In essence, we consider redundant representations where some digit positions are allocated two bits and ask the question: given this basic redundancy, which number representations lead to the most efficient implementations and best exploit the available redundancy?
To answer this question, we consider the number representations listed in Table 1. Among the partially redundant systems, we consider those where every k-th digit is redundant (i.e., the digits at positions $k-1$, $2k-1$, ..., are redundant). Consequently, the partially redundant systems which we consider are denoted SD k, SD3± k, CS2 k and CS3 k. Note that these representations with uniform distance between redundant digits are equivalent to high-radix ($2^k$) redundant representations. However, the addition methods presented herein lead to better implementations than those that can be derived simply by assuming high-radix arithmetic. Furthermore, formats with non-uniform distances between redundant digits cannot be considered high-radix ($2^k$) representations. The methods developed in this paper are also applicable to formats where the placement of redundant digits is arbitrary.
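The equivalence between uniformly spaced partially redundant formats and high-radix redundant representations can be checked numerically. The sketch below is an illustration only (the function name `super_digit_values` is hypothetical): it enumerates the values taken by one group of k radix-2 positions in which the most significant position is redundant and the remaining k-1 positions are ordinary bits.

```python
from itertools import product

def super_digit_values(k, redundant_set):
    """Values of one group of k radix-2 positions whose top position is redundant."""
    values = set()
    for low_bits in product((0, 1), repeat=k - 1):
        # Value contributed by the k-1 non-redundant bits (least significant first).
        low = sum(bit << pos for pos, bit in enumerate(low_bits))
        for d in redundant_set:
            values.add(d * 2 ** (k - 1) + low)
    return values

sd_group = super_digit_values(3, {-1, 0, 1})   # one group of an SD k format, k = 3
cs2_group = super_digit_values(3, {0, 1, 2})   # one group of a CS2 k format, k = 3
print(sorted(sd_group))
print(sorted(cs2_group))
```

In both cases the group takes more than $2^k$ distinct contiguous values, so it behaves as a single redundant radix-$2^k$ digit.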
Table 1: Number systems considered.

Number System | Description
SD            | Digits at all positions ∈ D(SD)
SD k          | Every k-th digit ∈ D(SD); all others ∈ {0, 1}
SD3±          | Digits at all positions ∈ D(SD3±)
SD3± k        | Every k-th digit ∈ D(SD3±); all others ∈ {0, 1}
CS2           | Digits at all positions ∈ D(CS2)
CS2 k         | Every k-th digit ∈ D(CS2); all others ∈ {0, 1}
CS3           | Digits at all positions ∈ D(CS3)
CS3 k         | Every k-th digit ∈ D(CS3); all others ∈ {0, 1}

While the theory developed in [1] can be applied to the digit sets D(SD), D(SD3−), D(SD3+) and D(CS3), it does not cover addition methods based on the CS2 number representation that are described in Section 7. Also, the approach taken in [1] does not cover addition at the non-redundant digit positions of SD k and CS2 k.
Redundant Binary Encodings
All of the number systems in Table 1 need two bits to represent each redundant digit. However, specific encodings should be chosen which lend themselves to efficient implementations. Consider the encoding of an operand X as $(x_{n-1} x_{n-2} \cdots x_0)$, where $x_i$ is the radix-2 (possibly redundant) digit in the i-th position. For clarity, a hat notation will be used to distinguish a redundant digit from a non-redundant digit ($\hat{x}_i$ indicates that the i-th digit of X is redundant and is encoded using two bits; $x_j$ indicates that the j-th digit is non-redundant and is encoded using a single bit). The bits representing a redundant digit $\hat{x}_i$ can be thought of as higher and lower significant bits $(x_i^h, x_i^l)$, respectively. Note that arbitrary bit combinations can be used to represent a redundant digit $\hat{x}_i$, but we concentrate on weighted encodings that satisfy the relationship

$$\hat{x}_i = \pm 2 \cdot x_i^h \pm x_i^l \qquad (5)$$

Weighted encodings for all digit sets of cardinality 4 (i.e., SD3± and CS3) must be of the form shown in (5). Here, the bit $x_i^h$ can be interpreted as a transfer-digit [6]. It will be shown that such encodings lead to efficient implementations.
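As an illustration of weighted two-bit encodings, the following Python sketch maps each bit pair (higher bit, lower bit) to a digit value. The sign assignments and the excluded bit patterns are assumptions chosen to be consistent with the digit sets used in this paper (negative weight on the higher bit for SD and SD3−, positive weight on the higher bit and negative weight on the lower bit for SD3+, and both weights positive for the carry-save sets); the table `ENCODINGS` and the helper `digit_set` are hypothetical names.

```python
# (weight of higher bit, weight of lower bit, excluded (h, l) patterns) -- assumed values
ENCODINGS = {
    "SD":   (-2, +1, {(1, 0)}),   # pattern (1, 0) would encode -2, outside {-1, 0, 1}
    "SD3-": (-2, +1, set()),
    "SD3+": (+2, -1, set()),
    "CS2":  (+2, +1, {(1, 1)}),   # pattern (1, 1) would encode 3, outside {0, 1, 2}
    "CS3":  (+2, +1, set()),
}

def digit_set(name):
    """Return the set of digit values reachable under the named weighted encoding."""
    w_high, w_low, excluded = ENCODINGS[name]
    values = set()
    for h in (0, 1):
        for l in (0, 1):
            if (h, l) not in excluded:
                values.add(w_high * h + w_low * l)
    return values

for name in ENCODINGS:
    print(name, sorted(digit_set(name)))
```

Under these assumptions the lower bit of digit i and the higher bit of digit i-1 carry the same absolute weight, which is the property that equal-weight grouping relies on.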
In signed-digit representations we refer to the higher significant bit, $x_i^h$, as the polarity bit and the lower significant bit, $x_i^l$, as the magnitude bit. The chosen encoding of both signed-digit and carry-save redundant digits ensures that $x_i^l$ and $x_{i-1}^h$ have the same weight, i.e., the digits $\hat{x}_i$ and $\hat{x}_{i-1}$ overlap each other. This overlap can be exploited to reduce the range of digit sums that must be generated, and to predict the range of an incoming carry when two numbers are added.

For example, if two CS3 numbers were added digit by digit without regrouping, the digit sum, $\theta_i = \hat{x}_i + \hat{y}_i$, would be in the range $0 \le \theta_i \le 6$. This must be expressed as a final sum $0 \le z_i \le 3$ (assuming the output format is the same as the input, i.e., CS3) and a carry-out, $c_i$, which may be larger than 1. If EWG digits are added instead, the digit sum $\theta_i = x_i^l + y_i^l + x_{i-1}^h + y_{i-1}^h$ is in the range $0 \le \theta_i \le 4$. This is still expressed with a final sum of $0 \le z_i \le 3$, but the carry-out, $c_i$, will be at most 1. As a result, the number of values needed for the carry-out is reduced.

Another benefit of working with bits originally from distinct digits arises when considering digit sets which exclude some bit patterns, as in CS2 or SD. In these cases, the higher-significant bits from the less-significant digits, $x_{i-1}^h$ and $y_{i-1}^h$, provide some information about what range the less-significant digit sum, $\theta_{i-1}$, is in and therefore the range of the incoming carry. Note however that in these cases, the range of the digit sum is not affected by the equal-weight grouping.
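The effect of equal-weight grouping on the range of the CS3 digit sum can be confirmed by enumeration. The following sketch assumes the carry-save encoding in which a digit equals twice its higher bit plus its lower bit; the variable names are illustrative.

```python
from itertools import product

BITS = (0, 1)

# Whole-digit sums: both bits of x_i are added to both bits of y_i.
whole_digit_range = {
    (2 * xh + xl) + (2 * yh + yl)
    for xh, xl, yh, yl in product(BITS, repeat=4)
}

# EWG sums: the lower bits of position i are grouped with the higher bits of
# position i-1, since all four bits have the same weight.
ewg_range = {
    xl + yl + xh_prev + yh_prev
    for xl, yl, xh_prev, yh_prev in product(BITS, repeat=4)
}

print(sorted(whole_digit_range))
print(sorted(ewg_range))
```

The whole-digit sum spans seven values while the EWG sum spans five, which is why a smaller carry set suffices after grouping.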
Partially Redundant Representations
While the fixed delay for constant-time addition is minimized when the output is fully redundant, other possibilities exist that address different design constraints (such as area or power). For example, a fully redundant output requires twice as many bit-lines as a non-redundant output. To reduce the number of bit-lines, the number of redundant output digits can be reduced. For the signed-digit family such a framework was illustrated in [7], and a similar framework exists for the carry-save family [8].
In general, two operands, X and Y, with redundant digits at arbitrary positions can be added to produce an output, Z, with redundant digits at positions completely unrelated to the redundant digit positions in either X or Y. It can be shown that such addition and simultaneous format conversion is possible in constant time, independent of the word-length [2, 3, 4, 7, 8]. Obviously the right context L depends on the specific operand formats in question. It can be verified that the worst-case delay (i.e., longest context or look-back) occurs when all of the digits in both operands X and Y are redundant and only some digits of the output Z are redundant. As shown in Figure 1, the critical path delay of such constant-time addition and simultaneous format conversion depends mainly on the distance between redundant digits in the output. It can be shown that in all cases, the context that is sufficient to generate the correct intermediate sum and carry-out, $c_q$, of the q-th redundant digit position includes all radix-2 digits up to (but not including) the $(q-2)$-nd redundant digit, irrespective of whether or not the redundant digits are uniformly spaced. In other words, the context, L, now spans up to two larger groups or "super-digits". It may be possible to look at only the upper few digits of the previous group, thereby shortening L and the critical path. However, the critical path is still much longer than that achieved by the EWG scheme.
Given this framework for constant-time addition with and without simultaneous format conversion, we now consider the specific cases of the redundant radix-2 number systems listed in Table 1 and identify the ones that lead to efficient implementations. In Sections 2 and 3, we discuss addition of SD numbers with conversion to an output format of SD k and without such a conversion, respectively. In Sections 4 and 5 we consider SD3± addition with and without format conversion to SD3± k, respectively. Addition of CS2 and CS3 numbers, with and without format conversion, is discussed in Sections 6, 7, 8 and 9. Section 10 compares the ten previously presented number systems, and in Section 11 we show implementations of several adder cells and present the corresponding cell delays. Section 12 discusses some theoretical issues and final conclusions are presented in Section 13.
Once again, it should be noted that the uniform distance between redundant digits in the partially redundant formats considered in the following sections is only for the sake of illustration. The ensuing analysis and results are general and apply even when the distance between redundant digits in the output is nonuniform.
SD Addition with Format Conversion
The operation under consideration expresses its output in a partially redundant form. The two input operands, X and Y, are in the conventional signed-digit format with a digit set of $D(\mathrm{SD}) = \{-1, 0, 1\}$. The output, Z, is expressed in the SD k format where every k-th digit ($k > 1$) is a member of D(SD) and the remaining digits are non-redundant bits. Note that because of the D(SD) digit set encoding, the bit pattern $(x_i^h, x_i^l) = (1, 0)$ never occurs, and it will be shown below that a carry value of $-2$ is never needed.

Consider the r-th position ($0 \le r \le k-2$) which has a non-redundant output and a weight of $2^r$. The four input bits which have the weight of $2^r$ are $(x_r^l, y_r^l, x_{r-1}^h, y_{r-1}^h)$. Let their sum, with each polarity bit contributing negatively, be denoted by $\theta_r = x_r^l + y_r^l - x_{r-1}^h - y_{r-1}^h$, where $-2 \le \theta_r \le +2$. The final output bit $z_r$ must satisfy the basic carry relationship stated in (1). Using the definition of $\theta_r$, this becomes

$$2 \cdot c_r + z_r = \theta_r + c_{r-1} \qquad (6)$$

It would appear that since $c_{r-1}$ can take any of three values, $\{-1, 0, 1\}$, the sum $\theta_r + c_{r-1}$ is in the range $[-3, +3]$. If the value $-3$ did occur, it would have to be expressed as $(-4 + 1)$, which implies that a carry-out of $c_r = -2$ is needed. Fortunately, this situation never arises. In other words, if $\theta_r = -2$ then $c_{r-1} \ge 0$, i.e., the incoming carry cannot be negative. In fact, a stronger result holds (Theorem 1). This result in effect shows that $\theta_i + c_{i-1} \ge -2$, or in other words it will never be $-3$, thereby obviating the need for the carry value $-2$. Once this is established, the rules of operation at the unsigned (non-redundant) digit position are straightforward and follow directly from (6) with $z_r \in \{0, 1\}$:

$$z_r = (\theta_r + c_{r-1}) \bmod 2, \qquad c_r = \frac{\theta_r + c_{r-1} - z_r}{2} \qquad (7)$$
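The claim that a carry of -2 is never needed can be checked by brute force. The sketch below makes several simplifying assumptions that are not taken from the paper: every output position is treated as non-redundant (so carries simply ripple from position 0 with an initial carry of 0), SD digits are encoded as value = -2h + l with the pattern (h, l) = (1, 0) excluded, and all function names are hypothetical.

```python
from itertools import product

# Assumed SD encoding: digit -> (polarity bit h, magnitude bit l), value = -2*h + l.
SD_BITS = {0: (0, 0), 1: (0, 1), -1: (1, 1)}

def ewg_digit_sums(x, y):
    """EWG sums theta_0..theta_n for SD digit lists x, y (least significant first)."""
    n = len(x)
    xb = [SD_BITS[d] for d in x]
    yb = [SD_BITS[d] for d in y]
    thetas = []
    for i in range(n + 1):
        magnitude = (xb[i][1] + yb[i][1]) if i < n else 0
        polarity = (xb[i - 1][0] + yb[i - 1][0]) if i > 0 else 0
        thetas.append(magnitude - polarity)  # polarity bits carry weight -1
    return thetas

def ripple(thetas):
    """Apply 2*c + z = theta + c_in with z in {0, 1} at every position."""
    carry, bits, totals = 0, [], []
    for theta in thetas:
        total = theta + carry
        z = total % 2              # Python's % yields 0 or 1 even for negative totals
        carry = (total - z) // 2
        bits.append(z)
        totals.append(total)
    return bits, carry, totals

def check_all(n):
    """Verify that theta + carry-in never reaches -3 and that the value is preserved."""
    for x in product((-1, 0, 1), repeat=n):
        for y in product((-1, 0, 1), repeat=n):
            bits, carry, totals = ripple(ewg_digit_sums(x, y))
            expected = sum((a + b) * 2 ** i for i, (a, b) in enumerate(zip(x, y)))
            got = sum(b << i for i, b in enumerate(bits)) + carry * 2 ** (n + 1)
            if min(totals) < -2 or carry not in (-1, 0, 1) or got != expected:
                return False
    return True
```

Calling `check_all(4)` exercises all 6561 operand pairs of four digits; under the stated assumptions no position ever sees a total of -3.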
Next, we consider a position which has a redundant output digit. This position can generate the carry-out by looking only at the bits of the previous position. Note that Theorem 1 applies regardless of whether the output is in redundant format. Also, since the output digit is redundant it can assume a value of $-1$, which allows multiple ways of expressing an output of $\pm 1$. Table 2 summarizes the rules for constant-time addition and simultaneous format conversion at a redundant output position. Note that the intermediate sum $\sigma_i$ is determined so that for any possible incoming carry $c_{i-1}$, there will never be a new outgoing carry generated when calculating the final sum $z_i = \sigma_i + c_{i-1}$. This is a result of Theorem 1, and the rules shown in Table 2 are in fact identical to the case where the operation under consideration is SD + SD → SD, without format conversion.
We would like to point out that without the equal-weight grouping, which results in "exporting" the polarity bits from the previous digit, any format conversion during addition becomes significantly more complex. It can be verified that without EWG the carry set needed becomes $\{-2, -1, 0, 1\}$, which is larger than the carry set of the EWG scheme. Worse yet, the look-back required to determine the carry-out at every redundant position is much longer since a carry of value $-2$ greatly complicates the rules (because $-2$ is not an allowed output digit). It can be shown that in this case a look-back of length $2k-1$ radix-2 digit positions is sufficient to generate the correct intermediate sum and carry at each redundant output position.
SD Addition without Format Conversion
The operation under consideration expresses its output in a fully redundant form. The two input operands, X and Y, as well as the output, Z, are in the conventional signed-digit format where the digit set is $D(\mathrm{SD}) = \{-1, 0, 1\}$. Considering EWG digits, the four bits that contribute to the digit-wise sum of the operands, $\theta_i$, are $x_i^l$, $y_i^l$, each with a weight of $+1$, and $x_{i-1}^h$, $y_{i-1}^h$, each with a weight of $-1$. As a result, $\theta_i$ is in the range $[-2, +2]$. It can be verified that the carry set $C(\mathrm{SD}) = \{-1, 0, 1\}$ is sufficient in this case. The rules for this constant-time addition without format conversion are summarized in Table 3.

[Table 3: Rules for constant-time addition SD + SD → SD. The symbol ∨ denotes the "OR" function.]

Table 3 shows the only allowable $(c_i, \sigma_i)$ combinations for expressing the digit sum $\theta_i \in \{-2, 0, 2\}$. There are multiple ways of expressing digit sum $\theta_i \in \{-1, 1\}$, and the rules in Table 3 are justified by the following observations. For $\theta_i$ to equal $-1$, at least one of the polarity bits must equal 1. In this case, the carry-in satisfies $c_{i-1} \in \{0, 1\}$ as proved in Lemma 1 below. The EWG digit sum $\theta_i$ can equal $+1$ in the following two ways.

Table 3 can be thought of as simpler than the corresponding table(s) in other SD addition schemes proposed so far. For instance, there are more don't cares in Table 3 than in the corresponding table(s) from [16] and its derivatives. This may lead to a simplification of switching expressions and hence the implementation. The fundamental difference is that for schemes which do not use equal-weight grouping, it is necessary to look back at the previous digit position when $\theta_i = -1$ as well as when $\theta_i = +1$.
SD3± Addition with Format Conversion
Two closely related types of redundant digit representations are considered in this section: SD3− and SD3+. Again, this operation expresses its output in a partially redundant form. For SD3− each redundant digit is in the digit set $D(\mathrm{SD3}^-) = \{-2, -1, 0, 1\}$ and for SD3+ the digit set $D(\mathrm{SD3}^+) = \{-1, 0, 1, 2\}$ is used.
In SD3− it can be verified that the carry set $C(\mathrm{SD3}^-\,k) = \{-2, -1, 0, 1\}$ is sufficient.
Next, consider a position with redundant output which can assume any value in the range $[-2, +1]$. The rules to generate the intermediate sum and carry-out are summarized in Table 4. From the third and fifth columns of the table, it is seen that $\sigma_i + c_{i-1}$ is always in the range $[-2, +1]$. This shows that the second constant-time addition step will never generate a carry when determining the final sum. Note that in this case, the carry set, $C(\mathrm{SD3}^-\,k)$, and the destination digit set, $D(\mathrm{SD3}^-)$, are identical. Therefore leaving behind an intermediate sum of $\sigma_i = 0$ is always safe. As mentioned earlier, if the source and destination digit set is $D(\mathrm{SD3}^+) = \{-1, 0, 1, 2\}$ instead of $D(\mathrm{SD3}^-)$, the polarity bits should be assigned a positive weight and the magnitude bits a negative weight.
Once again, all four bits (two magnitude and two polarity bits) of the same weight can be grouped together so the digit sum, $\theta_i$, is in the range $[-2, +2]$. It can be verified that the carry set $C(\mathrm{SD3}^+\,k) = \{-2, -1, 0, 1\}$ is sufficient. The rules for addition at non-redundant positions are again summarized by (8). The rules for a redundant position are similar to those in Table 4 and are omitted for the sake of brevity (please refer to the technical report [19] for details).
For both digit sets, if the equal-weight grouping method is not employed, the operation (SD3± + SD3± → SD3± k) requires a larger carry set and longer context than the corresponding case when equal-weight grouping is employed.
SD3± Addition without Format Conversion
Again both SD3− and SD3+ will be considered for constant-time addition, but without any format conversion. First consider the digit set $D(\mathrm{SD3}^-) = \{-2, -1, 0, 1\}$. Like the SD3± k case, the digit sum, $\theta_i$, is in the range $[-2, +2]$. It can be shown that the carry set $C(\mathrm{SD3}^-)$
Here, 0 and $+1$ are safe digits to leave behind as an intermediate sum. It is clear from (9) and (10)
CS2 Addition with Format Conversion
This section considers CS2 constant-time addition with format conversion. The digit set at each redundant position is $D(\mathrm{CS2}) = \{0, 1, 2\}$, and as mentioned earlier, the encoding prevents the bit combination $(x_i^h, x_i^l) = (1, 1)$ from occurring. The following lemma is essential in determining the carry set required for this case.
Lemma 2. For EWG addition which uses the CS2 encoding, if x
Next consider a redundant output digit position which can determine the range of an incoming carry by examining the previous digit sum, $\theta_{i-1}$. These rules are shown in Table 5(a). The only apparent abnormality is that an intermediate sum of $\sigma_i = -1$ is allowed, which is not a valid final sum. However, this only occurs when a positive carry-in ($c_{i-1} > 0$) is guaranteed, according to (11). This is simply a matter of notation in order to make the table consistent with the relationship $2 \cdot c_i + \sigma_i = \theta_i$.
CS2 Addition without Format Conversion
Although the rules from Table 5(a) for CS2 k addition at a redundant position apply when there is no format conversion, they are based on the assumption that the incoming carry comes from a non-redundant position. If the previous position is also redundant, it has a larger capacity which could limit the range of its carry-outs. This is possible if the carry-relationship (13) is satisfied. It can be shown that (13) is always satisfied, meaning the carry set $C(\mathrm{CS2}) = \{0, 1\}$ is sufficient. Given this, the rules for CS2 addition without any format conversion can be simplified as shown in Table 5(b).
Note that there is no need to look back at any previous digits; in other words, the look-back is $L = 0$.
CS3 Addition with Format Conversion
Here, the redundant digits can take any value from the digit set $D(\mathrm{CS3}) = \{0, 1, 2, 3\}$. It can be verified that the carry set needed for CS3 addition with format conversion is $C(\mathrm{CS3}\,k) = \{0, 1, 2, 3\}$. Given this carry set, the rules for CS3 addition with format conversion for a redundant position are shown in Table 6(a).
CS3 Addition without Format Conversion
Here, every output digit position is redundant and can assume any value in $D(\mathrm{CS3}) = \{0, 1, 2, 3\}$. Since 3 is an allowable digit, the carry-relationship

$$\theta_i^{\max} + c_{i-1}^{\max} \le z_i^{\max} + 2 \cdot c_i^{\max} \qquad (15)$$

simplifies to $c_i^{\max} \ge 1$ (assuming $c_{i-1}^{\max} = c_i^{\max}$), because $\theta_i^{\max} = 4$ after EWG and $z_i^{\max} = 3$. This makes the carry set $C(\mathrm{CS3}) = \{0, 1\}$ sufficient for CS3 addition without format conversion. The rules for determining $\sigma_i$ and $c_i$ are given in Table 6(b). Again, they are stated only in terms of $\theta_i$, without any dependency on the previous group sum, which makes the look-back $L = 0$.
Comparison
Table 7 gives a summary of the look-back distances, L, and carry sets needed for the ten types of redundant binary addition considered. The table clearly shows that equal-weight grouping can lead to smaller carry sets and a smaller look-back. The longest carry propagation path increases with both the right context and the distance between redundant digits. Consequently, the smallest critical path delay of an implementation can be expected under the following conditions.
(i) The look-back, L, is minimized.
(ii) The distance to the closest higher-order redundant digit is minimized, which happens when all output digits are redundant.

[Table 7: Comparison of radix-2 constant-time addition techniques.]

Table 7 shows that the minimum look-back occurs only when the proposed equal-weight grouping is employed. Among the cases with zero look-back, those with the smallest carry set should be selected, since a smaller carry set usually implies less complex logic, which should translate into smaller area and critical path delay. Applying these criteria, it is seen that the CS2 and CS3 representations (i.e., the carry-save representations) are more likely to result in better designs than the signed-digit representations. When format conversions are considered, the minimally redundant (CS2 and SD) representations outperform their overly redundant counterparts (CS3, SD3±) in terms of the carry set needed.
Format conversions can be highly effective for Area × Delay (A × T) efficient multiplier designs. For instance, in [20] it was shown that multipliers based on SD k with $k = 2$ have a lower A × T product than those based on the full SD representation of [16]. It turns out that adding partial products (which are in two's-complement format) to directly generate outputs in this SD k format is costly in terms of area and delay. A better approach is to add the partial products and generate outputs in SD format at the top level of the partial product accumulation tree. At the next level of the tree, the SD + SD → SD k format conversion can be carried out during the addition.
Format conversions are also useful if there is a need to gradually introduce or remove redundancy in number representations. Note that by controlling the number and placement of the redundant digits, one can cover the entire spectrum of representations from two's complement (no redundant digits) to fully redundant (such as SD or CS where all digits are redundant).
Table 7 compares the various representations at an abstract level, in terms of carry set size and look-back L. While this comparison can provide a good high-level assessment, actual VLSI implementations are necessary to gauge the relative merits and disadvantages of the various redundant representations. In the next section we present simulation results from the VLSI layouts of several adder cells.
Implementation
In order to verify some of the comparison results included in Table 7 , we designed, laid out and simulated adder cells for the following cases.
(i) SD + SD → SD: The cell in [16] is the most efficient to the best of our knowledge, so we laid out this cell.
(ii) SD3− + SD3− → SD3−: Shown in Figure 3(a).
(iii) CS2 + CS2 → CS2: Shown in Figure 3(b).
(iv) CS3 + CS3 → CS3: This is nothing but a 4:2 compressor employed in conventional multipliers. The 4:2 compressor presented in [21] is extremely efficient and hence we laid out this compressor.
For the sake of brevity, the gate diagrams and details of cells (i) and (iv) are omitted; they can be found in the references cited. Cells (ii) and (iii) were newly designed and their gate diagrams are shown in Figure 3(a) and Figure 3(b), respectively. In both figures, it is seen that the carry-out is generated based only on the bits of the current group, i.e., there is no look-back. Layouts of all four cells were simulated in the TSMC SCN025 0.25 micron technology process (available from MOSIS) with a 2.5 volt supply. The designs were first verified at the logic level. Berkeley SPICE 3f5 was used to estimate the critical path delay of each cell, which included appropriate fan-in and fan-out loading for all components. The results are summarized in Table 8. The relative order of the simulated critical path delay agrees with the results from the cost estimate procedure described in [1] (excluding CS2, which [1] does not cover).
It should be noted that the SPICE simulation results are highly layout dependent. These layouts were done to get some idea of the relative comparison of the various redundant adder cells. The CS2 and SD3− cells in particular could be made more compact, which might have a significant impact on the overall delay. In any case, the critical path simulations clearly demonstrate that the carry-save representations considered here lead to faster implementations.

[Table 8: Critical path delay through one cell from SPICE simulations.]
Discussion
The practical implication of the results presented above is that the CS3 representation along with the 4:2 compressor is the most efficient way to execute constant-time addition. In light of this, for a multiply operation it can be seen that using the CS3 representation with the compressor presented in [21] is likely to yield the fastest implementations (faster and smaller than those based on SD or CS2 representations using cells (i) and (iii) mentioned in Section 11). This can be inferred for the following reasons.
(a) Converting partial products from two's complement format to CS3 format is trivial; it requires no logic gates at all. Merely grouping the bits of the input operands appropriately leads to a valid CS3 representation of the output. For example, for input operands X and Y, grouping bit $x_i$ with bit $y_{i-1}$ creates a valid CS3 digit.
In contrast, if the CS2 or conventional SD representation is employed, two's complement partial products must be added to generate outputs in their respective formats. In each of these cases, a small delay worth about one full adder is required to achieve this conversion [7, 16, 20]. In effect, multipliers based on CS2 or SD intermediate representations must endure an additional (albeit small) delay at the top level.
(b) The 4:2 compressor that performs CS3 + CS3 → CS3 is smaller and faster than other cells.
These two factors, (a) and (b), together imply that multipliers based on CS3 can be expected to outperform multipliers based on other redundant representations.
There is a more fundamental reason for the superiority of the 4:2 compressor. Note that for both the SD and CS2 adder cells, the digit sums are from an input digit set of cardinality 5. This corresponds to digit sums in the range $[-2, 2]$ and $[0, 4]$ for SD and CS2, respectively. This is true regardless of whether or not EWG is employed for these representations. The cardinality of the output digit set in both cases is 3, since valid output digits are in the range $[-1, 1]$ and $[0, 2]$ for SD and CS2, respectively. Thus redundant addition based on these representations converts an operand from an input digit set having cardinality 5 to a result from an output digit set of cardinality 3.
Note that in CS3 addition, after EWG, the digit sums are from an input digit set of cardinality 5 (digit sums in the range $[0, 4]$). The output is also in CS3 and therefore the cardinality of the output digit set is 4.
It is intuitively clear that converting an input digit set of cardinality 5 into an output digit set of cardinality 3 is a harder task than converting it into an output digit set with cardinality 4. Therefore, cells such as (i) and (iii) from Section 11 are fundamentally more complex, hence bigger and slower.
In fact Akoi et al. [22] recently proposed a clever method to employ a 4:2 compressor-like cell to execute constant-time SD addition by using the borrow-save encoding ($x_i = x_i^h - x_i^l$). In effect their method employs a 4:2 compressor to perform SD + SD → SD3−, that is, a digit set of cardinality 5 gets converted to a digit set of cardinality 4. Since the SD3− output is a weighted encoding, EWG on the output is used to retrieve the original borrow-save encoding without any extra logic.
In closing, we show the relationship of this work to the results presented in [1] . The examples of constant-time addition without format conversion that we have described can be re-written using the notation from [1] as shown below.
The notation shows that the sum of the digit sets to the right of the decomposition operator is expressed using the digit sets to the left of the operator. A digit set is characterized by its diminished cardinality, $\delta$, and negative offset from zero, $\omega$. This represents digits in the range $[-\omega, -\omega + \delta]$ and must include 0.
Further details regarding the notation and decompositions can be found in [1] .
The analysis in [1] requires that the total diminished cardinality to the left of the decomposition operator, $\delta_{out}$, be greater than or equal to the total diminished cardinality of the right side, $\delta_{in}$. The condition that $\delta_{out} \ge \delta_{in}$ is satisfied in all cases above except CS2, where $\delta_{out} = 4$ and $\delta_{in} = 5$. Therefore, CS2 addition presented in this paper lies outside the framework developed in [1].
The hardware cost estimation approach from [1] can be applied to all cases except CS2, with the results shown in Table 9, where Carry Generator is abbreviated CG, and Partial Adder PA. The number of Carry Generators listed below is the total number of "redundant" and "non-redundant" carry generators from [1]. As mentioned earlier, the ranking of the totals listed here agrees with our measurements shown in Table 8.

[Table 9: Cost estimates from [1].]

When constant-time addition with simultaneous format conversion is considered, the methodology from [1] cannot be applied to non-redundant positions of the SD k and CS2 k formats, since $\delta_{out} < \delta_{in}$. Overall, we have presented some "non-obvious" cases of addition that the theory from [1] does not permit.
Conclusion
This paper presents a comprehensive analysis of constant-time addition and simultaneous format conversion, where the source and destination digit sets are based on binary redundant numbers. We investigated encodings that enable "equal-weight grouping" (EWG), wherein bits having the same weight are grouped together during the constant-time addition operation. The analysis and data show that EWG leads to smaller carry sets and a shorter context or look-back. These in turn lead to efficient implementations for constant-time addition and simultaneous format conversion of redundant numbers based on the carry-save (CS) and signed-digit (SD) representations. We compared VLSI implementations of various cells to perform constant-time addition and demonstrated that the conventional 4:2 compressor is the most efficient way to execute constant-time addition. Practical implications of this work are immediate and were illustrated via a comparison of multiplier implementations. We explored the fundamental issues underlying constant-time addition and indicated the reasons which render the 4:2 compressor the most efficient way to implement constant-time addition. We also presented some interesting connections to the results from [1].
Possible future work includes finding redundancy metrics which capture the complexity of hardware implementations based on the redundant format under consideration without the need to go through VLSI implementations. Another issue is to extend the necessary and sufficient conditions for constant-time addition derived in [2] to the case where the digit sets at all digit positions are not identical. Such a framework allows for arbitrary spacing of redundant digit positions throughout a representation, as well as the ability to vary the types of redundant digits used. It is conceivable that examples of situations where both left and right contexts are required could arise in such cases. Since digit sets could be radically different from one digit position to the next, it is possible that each position would also need to examine its left context in order to select the appropriate or acceptable carry-out value. 
