This paper presents a novel approach for theoretical estimation of power consumption in digital binary adders. Closed-form expressions for the power consumption of four different types of binary adders (the ripple-carry adder, the Manchester adder, a multiplexor-based carry-select adder and an efficient tree-based look-ahead adder) are derived in terms of word length and pre-computed technology-specific energy parameters. These expressions are verified to be accurate to within 1-5% by simulation using the HEAT tool.
INTRODUCTION
This paper proposes a novel theoretical approach for estimation of power dissipation of four different types of binary adders: the ripple-carry adder, the Manchester adder, a multiplexor-based carry-select adder and an efficient tree-based look-ahead adder. Although power consumption of binary adders has been compared by simulations [10], no theoretical method for estimation of power consumption has been presented thus far. To the best of the authors' knowledge, the proposed approach is the first systematic technique for theoretical estimation of power consumption in binary adders.
In this process, power consumption formulations are expressed in terms of the word length and technology-dependent energy parameters. As a first step, component cells are identified and characterized according to input and output transitions e.g., in the case of the ripple-carry adder, carry and sum transitions in the context of full-adder cells present a convenient level of abstraction. The energies associated with these transitions are extracted using SPICE. The analytical aspect of this method proceeds with the determination of probabilities related to the propagation or termination of transition stimuli.
The derivation of these solutions develops from a temporally-discretized model employing element-level abstraction. The formulations are arranged as either closed-form expressions or trivial one-loop procedures. Yet, they attain a level of accuracy approaching that of full-scale simulation. Computational requirements are dramatically minimized with this methodology. Therefore, these formulations are indicated in lieu of simulation whenever the binary adder is selected as the principal target of optimization.
SPECIFICATION OF THE MODEL
The convenient formulations produced by this method are derived from a general model which employs element-level abstraction to avoid technology-specific detail. The only acknowledgment of the underlying technology comes in the form of a series of trivial simulations from which the energies accompanying particular transitions are extracted. Transitions with homogeneous energy consumption and I/O traits are amalgamated into transition classes. The energies associated with these classes appear directly in the final expressions.
Discretizing the computation period imposes conceptual order on the carry-propagation process by aligning switching activity with well-defined temporal references. To abate complexity, distinct phases of operation in which elements share similar properties are determined.
Each structure is statistically scrutinized to obtain interdependent transition probabilities for each unit. In order to mitigate complexity, synchronous switching and a uniform distribution of signal levels are assumed at the inputs of the structure. These requirements do not diminish the validity of relative comparisons of architectures that do not conform to these assumptions.
The hallmark process of the binary adder is propagation. Understanding of it is aided by three conceptual devices. The first, the hold-state cell, is defined as any cell which blocks the transmission of switching activity from the preceding to the succeeding bit position. By contrast, the carry-state cell always relays such activity. Finally, a carry chain is formed by the advent of a contiguous series of carry-state cells. In addition, it will be necessary to distinguish between carry-state and hold-state cells in the current computation period and the vestigial carry-state and hold-state cells from the previous period whose propagation-stabilized outputs are extant at the beginning of the new period.
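The three devices above can be illustrated with a short sketch. It assumes the usual full-adder condition that a cell relays its carry input exactly when its two operand bits differ; the function names `classify_cells` and `carry_chains` are hypothetical and do not come from the paper, and the special treatment of the least-significant cell is not modeled.

```python
from typing import List, Tuple


def classify_cells(x: int, y: int, width: int) -> List[str]:
    """Label each bit position, least-significant first.

    Assumption: a cell is in the carry state when x_i != y_i (its carry
    output follows its carry input) and in the hold state otherwise.
    """
    return ["carry" if ((x >> i) ^ (y >> i)) & 1 else "hold"
            for i in range(width)]


def carry_chains(states: List[str]) -> List[Tuple[int, int]]:
    """Return (start_position, length) for every maximal run of carry-state cells."""
    chains = []
    start = None
    for i, state in enumerate(states):
        if state == "carry":
            if start is None:
                start = i          # a new chain begins here
        elif start is not None:
            chains.append((start, i - start))
            start = None
    if start is not None:          # chain running up to the most-significant cell
        chains.append((start, len(states) - start))
    return chains


states = classify_cells(0b0110, 0b1010, 4)
print(states, carry_chains(states))
```

For the operands shown, positions 0 and 1 hold while positions 2 and 3 form a single carry chain of length 2.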
RIPPLE-CARRY ADDER
Of all binary adder variants, none is more evocative of the least-significant to most-significant sequential pen-and-paper tallying process than the ripple-carry adder illustrated in Figure 1. The full-adder cell is considered elemental for this architecture. It is found that transition occurrences on this element can be categorized into four classes expending energies of $e_c$, $e_s$, $e_{cs}$ and $e_n$, where the subscript, with the exception of $n$ which refers to the case with no output transition, denotes the output(s) which switched. The ripple-carry computation period can be decomposed into two phases: the generation phase and the propagation phase. The one-time-slot generation phase is initiated immediately upon the start of a new computation period, while the propagation phase spans the remainder of the computation period. In the generation phase either or both of the external $x$ and $y$ inputs may change while the $c_{in}$ inputs remain stable. The propagation phase, however, permits only the $c_{in}$ inputs to vary. The least-significant cell is an exception, since all three inputs may change in the generation phase and none thereafter.

0-7803-4455-3/98/$10.00 © 1998 IEEE
Consider the addition of two $W$-bit words. In the generation phase, the probability of each of the four transition types occurring at the least-significant cell position, given a uniform distribution on the inputs, can be found by inspection of the full-adder truth table. By inspection of the abbreviated set of permissible transitions associated with an invariant carry input, the energy expended in the remaining $W - 1$ cells is found to be

[equation missing in source]

In the propagation phase, hold-state cells as well as the least-significant cell generate a switching disturbance with probability $\frac{1}{2}$. For carry-state cells, a switching disturbance will only be instigated if the vestigial cell in the same position assumed the hold state. Overall, the probability of switching is $\frac{1}{4}$. It can be demonstrated that only transitions with associated energy $e_{cs}$ and $e_s$ can occur in carry-state cells and hold-state cells, respectively.
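The "by inspection" step for the generation phase can be reproduced by enumerating the full-adder truth table. The sketch below assumes equally likely before and after input states and counts a transition with no input change as class `n`; the paper's own tabulation may treat that case differently. The function names and the energy dictionary are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def full_adder(x, y, c):
    """Return (sum, carry_out) of a one-bit full adder."""
    return x ^ y ^ c, (x & y) | (c & (x ^ y))


def transition_class(before, after):
    """Name the outputs that switch: 'c', 's', 'cs', or 'n' for none."""
    s0, c0 = full_adder(*before)
    s1, c1 = full_adder(*after)
    if s0 != s1 and c0 != c1:
        return "cs"
    if c0 != c1:
        return "c"
    if s0 != s1:
        return "s"
    return "n"


def class_probabilities(carry_fixed):
    """Class probabilities under uniform inputs.

    carry_fixed=False models the least-significant cell (all three inputs
    may change); carry_fixed=True models the remaining cells, whose carry
    input is invariant during the generation phase.
    """
    counts = Counter()
    states = list(product((0, 1), repeat=3))
    for before, after in product(states, repeat=2):
        if carry_fixed and before[2] != after[2]:
            continue
        counts[transition_class(before, after)] += 1
    total = sum(counts.values())
    return {k: Fraction(counts[k], total) for k in ("c", "s", "cs", "n")}


def expected_energy(probabilities, energies):
    """Average energy per cell: sum of class probability times class energy."""
    return sum(probabilities[k] * energies[k] for k in probabilities)


print(class_probabilities(carry_fixed=False))
print(class_probabilities(carry_fixed=True))
```

Multiplying the second set of probabilities by the SPICE-extracted class energies and by $W - 1$ gives a generation-phase estimate for the cells above the least-significant position under these assumptions.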
Given the switching probabilities above and the understanding that input toggling will be transmitted to the terminus of the chain, it follows directly that a carry chain of length $v$ preceded by a hold-state or least-significant cell must induce

[equation missing in source]

switching instances on the cell succeeding the chain. Recalling the allowed transition energies and accounting for carry chains ranging from length $0$ to $p - 2$ for every active position $p$, the total energy over all active cells is found to be

[equation missing in source]

Summing the generation and propagation phase expressions results in the complete expression for the ripple-carry adder, which is

[equation missing in source]

Table 1. [caption and contents missing in source]
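The temporally-discretized model can also be exercised numerically as a cross-check. The following is a minimal unit-delay sketch, not the HEAT tool and not the paper's closed form: every cell is assumed to respond one time slot after its inputs change, and an event is counted whenever a cell's inputs change, classified by which outputs switch. The names `simulate_period` and `average_energy` and any energy values passed in are hypothetical.

```python
import random
from collections import Counter


def full_adder(x, y, c):
    """Return (sum, carry_out) of a one-bit full adder."""
    return x ^ y ^ c, (x & y) | (c & (x ^ y))


def simulate_period(old, new, width):
    """Count transition classes for one computation period.

    old and new are (x, y, carry_in) triples. Returns (counts, result),
    where result is the settled output including the final carry.
    """
    (x0, y0, c0), (x1, y1, c1) = old, new
    bit = lambda v, i: (v >> i) & 1
    # Stable (vestigial) outputs left over from the previous period.
    carry, total = [c0], []
    for i in range(width):
        s, c = full_adder(bit(x0, i), bit(y0, i), carry[i])
        total.append(s)
        carry.append(c)
    counts = Counter()
    # Generation slot: cells whose external inputs change.
    active = {i for i in range(width)
              if bit(x0, i) != bit(x1, i) or bit(y0, i) != bit(y1, i)}
    if c0 != c1:
        active.add(0)
    carry[0] = c1
    while active:
        seen = list(carry)            # carries visible at the start of this slot
        next_active = set()
        for i in sorted(active):
            s, c = full_adder(bit(x1, i), bit(y1, i), seen[i])
            ds, dc = s != total[i], c != carry[i + 1]
            counts["cs" if ds and dc else "c" if dc else "s" if ds else "n"] += 1
            total[i] = s
            if dc:
                carry[i + 1] = c
                if i + 1 < width:
                    next_active.add(i + 1)   # disturbance reaches the next cell
        active = next_active
    result = sum(b << i for i, b in enumerate(total)) + (carry[width] << width)
    return counts, result


def average_energy(width, energies, trials=2000, seed=0):
    """Monte Carlo average of energy per period under uniform random inputs."""
    rng = random.Random(seed)
    top = 1 << width
    acc = 0.0
    for _ in range(trials):
        old = (rng.randrange(top), rng.randrange(top), rng.randrange(2))
        new = (rng.randrange(top), rng.randrange(top), rng.randrange(2))
        counts, _ = simulate_period(old, new, width)
        acc += sum(energies[k] * n for k, n in counts.items())
    return acc / trials


counts, result = simulate_period((0, 0, 0), (0b0111, 0b0001, 0), 4)
print(counts, result)
```

In the example, the carry generated at position 0 ripples through the two carry-state cells at positions 1 and 2 (two `cs` events) and is absorbed by the hold-state cell at position 3 (an `s` event), which matches the restriction of propagation-phase transitions to these two classes.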
MANCHESTER ADDER
The Manchester adder illustrated in Figure 2 is unique among the fast adders in the sense that it derives its speed advantage from performance improvements to its constituent cells, rather than a more efficient interconnection scheme. The hardware can be divided into three stages: the PG-cell stage, the Manchester-cell stage and the sum-cell stage. In the PG-cell stage, energies $e_{pg}$, $e_p$, $e_g$ and $e_n$ corresponding to changes on both, one, or none of the outputs are identified. In the Manchester stage, energies $e_{mcp}$, $e_{mp}$, $e_{mc}$ and $e_{mn}$ represent the dissipations associated with changes on the $p$ input and the $c$ output in a manner consistent with the previous notational schemes. Finally, the sum-cell stage is described by a single energy, $e_s$, associated with any change on the inputs. Due to similarities between this architecture and the previous, the propagation phase energy can be summarily written by replacing the transition-energy parameters in the ripple-carry result of the same phase. This leads to an energy of $(e_{mc} + e_{mn} + 2e_s)$. Following the practice outlined in the previous section, simulations were performed. Agreement to within better than 9% was observed as shown in Table 2.
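The parameter replacement described above can be written out explicitly. The sketch below is an inference from the text, not a formula taken from the source: it assumes that the ripple-carry propagation-phase result depends on the energies only through the full-adder parameters for carry-state and hold-state cells, written here as $e_{cs}^{\mathrm{FA}}$ and $e_{s}^{\mathrm{FA}}$, and that every propagating disturbance also triggers one sum-cell transition of energy $e_s$.

```latex
\begin{align*}
e_{cs}^{\mathrm{FA}} &\;\longrightarrow\; e_{mc} + e_s
  && \text{(carry output switches, sum cell switches)} \\
e_{s}^{\mathrm{FA}}  &\;\longrightarrow\; e_{mn} + e_s
  && \text{(carry output holds, sum cell switches)} \\
e_{cs}^{\mathrm{FA}} + e_{s}^{\mathrm{FA}} &\;\longrightarrow\; e_{mc} + e_{mn} + 2e_s
\end{align*}
```

Under these assumptions the last line reproduces the energy factor quoted in the text.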
CARRY-SELECT ADDER
A uniquely efficient implementation of the carry-select adder constructed entirely of multiplexors can be formed by exploiting the intrinsic degeneracy in the truth table of a full adder. The structure is illustrated in Figure 3, where four stages are visible. The details of this structure can be found in [12] and [7]. The "blocking" or propagation-isolation tactic employed by this scheme is typical of fast adders and will provide a convenient perspective for this analysis.
To begin, the re-mapping stage, which re-assigns a redundant input-output association, can be verified by inspection to consume energy

[equation missing in source]

where $W_b$ is the block length and $\sum_b W_b = W$. Here the subscripts of the energy parameters indicate the number of toggled outputs.
Contingency Stage
This structure achieves performance gains by pre-calculating alternative propagation chains. Although each chain is functionally equivalent to the ripple-carry process, the distribution on the carry input of the least-significant cell differs appreciably. In the generation phase, this results in a dissipation of

[equation missing in source]

where the three energy parameters represent the energy of an output transition only, an output transition accompanying a select-line change and a select-line change alone, respectively. In the propagation phase, the total energy consumption amounts to

[equation missing in source]

Thus, for both chains, the total required energy is, on average,

[equation missing in source]
Discriminator and Sum Stages
The third stage of the carry-select structure is responsible for selecting between the alternative outputs of the previous stage. Based on the pair-wise characterization of those outputs it can be shown that the number of transitions at position $p$ is

[equation missing in source]

where [definition missing in source] and $z$ is defined by the recursive relation

[equation missing in source]
Determining the switching activity of the first block as

[equation missing in source]
The figures produced by this formulation are consistent with simulation to within better than 5% as demonstrated in Table 3.
DGB ADDER
The basis of the fastest of the fast adders is the provably optimal binary-tree connection paradigm. The Brent-Kung adder [1], the Montoye adder [9], and the Dozza, Gaddoni and Baccarani (DGB) adder [4] which is analyzed here represent the three extremes of tree-based look-ahead adder design, with the first minimizing fan-out and component count, the second achieving both low latency and component count and the final design accomplishing the simultaneous optimization of fan-out and latency.
Three stages of this design are apparent in Figure 4: the propagate-generate (PG) logic, the historically-titled "o"-operator network, and the sum logic. In the PG stage it is found by inspection that the energy expended in this stage is

[equation missing in source]

where $e_N$ for $N \in \{0, 1, 2\}$ represents the energy dissipated when $N$ outputs make a transition and $e_b$ denotes the energy consumed by a buffer. The four-input "o"-operators produce switching energies of $e_p$, $e_g$, $e_{pg}$ and $e_n$, where subscripts $p$, $g$ and $pg$ signify transitions on the outputs with the same designation and $n$ denotes a lack of variation on both outputs. The three-input operators dissipate energy $e_c$ when an output change is induced and energy $e_{nc}$ when the output does not react. Since the PG cells dwell in the state with a high level on the $g$ output with probability $\frac{1}{4}$, the probability of a high level at depth $k$ in the "o"-operator network can be shown to obey the relation

[equation missing in source]

Consequently, the probabilities of transition events on the four-input cells can be written in terms of $P_g^{(k)}$ as shown in Table 4, where the subscript and superscript have been omitted for simplicity. Three-input cells consume energy $e_c$ with probability [value missing in source] and $e_{nc}$ with probability $2P - 3P^2$.
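The growth of the high-level probability with depth can be checked by brute force. The sketch below is not the paper's recurrence, which did not survive in the source. It assumes the standard Brent-Kung operator $(g, p) \circ (g', p') = (g + p\,g',\; p\,p')$, PG cells with $g = xy$ and $p = x \oplus y$, uniform independent inputs, and a plain balanced tree over $2^k$ positions rather than the DGB wiring. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def combine(hi, lo):
    """Brent-Kung operator: the more significant pair absorbs the less significant one."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def group_signal(bits):
    """Reduce (x, y) bit pairs, least-significant first, with a balanced tree.

    The number of pairs is assumed to be a power of two.
    """
    nodes = [(x & y, x ^ y) for x, y in bits]   # PG cells
    while len(nodes) > 1:
        nodes = [combine(nodes[i + 1], nodes[i]) for i in range(0, len(nodes), 2)]
    return nodes[0]


def generate_probability(depth):
    """Exact probability that the group-generate output at this depth is high."""
    n = 2 ** depth
    high = 0
    for flat in product((0, 1), repeat=2 * n):
        bits = list(zip(flat[0::2], flat[1::2]))
        high += group_signal(bits)[0]
    return Fraction(high, 4 ** n)


for k in range(3):
    print(k, generate_probability(k))
```

At depth 0 the result is the PG-cell value of 1/4, and the probability rises toward 1/2 as the depth increases, which is why the transition probabilities of the operator cells depend on their level in the network.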
Multiplying by the number of elements of each kind at every level of iteration and summing over all levels produces the expression of the energy required on average by the "o"-operator network as

[equation missing in source]
The total energy expended in the sum stage can be written by inspection as

[equation missing in source]

where $e_s$ and $e_{ns}$ indicate the energies corresponding to transitions on the sum-only adder cells, while $e_{sum}$, $e_{cout}$, $e_{cs}$ and $e_{ncs}$ refer to the associated energies of full-adder-cell transitions.
The formulation for this adder also demonstrates high-precision agreement. Theoretical and experimental data are compared in Table 5.

Table 5. Power consumption of DGB adder.

| Length || Theory (pW) | Simulation (pW) | Error |
