We introduce implementations of arithmetic operators based on the binary stored-carry-or-borrow (BSCB) representation.
Introduction
Redundant number representations allow fast addition by eliminating carry-propagation chains [Aviz61]. Over the years, many researchers have studied and improved upon redundant representations [Parh90], [Phat94], [Taka02]. Redundant representations are used extensively in adder and multiplier implementations, especially those based on the carry-save or borrow-save form. As an example, we cite a multiplication algorithm [Taka85] based on a redundant radix-2 number representation with the digit set {-1, 0, 1}. Here, we use a redundant radix-2 "binary stored-carry-or-borrow" (BSCB) representation with the digit set {-1, 0, 1, 2}. Ours is similar to the SD3 (+) format of Phatak et al. [Phat01], but with a different encoding of the digit values.
Elsewhere [Torn09], we introduced half-, full-, ripple-carry, and carry-lookahead adders for the BSCB representation, whose main characteristic is their rather simple Reed-Muller realizations. Some of these designs are discussed here, along with a similarly motivated array multiplier. Thanks to their regular structure, array multipliers are VLSI-friendly. The computation delay is proportional to the operand widths, but this drawback is counterbalanced by efficient support for pipelining. Multiplier implementations using the carry-save representation have been described in textbooks for at least three decades [Hwan79], [Parh10]. A BSCB array multiplier could be trivially implemented by using BSCB full-adder cells, but the method described here is more efficient. First, we show that the matrix of partial products, generated by the AND operation over the multiplier and multiplicand, can be transformed into another matrix of BSCB numbers. Then, we describe the accumulation process, which is similar to the standard one, but with the results expressed in BSCB form. Finally, we deduce Boolean equations for the output signals of the accumulator cell, obtaining a rather direct implementation using XOR and AND gates, with the initial AND matrix transformed into an XOR matrix. As an example of our scheme, we describe the detailed design of a 5 × 5 array multiplier, along with its performance, complexity, and potential advantages.
BSCB Addition
Let capital letters denote integer variables and lowercase letters stand for Boolean variables. Let S = A + B + C, with: 
A BSCB expression of a three-operand sum can easily be deduced from the carry-save expression. Let the sum $S_n$ be encoded by the Boolean variables $r_{n+1}$ and $u_n$ as follows:
$S_n$              -1    0    1    2
$r_{n+1} u_n$      11   00   01   10
The BSCB representation expresses the addition result in the range [-1, 2], a combination of the ranges for carry-save and borrow-save representations.
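The encoding table above can be written down directly. The following Python sketch is illustrative only; the names `BSCB_ENCODE`, `BSCB_DECODE`, and `bscb_digit` are hypothetical and not part of the original design. The arithmetic form in `bscb_digit` is simply one expression that reproduces the four table entries: $r_{n+1}$ contributes +2 when $u_n = 0$ and -2 when $u_n = 1$.

```python
# Encoding table from the text: digit value -> (r_{n+1}, u_n).
BSCB_ENCODE = {-1: (1, 1), 0: (0, 0), 1: (0, 1), 2: (1, 0)}

# Inverse mapping: (r_{n+1}, u_n) -> digit value.
BSCB_DECODE = {bits: digit for digit, bits in BSCB_ENCODE.items()}


def bscb_digit(r_next: int, u: int) -> int:
    """Arithmetic form of the table (an assumption that matches all four rows):
    r_next acts as a carry (+2) when u == 0 and as a borrow (-2) when u == 1."""
    return u + 2 * r_next * (1 - 2 * u)


if __name__ == "__main__":
    for digit, (r_next, u) in sorted(BSCB_ENCODE.items()):
        print(digit, (r_next, u), bscb_digit(r_next, u))
```

Each of the four bit pairs decodes to a distinct digit, so the two Boolean signals are used without waste.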
The Boolean expressions for the partial sum $s_n$ and carry $cy_n$ of a standard carry-save full-adder are: Carry-save addition: Assuming that, for each position $n$, a carry was generated at position $n-1$, the partial sum $u_n$ is complemented, and the "carry" signal $r_{n+1}$ must act positively, or negatively in case the assumption was wrong. BSCB addition: 
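To make the contrast concrete, here is a minimal Python sketch of both cells. The carry-save cell uses the textbook full-adder equations. The BSCB cell is not taken from the original equations: it is one reading derived from the encoding table and the "assumed incoming carry" description, in which the digit at each position equals $a_n + b_n + c_n - 1$. The function names are hypothetical.

```python
def carry_save_cell(a: int, b: int, c: int) -> tuple[int, int]:
    """Textbook full adder: returns (carry, sum) with a + b + c == sum + 2 * carry."""
    s = a ^ b ^ c
    cy = (a & b) | (a & c) | (b & c)
    return cy, s


def bscb_cell(a: int, b: int, c: int) -> tuple[int, int]:
    """Assumed BSCB cell: returns (r, u) encoding the digit a + b + c - 1."""
    u = 1 ^ a ^ b ^ c                # complemented partial sum
    r = (1 ^ a ^ b) & (1 ^ b ^ c)    # 1 only when a == b == c (digit is -1 or 2)
    return r, u


def bscb_digit(r: int, u: int) -> int:
    """Decode (r, u) using the table 11 -> -1, 00 -> 0, 01 -> 1, 10 -> 2."""
    return u + 2 * r * (1 - 2 * u)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                cy, s = carry_save_cell(a, b, c)
                r, u = bscb_cell(a, b, c)
                assert s + 2 * cy == a + b + c
                assert bscb_digit(r, u) == a + b + c - 1
```

Under this reading, both outputs are built from XOR and AND operations only, which is in line with the Reed-Muller character claimed for the BSCB adders.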
BSCB Multiplication
Let the multiplier (X) and multiplicand (Y) be N bits wide:
Let the product be a 2N-bit binary number denoted by P:
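As a reference point for what follows, the standard AND matrix of partial products can be sketched in Python. This assumes the usual unsigned binary weighting, in which the bit $x_i y_j$ carries weight $2^{i+j}$; the helper names are hypothetical.

```python
def bits(value: int, width: int) -> list[int]:
    """Little-endian list of the low `width` bits of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def and_matrix(x: int, y: int, width: int) -> list[list[int]]:
    """Row j holds the partial product x_i AND y_j for i = 0 .. width - 1."""
    xb, yb = bits(x, width), bits(y, width)
    return [[xb[i] & yb[j] for i in range(width)] for j in range(width)]


def matrix_value(matrix: list[list[int]]) -> int:
    """Weighted sum of the matrix: the entry in row j, column i has weight 2**(i + j)."""
    return sum(bit << (i + j)
               for j, row in enumerate(matrix)
               for i, bit in enumerate(row))


if __name__ == "__main__":
    m = and_matrix(0b10110, 0b01101, 5)
    print(matrix_value(m), 0b10110 * 0b01101)  # both values are the product
```

Summing the weighted matrix entries recovers the product P, which is the invariant that any transformed matrix must preserve.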
Assuming the initial AND matrix is computed (see the upper part of Fig. 3), we consider an area of bits set to the value 1 (gray shaded) and generated by the product of bits $x_k$ to $x_{k+n-1}$ with bits $y_l$ to $y_{l+m-1}$.
The summation of this area A gives the following result:
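Assuming the usual weight $2^{i+j}$ for the bit $x_i y_j$, the sum of an all-ones area spanning $n$ multiplier bits and $m$ multiplicand bits has the closed form $2^{k+l}(2^n - 1)(2^m - 1)$. The short Python check below compares that closed form with the direct double sum; the function names are hypothetical, and the closed form is stated here as a derived identity, not as a quotation of the original equation.

```python
def area_sum(k: int, n: int, l: int, m: int) -> int:
    """Direct sum of weights 2**(i + j) over i in [k, k+n) and j in [l, l+m)."""
    return sum(1 << (i + j)
               for i in range(k, k + n)
               for j in range(l, l + m))


def area_closed_form(k: int, n: int, l: int, m: int) -> int:
    """Product of two geometric series: 2**(k+l) * (2**n - 1) * (2**m - 1)."""
    return (1 << (k + l)) * ((1 << n) - 1) * ((1 << m) - 1)


if __name__ == "__main__":
    print(area_sum(1, 3, 2, 2), area_closed_form(1, 3, 2, 2))
```

Expanding the closed form gives four signed powers of two, which suggests why a block of ones can be replaced by a few digits from {-1, 0, 1}.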
The upper matrix can be transformed into the lower matrix, where the considered area is represented by two rows in the lower matrix, the other rows being zero. The first row comprises three areas $A_1$, $A_2$, and $A_3$ having respective sums: The summation of areas $A_1$ to $A_6$ gives the same value A, as justified in the following: It should be noted that the resulting matrix includes one more row and one more column than the initial matrix.
We intend to compute the summation of the transformed matrix (named E) by using the BSCB digit set {-1, 0, 1, 2}. Carry-free addition of two BSCB numbers is known to be impossible. We thus consider the addition of accumulated values expressed in the digit set {-1, 0, 1, 2} with values of the matrix E expressed in the digit set {-1, 0, 1}. In matrix E of Fig. 4, three different areas are added: area A, area B (gray shaded), and area C. The two other areas, at the extreme left and right sides (left blank), always have a zero sum. The three lines SumA, SumB, and SumC show the summation results of the three respective areas. The SumA (respectively, SumC) line is composed of values between 0 and 1 (0 and -1); these two cases are treated later. We now focus on the summation process for SumB.
Let $E_i$ be the ith row of the transformed matrix and let $L_i$ be the ith iteration of the accumulated sum. The initial accumulated sum $L_0$ is initialized with the value $E_0$ of the first row of the transformed matrix. We introduce a transfer digit $T_i$ from the set {-1, 0, 1} whose value is related to $L_i$: a carry (respectively, borrow) is propagated to the next stage $n+1$ if the accumulated value is 2 (respectively, -1). The resulting accumulated sum of iteration $i+1$ is defined, at position $n$, by:
$$L_{i+1}^{n} = L_i^{n} - 2T_i^{n} + T_i^{n-1} + E_{i+1}^{n}$$
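The transfer rule described above can be sketched in Python. This is an illustrative reading, not the original cell logic: it assumes the transfer digit is +1 when the accumulated digit is 2, -1 when it is -1, and 0 otherwise, and that each position removes twice its own transfer and receives the transfer from the position below. The names `transfer`, `value`, and `accumulate_step` are hypothetical. For arbitrary rows the resulting digits can leave the set {-1, 0, 1, 2}; the text relies on the regular structure of E to avoid that, which this sketch does not model.

```python
def transfer(digit: int) -> int:
    """Transfer digit: +1 (carry) for a 2, -1 (borrow) for a -1, else 0."""
    if digit == 2:
        return 1
    if digit == -1:
        return -1
    return 0


def value(digits: list[int]) -> int:
    """Integer value of a little-endian list of signed radix-2 digits."""
    return sum(d << n for n, d in enumerate(digits))


def accumulate_step(acc: list[int], row: list[int]) -> list[int]:
    """One accumulation step: new[n] = acc[n] - 2*t[n] + t[n-1] + row[n]."""
    width = max(len(acc), len(row)) + 1      # one extra position for the top transfer
    l = acc + [0] * (width - len(acc))
    e = row + [0] * (width - len(row))
    t = [transfer(d) for d in l]
    return [l[n] - 2 * t[n] + (t[n - 1] if n > 0 else 0) + e[n]
            for n in range(width)]


if __name__ == "__main__":
    acc = [2, -1, 1, 0]      # value 4
    row = [0, -1, 0, 1]      # value 6
    new = accumulate_step(acc, row)
    print(new, value(new))   # the value equals value(acc) + value(row)
```

Whatever the digit values, the step preserves the represented integer, because every transfer removed at position n reappears with weight doubled at position n + 1.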
The matrix E is not composed of random values, but of values related to the x and y values, which gives a regular structure of values within the digit set {-1, 0, 1}. According to the E values, we examine how the accumulated sum is computed. Table I is transformed into the Karnaugh map of Table II (resp., III).
Propagation delay for an implementation without pipelining: The propagation delay of the ACC cell is equivalent to two XOR gates, and is thus identical to that of a standard full-adder (also two XOR gates for the worst-case path, depending on the technology and the design). An N-bit standard array multiplier is realized with $(N-1)^2$ full-adders, and the propagation delay from the partial products to the final stage (RCA or CLA) is equivalent to 2(N - 1) gates. The propagation delay of the BSCB array multiplier is equivalent to 2N gates, that is, 2 gates more than the standard array multiplier, whatever the operand widths.
Propagation delay for an implementation with pipelining: The BSCB array multiplier is suitable for pipelined VLSI implementation and should allow a throughput equivalent to or better than that of the standard one, given that the propagation delay of the standard full-adder is two or three gates (worst cases: two XOR gates, or one XOR gate and two NAND gates, depending on the technology), compared with two XOR gates for the BSCB ACC cell.
A 5 × 5 array multiplier is shown in Fig. 6. In spite of the overhead due to the extra row and diagonals of ACC cells, beginning with an operand width of 16 bits, the BSCB array multiplier needs fewer gates (Table IV), because the ACC cell is realized with 4 gates instead of the 5 gates of the standard full-adder (assuming that only 2-input gates are used and that all kinds of gates have the same complexity in terms of area).
Testability of the cells is improved by the use of XOR and AND gates. For all cells (INIT, ACC, RCA), we obtain 100% coverage for single stuck-at faults with 4 test vectors.
Conclusion
New implementations of binary adders were proposed based on the binary stored-carry-or-borrow representation. It seems that this BSCB representation leads rather directly to Reed-Muller implementations of all known types of adders (half-, full-, ripple-carry, and carry-lookahead adders). A direct advantage of this kind of implementation is that testability is improved over known implementations due to the use of XOR gates, as demonstrated by prior work. A related drawback is that the rise in switching activity leads to an increase in power consumption.
A number of areas merit further exploration. One is the derivation of new designs of XOR gates. A second area is that of incorporating fault tolerance features in the adder designs. For example, carry-lookahead adders might be checked through parity prediction. It is evident that the parities of the two inputs can be used to generate the parities of the signals generated by the first XOR stage in the adder. A third area for further investigation is implementation with reversible logic, perhaps using the method of parity preservation [Parh06] for fault tolerance. These adders are particularly suitable for reversible-logic implementation due to the fact that more than 3/4 of their circuits consist of XOR gates.
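The parity observation above can be illustrated with a short Python sketch. It covers only the first XOR stage of an adder (the bitwise XOR of the two operands) and says nothing about the carry network; the function names are hypothetical.

```python
def parity(value: int) -> int:
    """Parity (0 or 1) of the number of 1 bits in a non-negative integer."""
    return bin(value).count("1") & 1


def first_stage_xor(a: int, b: int) -> int:
    """Bitwise XOR of the operands, as produced by the first XOR stage."""
    return a ^ b


def predicted_parity(a: int, b: int) -> int:
    """Parity of the first-stage signals, predicted from the operand parities alone."""
    return parity(a) ^ parity(b)


if __name__ == "__main__":
    a, b = 0b101101, 0b011011
    print(parity(first_stage_xor(a, b)), predicted_parity(a, b))  # the two values agree
```

A mismatch between the predicted and observed parity would flag a fault in the first XOR stage, which is the kind of check the parity-prediction idea builds on.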
We also presented an array multiplier based on the BSCB representation. Like ordinary array multipliers, our design is VLSI-friendly and easily pipelined, with the added benefit of simpler cells and lower overall latency. The multiplication process is very similar to the process of the standard array multiplier. Accumulation operations are performed without carry propagation, but all intermediate accumulated sums are expressed in the BSCB redundant representation instead of the carry-save representation. Our architecture shows two main characteristics: (1) The standard initial AND matrix is replaced with an XOR matrix, and (2) A recurrence relation between the XOR products and the partial accumulated sums makes the implementation simpler and more reliable. 
