I. BACKGROUND
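The radix-4 recoding procedure utilizes Booth recoding.

Example 1: Shows how $Q = 1100\,0101\,1000\,1011_2$ is recoded to be represented by the radix-4 digit string $P = 1\,\bar{1}\,0\,1\,2\,\bar{2}\,1\,\bar{1}\,\bar{1}$, where an overbar denotes a negative digit.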
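The recoding in Example 1 can be reproduced with a short script. This is a minimal sketch, assuming the standard Booth radix-4 digit rule $d_i = -2q_{2i+1} + q_{2i} + q_{2i-1}$ with $q_{-1} = 0$, and assuming $Q$ is treated as unsigned so that one extra high-order digit is produced. The function name `booth_radix4_digits` is hypothetical and does not come from the paper.

```python
def booth_radix4_digits(q: int, nbits: int) -> list[int]:
    """Booth radix-4 digits of an unsigned nbits-bit integer, least significant first."""
    if nbits % 2 or not 0 <= q < (1 << nbits):
        raise ValueError("expected an even nbits and 0 <= q < 2**nbits")

    def bit(k: int) -> int:
        # Bits outside 0..nbits-1 (including q_{-1}) are taken as 0.
        return (q >> k) & 1 if 0 <= k < nbits else 0

    # Assumed rule: d_i = -2*q_{2i+1} + q_{2i} + q_{2i-1}.
    # The extra top digit absorbs the final carry of the unsigned operand.
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(nbits // 2 + 1)]


Q = 0b1100_0101_1000_1011
digits = booth_radix4_digits(Q, 16)
print(digits[::-1])  # most significant first: [1, -1, 0, 1, 2, -2, 1, -1, -1]
```

The nine digits correspond to the nine rows of the radix-4 partial product array, and summing `d * 4**i` over the digits recovers the original value of `Q`.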
The use of Booth radix-4 recoding for 16-bit integer multiplication $Q \times P$ requires 88 entries and 9 rows, as illustrated in Figure 2 for the n-bit product. This is a considerable reduction from the 136 entries and 16 rows of the radix-2 integer partial product array, but it provides no additional benefit for the squaring operation. A radix-2 squaring circuit was described in [PBD97], resulting in 72 entries.

The radix-4 dual recoded squaring algorithm determines the Booth digits in a right-to-left manner, i.e., starting from the least significant bit and moving toward the most significant bit. Let $P_i$ be the integer formed by shifting the radix-4 digit string right $i$ places, deleting the low-order $i$ digits, obtaining $P_i = \sum_{j \ge i} d_j\,4^{\,j-i}$. Since $P_i = 4P_{i+1} + d_i$, we obtain the following.
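Observation 1: Recall that $q_{2i+1}$ is effectively the sign bit of the recoded digit $d_i$, so we obtain the partial square identity $P_i^2 = 16\,P_{i+1}^2 + d_i\,[\,8\,(Q_{2i+2} + q_{2i+1}) + d_i\,]$, where $Q_{2i+2}$ is $Q$ shifted right $2i+2$ bit positions.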
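The right-to-left accumulation suggested by Observation 1 can be checked numerically. The sketch below is not the paper's circuit. It assumes that $Q_{2i+2}$ denotes $Q$ shifted right by $2i+2$ bits, that $P_{i+1} = Q_{2i+2} + q_{2i+1}$, and that each digit contributes the term $16^i\,d_i\,[\,8\,(Q_{2i+2} + q_{2i+1}) + d_i\,]$, which follows from expanding $(4P_{i+1} + d_i)^2$. The function name `square_by_partial_squares` is hypothetical.

```python
def square_by_partial_squares(q: int, nbits: int) -> int:
    """Square an unsigned nbits-bit integer by summing one partial square per Booth digit."""
    if nbits % 2 or not 0 <= q < (1 << nbits):
        raise ValueError("expected an even nbits and 0 <= q < 2**nbits")

    def bit(k: int) -> int:
        # Bits outside 0..nbits-1 (including q_{-1}) are taken as 0.
        return (q >> k) & 1 if 0 <= k < nbits else 0

    total = 0
    for i in range(nbits // 2 + 1):  # least significant digit first
        d_i = -2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
        # Assumed reading: P_{i+1} = Q_{2i+2} + q_{2i+1}
        p_next = (q >> (2 * i + 2)) + bit(2 * i + 1)
        # Partial square for digit i, weighted by 16**i
        total += d_i * (8 * p_next + d_i) * 16**i
    return total


Q = 0b1100_0101_1000_1011
print(square_by_partial_squares(Q, 16) == Q * Q)  # True
```

Each loop iteration uses only the bits of `q` at and above position `2i-1`, which is what allows the digits to be determined from the least significant end.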
It is important to observe that, for a negative digit ($q_{2i+1} = 1$), the 2's complement of $Q_{2i+2} + q_{2i+1}$ reduces to the sign-extended 1's complement of $Q_{2i+2}$.
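The complement relation can be illustrated with fixed-width integers. This is a minimal sketch, assuming $Q_{2i+2}$ is $Q$ shifted right by $2i+2$ bits and that the operand is zero-extended to a working width of 18 bits before inversion. The helper names and the chosen width are illustrative assumptions, not taken from the paper.

```python
def twos_complement(value: int, width: int) -> int:
    """Two's complement of value within a width-bit field."""
    return (-value) & ((1 << width) - 1)


def ones_complement(value: int, width: int) -> int:
    """Invert all width bits; the zero-extended high bits become ones."""
    return value ^ ((1 << width) - 1)


Q, NBITS, WIDTH = 0b1100_0101_1000_1011, 16, 18  # WIDTH is an assumed working width
checked = []
for i in range(NBITS // 2):
    q_sign = (Q >> (2 * i + 1)) & 1  # q_{2i+1}, the sign bit of digit d_i
    if q_sign:
        q_high = Q >> (2 * i + 2)  # assumed meaning of Q_{2i+2}
        assert twos_complement(q_high + q_sign, WIDTH) == ones_complement(q_high, WIDTH)
        checked.append(i)
print(checked)  # digit positions with q_{2i+1} = 1
```

The relation is the familiar identity $-(x + 1) = \bar{x}$ in two's complement arithmetic, so negating the operand for a negative digit needs only bit inversion and no carry-propagating increment.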
IV. RESULTS
The radix-4 dual recoded squaring circuit and a general-purpose multiplier were both implemented in Verilog and mapped to the OSU standard cell library [SCWH07]. Both circuits were constrained to run within a 20 ns clock period and were implemented for bit widths of 16, 32, and 64. The charts in Figures 7-9 show substantial savings in power, leakage power, and area for our customized squaring circuit compared with the multiplier circuit.
