filters and linear transforms, having an effective 18 to 16 bit precision, can be realized in hardware. Data throughput rates are comparable to those obtained by fully pipelined conventional architectures. Finally, an error model for the multiplier was derived and experimentally tested.
The developed autoscale multiplier may prove to be an important innovation in the areas of shift-invariant filtering and the DFT using high-speed RNS arithmetic.
In their paper, Chen and Willoner describe a parallel multiplier with bit-sequential input and output, least significant bits first. A possible implementation of the multiplier module without control logic is given in their paper.
Their paper does not mention that the additional control logic is rather complex. To allow pipelining, each module needs a set of five storage cells to hold the A and B inputs and the S, C1, and C2 outputs.
A possible complete implementation is given in Fig. 1. However, it can be seen that once a bit of the product has been calculated, the module remains stable. Instead of shifting the A'(j-1), B'(j-1) bit pair to the next module, the same module can be reused if the previously produced product bit is output to the next sequential device. This conforms to the bit-sequential nature of the multiplier.
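The stability property described above can be illustrated at the arithmetic level. The following is a minimal sketch, not the gate-level module of Fig. 1 or Fig. 2: it assumes unsigned n-bit operands and models only the least-significant-bit-first recurrence, in which product bit i is final as soon as operand bits 0 through i have been received. The names `bit_serial_multiply` and `bits_to_int` are hypothetical and are introduced here for illustration only.

```python
def bit_serial_multiply(a, b, n):
    """Multiply two unsigned n-bit integers bit-serially, LSB first.

    One bit of each operand is consumed per step and one product bit
    is emitted per step; 2n steps yield the full 2n-bit product.
    This is an arithmetic model (an assumption), not a gate-level design.
    """
    acc = 0       # partial product with already-emitted low bits shifted out
    a_seen = 0    # operand bits of a received so far
    b_seen = 0    # operand bits of b received so far
    out_bits = []
    for i in range(2 * n):
        # After the n operand bits are exhausted, zeros are fed in
        # to flush the upper half of the product.
        a_i = (a >> i) & 1 if i < n else 0
        b_i = (b >> i) & 1 if i < n else 0
        # A_i*B_i = A_{i-1}*B_{i-1}
        #           + 2^i * (a_i*B_{i-1} + b_i*A_{i-1}) + 2^(2i) * a_i*b_i,
        # and acc holds the running product divided by 2^i.
        acc += a_i * b_seen + b_i * a_seen + ((a_i & b_i) << i)
        a_seen |= a_i << i
        b_seen |= b_i << i
        # Bit i of the product can no longer change: later operand bits
        # only contribute to higher-order positions.
        out_bits.append(acc & 1)
        acc >>= 1
    return out_bits


def bits_to_int(bits):
    """Reassemble an LSB-first bit list into an integer."""
    return sum(bit << k for k, bit in enumerate(bits))


if __name__ == "__main__":
    print(bits_to_int(bit_serial_multiply(13, 11, 4)))  # 143
```

Because each emitted bit is final, it can be passed on immediately and never needs to be revisited, which is the observation that allows a module to be reused rather than its operand bits being shifted onward.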
In Fig. 2 the different arrangement of the modules is shown.
The five input bits generated for the jth module during the ith iteration of the multiplication algorithm are as follows:
Instead of 2n modules only n modules are required.
The full operation of the multiplier is illustrated in Fig. 3, which is the same example as in Chen and Willoner's paper. If the product does not exceed n bits, only n/2 modules are required. The module in Fig. 4 has a smaller number of gates than the module implementation in Fig. 1. Also, only n + 1 select-lines a_i are required.
In our earlier paper we proposed an algorithm S for error-cor-
cannot be applied for iB and iC, where Ch(iA) = iB and Cx(iA) = iB mean that iB is a child of iA and that AB -CB or BA -BC, respectively. iA > iB means that iA is an ancestor of iB.
III. THE DETECTION OF AN INFINITE LOOP
Use the symbol "-c" instead of "-" to distinguish from an ordinary derivation, and call it C-derivation. C-derivation is defined by the following.
To summarize, the advantages of this approach are as follows:
* only n modules are required to produce the 2n-bit product of two n-bit operands,
* there is only one product output,
* the interconnection scheme in the n-module case is simpler than in the 2n-module case. The n modules have only nearest-neighbor interconnections.
= AI(i)r* IA (i,) ( 
