Abstract-In this paper, we present an analysis for the noise due to finite word length effects for digital signal power processors using Welch's power spectrum estimation technique to measure the power of Gaussian random signals over a frequency band of interest. The input of the digital signal processor contains a finite-length time interval in which the true Gaussian signal is contaminated by Gaussian noise. We analytically derive the roundoff noise-to-signal ratio in the measurement of the signal power. We also present computer simulations which validate the analytical results. These results can be used in tradeoff studies 
I. INTRODUCTION
In Chi, Long, and Li [1], we presented a statistical analysis of the accuracy of the radar backscatter measurements for the NASA Scatterometer (NSCAT), which utilizes a digital Doppler processor (DDP). The reader is referred to [1] for details of the DDP design and general features associated with NSCAT. Briefly, the DDP is a fast Fourier transform (FFT)-based digital signal processor that performs Welch's power spectrum estimation on the radar return signal (see Fig. 1).
We would like to present the DDP in a more general manner so that its application is not limited to radar. Therefore, we will refer to it as a digital signal power processor (DSPP) instead. In [1], the normalized standard deviation of the measurements by the DSPP (the so-called Kp parameter) was derived. However, that derivation assumed that the roundoff noise of the DSPP was negligible. In practical design, roundoff noise is always present and can potentially contaminate the true signal power measurements. In this paper, a roundoff noise analysis of the DSPP is presented. The derived analytical results have been used in tradeoff studies of hardware design, such as the number of bits required at each stage of processing to keep the roundoff noise-to-signal ratio satisfactorily low. Although this derivation was motivated by radar system design, we believe the approach and the results will be helpful in the design of other digital signal processors for estimating the power of random signals.
In Section II, the DSPP based on Welch's power spectrum estimation and the associated equation for evaluating Kp are summarized. The roundoff noise model that we have used for the DSPP is presented in Section III. The roundoff noise-to-signal ratio is derived in Section IV, and some computer simulations to validate our theoretical results are shown in Section V. We discuss how to use the derived results in the design of the DSPP and draw some conclusions in Section VI.
II. A DSPP BASED ON WELCH'S POWER SPECTRUM ESTIMATION
The DSPP is designed to compute an unbiased estimate of the signal power over a frequency band of interest. Essentially, the signal processing procedure employed in the DSPP consists of power spectral density (psd) estimation and signal power computation. The DSPP based on Welch's power spectrum estimation is shown in Fig. 1. It consists of a) computation of the FFT, b) application of a window by convolution, c) squaring for power detection, and d) computation of the signal power. The input of the DSPP consists of a finite-time-length signal that is contaminated by noise. The Welch power spectrum estimate of the input is first determined, and then the signal-plus-noise power is obtained by summing the estimated periodogram over the desired frequency range. Similarly, an estimate of the noise-only power is computed. The final estimate of the signal power is obtained by linearly combining these two measurements.
0096-3518/87/0600-0784$01.00 © 1987 IEEE
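As a rough illustration of steps a) through d), the following Python sketch computes the band power of one input record using only the standard library. It is a minimal sketch under stated assumptions, not the DDP design: the function names, the direct O(M^2) DFT, the segment overlap, and the three-tap Hann kernel (1/2, -1/4, -1/4) are illustrative choices that do not come from the paper, and the noise-only subtraction that makes the estimate unbiased is omitted.

```python
import cmath
import math

# Hypothetical window: a Hann window expressed as frequency-domain taps.
HANN_TAPS = {0: 0.5, 1: -0.25, -1: -0.25}

def dft(x):
    """Direct O(M^2) DFT that includes a 1/M scale factor."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M)) / M
            for k in range(M)]

def circular_convolve(X, taps):
    """Circularly convolve spectrum X with taps given as {offset: weight}."""
    M = len(X)
    return [sum(w * X[(k - off) % M] for off, w in taps.items()) for k in range(M)]

def band_power(x, M, hop, band, taps):
    """Average the windowed periodogram over segments, summed over `band`."""
    starts = range(0, len(x) - M + 1, hop)
    total = 0.0
    for s in starts:
        X = dft(x[s:s + M])                          # a) FFT of one segment
        Y = circular_convolve(X, taps)               # b) window by convolution
        total += sum(abs(Y[k]) ** 2 for k in band)   # c) square, d) sum over band
    return total / len(starts)
```

For a cosine centered on bin 4 of a 16-point transform, `band_power(x, 16, 8, range(3, 6), HANN_TAPS)` collects essentially all of the windowed power, while a band away from the tone, such as `range(6, 9)`, collects almost none.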
Assume that we are given Nps independent signal-plus-noise pulses and NpN noise-only pulses. For each independent pulse, we assume the following.
(A1) For the noise-only case, the input signal x(t) consists of the thermal noise n(t), which is assumed to be zero-mean Gaussian with psd
In [1], we developed an unbiased estimate pR of PR based on assumption (A1) and the additional assumption
(A2) The window spectrum Ws(k) is very narrow.
By simply extending the result in [1] to multiple independent measurement pulses, the unbiased estimate pR is given by (3), together with (4) for the signal-plus-noise case and (5) for the noise-only case, where * indicates circular convolution, C1 is associated with the power of the signal-plus-noise case, and C2 is associated with the power of the noise-only case. All the parameters in (3), (4), and (5) are described in Table I (also see [1]). Note that $X_i^{(j)}(k)$ in (4) indicates the Fourier transform of the ith data segment $x_i^{(j)}(n)$ associated with the jth pulse for the signal-plus-noise case. The corresponding transform in (5) is associated with the noise-only case. The ith data segment associated with an input digital signal x(n) is defined as
(6)
Throughout this paper we use X(k) to denote the Fourier transform of x(n). The normalized standard deviation Kp of pR is defined as
$$K_p = \frac{\{\mathrm{Var}[p_R]\}^{1/2}}{P_R}$$
which has been shown in [1] to be
where all the parameters are described in Table I.
A practical consideration in the implementation of a digital signal processor is the limitation in the word length at each stage of processing. The power of roundoff noise in the output due to finite word length effects depends on the realization of the digital signal processor. The realization will be driven by the dynamic range at each stage, computation speed limitations, etc. For example, the design of the DDP in NSCAT requires that the digital processor have the dynamic range to process input signals with SNR ranging from -20 dB to +26 dB. The SNR range in the input signal determines the dynamic range at each stage of the digital processor, and hence the number of bits needed at each stage to prevent overflow. In order to minimize the number of bits at each stage, some of the least significant bits would have to be discarded between stages. The roundoff noise accumulated in the output will increase as a function of the discarded bits.
Another practical consideration is the use of fixed-point arithmetic. When all processing is done with fixed-point arithmetic, bit-shifting may be performed at each stage to keep the digital number within a proper dynamic range such that overflow and underflow are minimized. A one-bit left shift is equivalent to multiplication by 2. A one-bit right shift is equivalent to multiplication by 1/2, that is, to the truncation of the least significant bit. During computation of the radix-2 FFT used in the DSPP presented in this paper, a one-bit right shift may be performed before each stage of the FFT in order to prevent FFT overflow. Thus, the scale factor 1/M will be included within the FFT. Also, other artificial scale factors may be introduced to avoid overflow or to simplify the hardware design.
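The equivalence between bit shifts and scaling can be checked with a short Python sketch. It assumes two's-complement integers, which Python's arbitrary-precision `int` mimics under `>>` (an arithmetic shift that rounds toward minus infinity); the function names are hypothetical, and a real FFT would interleave the shifts with butterfly additions rather than apply them back to back.

```python
def left_shift(value, bits):
    """Multiply by 2**bits; no error is introduced."""
    return value << bits

def right_shift(value, bits):
    """Divide by 2**bits and truncate the discarded least significant bits."""
    return value >> bits

def scale_through_stages(value, num_stages):
    """Apply a one-bit right shift per stage, as before each radix-2 FFT stage."""
    for _ in range(num_stages):
        value = right_shift(value, 1)
    return value
```

With M = 64 there are six radix-2 stages, so `scale_through_stages(v, 6)` returns the same value as `v // 64`, which is the 1/M scale factor with truncation.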
In implementing the DSPP, a number of digital signal processing techniques can be used to minimize the number of arithmetic operations. For example, when computing FFT's of real signals we can perform two FFT computations simultaneously as follows. The Fourier transforms X1(k) and X2(k) of two real signals x1(n) and x2(n) can be obtained by
where M is the number of the FFT points and F ( k ) is the Fourier transform of the complex signal
(10)
We refer to the execution of (8) and (9) as FFT decomposition.
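The idea of obtaining two real-signal transforms from one complex transform can be sketched in Python as follows. This uses the textbook form of the decomposition, with the complex signal taken as x1(n) + j x2(n); it is an assumption that this matches (8) through (10), which may carry additional scale factors. The function names and the direct DFT are illustrative only.

```python
import cmath
import math

def dft(x):
    """Direct O(M^2) DFT without scaling."""
    M = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / M) for n in range(M))
            for k in range(M)]

def two_real_dfts(x1, x2):
    """Transforms of two real sequences from one complex DFT."""
    M = len(x1)
    F = dft([complex(a, b) for a, b in zip(x1, x2)])   # f(n) = x1(n) + j x2(n)
    X1, X2 = [], []
    for k in range(M):
        Fc = F[(M - k) % M].conjugate()                # F*(M - k)
        X1.append((F[k] + Fc) / 2)                     # even (conjugate-symmetric) part
        X2.append((F[k] - Fc) / 2j)                    # odd part, divided by j
    return X1, X2
```

The division by 2 in both lines is the kind of operation that a power-of-two scale factor can absorb in fixed-point hardware, so that no bits need to be discarded at this step.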
With the above considerations, a schematic diagram of DSPP signal flow is shown in Fig. 2. The scale factor $2^{b_1}/C$ is chosen such that the overflow probability is small during A/D conversion, where b1 is the number of bits of the A/D excluding the sign bit and C is the clipping level of the A/D. The output of the A/D is left-shifted b bits to become the input of the FFT. Let b2 be the number of bits of the FFT. Then b1 + b must be no more than b2 to prevent overflow in the input of the FFT. As previously mentioned, a scale factor 1/M will be used within the FFT for preventing FFT overflow. The scale factors $2^{d_1}$ and $2^{d_2}$ are used for simplifying the FFT decomposition and the window weight setting. The l least significant bits of the periodogram are dropped prior to the final triple summation in order to minimize the number of bits in the output of the triple summation. Only a fixed number of bits of the output of the triple summation are kept by dropping "a" bits. Of course, all the scale factors must be compensated for when computing pR.
Note that the compensation factor
$$A = C^2 \, 2^{\,l - 2b_1 - 2b - 2d_1 - 2d_2} \tag{11}$$
is used in the computation of pR (see Fig. 2).
In this paper, we assume that the dynamic range in each stage is large enough such that the overflow is negligible. Fixed-point arithmetic is assumed.
III. ROUNDOFF NOISE MODELING ASSUMPTIONS
Quantization errors occur when an analog signal is converted to a digital representation with a fixed word length. Errors will also be generated during arithmetic operations on finite-length digital numbers. Right bit-shifting operations, which are equivalent to truncation, can also result in errors. We have called all these errors roundoff noise. For fixed-point arithmetic, only multiplication of two digital numbers and truncation of the least significant bits produce errors. The roundoff noise due to multiplication operation can be eliminated at the expense of increasing the word length after multiplication.
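To make the truncation error concrete, the following Python sketch enumerates the error produced by dropping t least significant bits and returns its exact mean and variance. It assumes two's-complement numbers whose discarded bits are uniformly distributed, and it expresses the error in units of the input least significant bit; both the function name and these assumptions are ours rather than the paper's.

```python
from fractions import Fraction

def truncation_error_stats(t):
    """Exact mean and variance of the error from truncating t LSBs.

    Only the t discarded bits matter, so it is enough to enumerate
    the 2**t possible low-order patterns (assumed equally likely).
    """
    errors = [((x >> t) << t) - x for x in range(2 ** t)]
    n = len(errors)
    mean = Fraction(sum(errors), n)
    var = Fraction(sum(e * e for e in errors), n) - mean ** 2
    return mean, var
```

Under these assumptions the mean is -(2^t - 1)/2 and the variance is (2^{2t} - 1)/12, so a single truncated bit gives a mean of -1/2 and a variance of 1/4.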
In this paper we assume the signal path from the A/D input to the output of the triple summation is implemented by a special-purpose digital processor. In NSCAT, this processor will be on board a spacecraft. The roundoff noise generated in this processor must be accounted for in the design when determining processor performance. However, roundoff noise generated in the signal processing after the triple summation is assumed negligible. For NSCAT, this processing will be performed on the ground.
In Fig. 3, we show the roundoff noise sources occurring in the model shown in Fig. 2. e1 and e1' are generated by the A/D conversion. No error is generated by the scale factor $2^b$ (left bit-shift by b bits). e2 and e2' are generated by the FFT computation. e3 and e3' are generated by the decomposition of the FFT output. e4 and e4' are generated by windowing. e5 and e5' are generated by the squaring operation. e6 is generated by the presummation scale factor (truncation of l least significant bits). e7 is generated by the postsummation scale factor (truncation of "a" least significant bits). Notice that e2 and e2', generated during FFT computation, and e4 and e4', generated in performing convolution, consist of many arithmetic steps.
In Fig. 3, we denote the actual digital signal at each
ideal digital signals associated with $y_i(k)$ and $y_i'(k)$, respectively. The resulting roundoff noise, $h_i(k)$ or $h_i'(k)$, propagated from the previous stages, is then
We can easily eliminate some noise sources by the appropriate choice of scale factors. We make e3 = e3' = 0 by choosing d1 = 1 [see (8) and (9)]. We can make e5 = e5' = 0 by increasing the word length of the output of the squaring operation.
We now make the following assumptions about all the roundoff noises.
(A3) e3 = 0, e3' = 0, e5 = 0, and e5' = 0.
(A4) All the noise sources are independent of one another and are independent of the signal.
(A5) The accumulated roundoff noises h4(k) and h4'(k) at the input to the squaring operation are Gaussian.
Assumption (A4) is generally used in the roundoff noise analysis of digital signal processors. Assumption (A5) is made to simplify the following derivation. From Fig. 3 , one can easily see that all signal processing procedures are linear except for the squaring operation. The derivation of second-order statistics of the output of the squaring block requires fourth-order statistics of its input. This will make the derivation extremely difficult without the last assumption. We also note that by the central-limit theorem, the last assumption is reasonable.
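The benefit of the Gaussian assumption can be illustrated with a short Python sketch: for a Gaussian variable, the fourth moment follows from the mean and variance alone, so the variance of the squared variable has a closed form. The sketch compares that closed form with a seeded Monte Carlo estimate. The function names, sample count, and seed are illustrative assumptions, and the sketch treats a single real variable rather than the full DSPP signal path.

```python
import random

def square_moments_gaussian(mean, var):
    """Closed-form mean and variance of h**2 for Gaussian h."""
    second = mean ** 2 + var                               # E[h^2]
    fourth = mean ** 4 + 6 * mean ** 2 * var + 3 * var ** 2  # E[h^4]
    return second, fourth - second ** 2

def square_moments_sampled(mean, var, count=100_000, seed=0):
    """Monte Carlo estimate of the same two quantities."""
    rng = random.Random(seed)
    std = var ** 0.5
    squares = [rng.gauss(mean, std) ** 2 for _ in range(count)]
    avg = sum(squares) / count
    spread = sum((s - avg) ** 2 for s in squares) / count
    return avg, spread
```

The closed form reduces to a variance of 2σ⁴ + 4m²σ² for h², which is the kind of second-order statistic of the squarer output that would otherwise require the fourth-order statistics of a non-Gaussian input.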
IV. ROUNDOFF NOISE-TO-SIGNAL RATIO V
From Fig. 3, we can see that the signal processing procedure from the input to y4(k) or y4'(k) is linear and that the signal processing procedure after y5(k) or y5'(k) is also linear. The only nonlinear operation is the squaring, which makes the roundoff noise h5(k) accumulated in y5(k) correlated with the true signal. Therefore, h5(k) is correlated between different data segments because the true signals in y5(k) are correlated with one another due to the time-overlapped processing of the input signal. So is h5'(k). For notational simplicity, we neglect the indexes associated with the data segment number for the signal path from the input through y4(k). From y5(k) on we will attach the data segment number to all the quantities by a superscript (i). The derivation of the roundoff noise-to-signal ratio consists of the following 5-step procedure:
Step A: mean and covariance function of h4(k) and h4'(k);
Step B: mean and covariance function of h5(k) and h5'(k);
Step C: mean and variance of h7;
Step D: mean and variance of roundoff noise in pR; and
Step E: roundoff noise-to-signal ratio V.
This 5-step derivation procedure provides a set of equations. These equations have to be sequentially computed to obtain V. In each step we present the derivations and results as concisely as possible with detailed proofs shown in Appendix A. The following derivation is for the signal-plus-noise case. Results for the noise-only case can be similarly obtained. We use $m_i$ and $\sigma_i^2$ to denote the mean and variance of $e_i$, respectively. Note that $e_i$ and $e_i'$ have the same mean and variance. Let $\phi_i(k)$ and $\phi_i'(k)$ be the covariance functions of $h_i(k)$ and $h_i'(k)$, respectively.
Step A: Mean and covariance function of h4(k) and h4'(k).
Based on our assumptions, it can be easily shown that
is not. Also note that h4(k) is uncorrelated with h4'(k) although they have the same covariance function.
Step B: Mean and covariance function of $h_5^{(i)}(k)$ and $h_5'^{(i)}(k)$.
Note that
$$h_5^{(i)}(k) = \left(h_4^{(i)}(k)\right)^2 + 2\,h_4^{(i)}(k)\,\bar{y}_4^{(i)}(k) \tag{25}$$
where $\bar{y}_4^{(i)}(k)$ is the ideal signal associated with $y_4^{(i)}(k)$. Since $E[\bar{y}_4^{(i)}(k)] = 0$ (see [1]), then
$$E[h_5^{(i)}(k)] = E[h_4^2(k)] = \phi_4(0) + E^2[h_4(k)]. \tag{26}$$
Note that the superscript (i) associated with h4(k) is omitted because its mean and covariance are independent of data segment index i. Similarly, we have
$$h_5'^{(i)}(k) = \left(h_4'^{(i)}(k)\right)^2 + 2\,h_4'^{(i)}(k)\,\bar{y}_4'^{(i)}(k) \tag{27}$$
and
$$E[h_5'^{(i)}(k)] = E[h_4'^2(k)] = \phi_4'(0) + E^2[h_4'(k)]. \tag{28}$$
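The mean of the squarer output noise can be checked numerically. The Python sketch below writes the noise at the squarer output as the squared input noise plus twice the product of the input noise and the ideal input, draws the input noise and a zero-mean ideal signal independently, and averages the result. The Gaussian model for the ideal signal, the function name, and the numerical values are assumptions made for illustration.

```python
import random

def squarer_noise_mean(noise_mean, noise_std, signal_std, count=100_000, seed=1):
    """Monte Carlo mean of h4**2 + 2*h4*y_ideal with independent inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(count):
        h4 = rng.gauss(noise_mean, noise_std)      # accumulated roundoff noise
        y_ideal = rng.gauss(0.0, signal_std)       # zero-mean ideal signal
        total += h4 * h4 + 2.0 * h4 * y_ideal      # noise after squaring
    return total / count
```

Because the cross term averages to zero, the estimate approaches the variance of the input noise plus the square of its mean, here 1.0 + 0.25 for `squarer_noise_mean(-0.5, 1.0, 2.0)`, independent of the signal level.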
Next, we derive the autocovariances of $h_5^{(i)}(k)$ and of $h_5'^{(i)}(k)$, and their cross covariance, evaluated at frequency indexes $k_1$ and $k_2$.
One can easily see that
because they are associated with the different data segments which are processed separately. Similarly, one can
Step D: Mean and variance of roundoff noise in pR.
Let $e_{SN}$ = h7 for the signal-plus-noise case and $e_N$ = h7 for the noise-only case. Then the roundoff noise ε, embedded in pR, is given by
where $a_S$ and $a_N$ are the values of "a" associated with $e_{SN}$ and $e_N$, respectively. Therefore,
and
Step E: Roundoff noise-to-signal ratio V.
where Kpe can be thought of as the normalized root mean square value due to the effects of roundoff noise.
V. A DESIGN EXAMPLE
In this section, we illustrate the computation of roundoff noise-to-signal ratio V and compare the theoretical results to simulation results using an example associated with the design of the DDP in NSCAT.
Practical considerations and performance requirements dictate the following choice of parameters for the NSCAT DDP design:
For this case, e4(k) = 0 because no multiplications are needed during windowing. Only left bit-shifting and addition are performed. For this example, some intermediate equations in the previous section can be simplified. In Appendix B, we list these intermediate equations and briefly discuss the $m_i$ and $\sigma_i^2$ to be used in computing the roundoff noise-to-signal ratio V.
Let X be the output signal of the triple summation block before truncating "a" bits in Fig. 3. The dynamic range of X for the signal-plus-noise case is much larger than that for the noise-only case. Therefore, it is judicious to make "a" adaptive. The following approach for determining "a" is assumed.
Let q be the number of bits of X, i.e.,
where [x]' denotes the smallest integer larger than x. The value "a" is determined as follows:
where nmax is the maximum number of bits for y7 after truncation of "a" bits, and amin is the minimum value of "a." Note that "a" is a function of q. Let
One can also show that $\mathrm{Var}^{1/2}[X]/E[X] \ll 1/2$, which implies q will take the value q+ with very high probability. The value "a" for the theoretical calculation is chosen to be the value for q = q+ in (66).
We performed computer simulations for this example to validate the theoretical results derived in the previous section. The realization of the DDP in Fig. 2 with finite word length was simulated by a digital computer. The ideal DDP with infinite word length cannot be simulated with a finite-word-length digital computer. Therefore, the "pseudoideal" DDP was simulated by a digital computer with floating-point arithmetic. Of course, all the digital numbers in the "pseudoideal" DDP have many more bits than in the realization of the DDP with finite word length. We then generated a set of Gaussian random sequences as input both to the simulated finite-word-length DDP and to the "pseudoideal" DDP, the latter yielding pR. We then computed the statistical mean square value of the roundoff error ε, the difference between the two outputs, and then used (61) and (62) to calculate the roundoff noise-to-signal ratio V.
In the following simulations, kN = ks = 32, NpN = 4, and nmax = 12. Only the parameters b, b1, Nps, l, and amin were varied. Table II shows the analytical and simulated results for three cases with l = 2, amin = 6. In the first case, shown in Table II(a), we used the parameters Nps = 25, b1 = 7, and b = 7. The second case is shown in Table II(b). In the third case, shown in Table II(c), we used Nps = 4, b1 = 7, and b = 7. From Table II, we can see that our predicted results are supported by the simulated results and that the results in Table II(a) are close to those in Table II(c). This implies that the roundoff noise-to-signal ratio is not sensitive to Nps. The insensitivity to Nps can also be directly predicted by (54), (57), (61), (62), and (7) by noting that the numerator and the denominator of (54) have the same order of Nps. On the other hand, the results in Table II(b), which are much larger than those in Table II(c), imply that the roundoff noise-to-signal ratio is very sensitive to the number of A/D bits.
Note, from Table II(b) and (c), that the roundoff noise-to-signal ratio reduction is about 6 dB for a 2-bit increase in the A/D conversion. In other words, the reduction is about 3 dB per A/D bit. This will be considered further below.
For the set of cases shown in Table III we used Nps = 4, b1 = b = 7. We show the analytical results for various values of l and amin. From this table, it is observed that the effects of roundoff noise are the same for constant l + amin. We can see that V increases as l increases. This follows intuition.
We have observed that the reduction of the roundoff noise-to-signal ratio is about 3 dB per A/D bit in Table II(b) and (c). We now consider additional analytical predictions for this case. The roundoff noise-to-signal ratios for b1 = 4, 5, 6, 7, 8, and 9 are shown in Table IV. The 3 dB per A/D bit reduction in the roundoff noise-to-signal ratio can be seen for values of b1 between 4 and 6. However, the reduction is different for b1 = 6 through 9. While the reduction in the roundoff noise-to-signal ratio between b1 = 6 and b1 = 7 is about 2.6 dB, the reduction between b1 = 7 and b1 = 8 is about 1.9 dB, and the reduction between b1 = 8 and b1 = 9 is only 0.9 dB. The amount of the reduction in V per bit decreases as the number of A/D bits (b1) increases. The roundoff noise generated by the A/D conversion dominates V for small b1. As the number of A/D bits increases, the A/D noise decreases until it no longer dominates V.
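The diminishing return per A/D bit can be reproduced with a simple two-term model, sketched below in Python. This is a hypothetical model rather than the paper's equations: it assumes the mean-square roundoff noise is the sum of an A/D term that falls by a factor of 4 per added bit and a fixed floor from the remaining sources, and that V in dB behaves as 10 log10 of a root-mean-square ratio (hence the factor 5 on the mean-square ratio). The value 3.35 for the A/D-to-floor ratio at b1 = 7 was fitted here to the quoted 2.6 dB step and is not a parameter from the paper.

```python
import math

def v_squared(b1, ad_to_floor_at_7=3.35):
    """Relative mean-square roundoff noise: A/D term plus a unit floor."""
    return ad_to_floor_at_7 * 4.0 ** (7 - b1) + 1.0

def reduction_db(b1, ad_to_floor_at_7=3.35):
    """Drop in V (dB) when going from b1 to b1 + 1 A/D bits."""
    ratio = v_squared(b1, ad_to_floor_at_7) / v_squared(b1 + 1, ad_to_floor_at_7)
    return 5.0 * math.log10(ratio)
```

Under these assumptions the sketch gives steps close to 3 dB for b1 = 4 and 5 and steps near 2.6, 1.9, and 0.9 dB for b1 = 6, 7, and 8, which is the pattern expected when a fixed noise floor gradually takes over from the A/D noise.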
VI. DISCUSSION AND CONCLUSIONS
In this paper, we have presented a roundoff noise analysis for DSPP's using Welch's power spectrum estimation based on reasonable assumptions for the signal and quantization error models. Overflow is assumed to be negligible in our analysis. Instead of providing an extremely complicated equation for computing the roundoff noise-to-signal ratio V in the measurement of signal power PR, we have derived a set of equations that can be used to compute V.
As mentioned in Section II, the practical implementation of a digital signal processor is driven not only by the roundoff noise level but also by the dynamic range at each stage of processing. The digital processor may incur some intractable nonlinear effect if overflow occurs. One can compute the mean and variance of the signal at each stage to compute the probability of occurrence of overflow. Then one can also determine the dynamic range needed in each stage to make this probability small. To prevent overflow, many bits at each stage of processing are preferred. Since the number of bits at each stage determines hardware complexity, discarding some least significant bits would make the hardware implementation more feasible. However, the value of V depends on the number of bits used in each stage. By computing V for a hardware design, one can observe if the performance of this design is satisfactory. From the simulation example presented in Section V, which supports our analytical results, one can see that V is more sensitive to certain parameters than others and that V may be the same for some combinations of parameters. The derived results have been used to minimize the hardware complexity of the DDP for NSCAT, which we will report in a separate paper.
Fixed-point arithmetic is assumed in this paper. The results for the floating-point arithmetic can be similarly obtained. We believe that the roundoff analysis presented in this paper will be helpful in designing other digital signal processors for estimating signal power.
APPENDIX A
COVARIANCES ASSOCIATED WITH ft'( k ) AND f $ i ) ( k )
In this appendix, we show that 
038)
for the signal-plus-noise case. The computation for roundoff noise-to-signal ratio V is trivial after E[h7] and
where t is the number of bits truncated. Thus, $m_i = m_e$ and $\sigma_i^2 = \sigma_e^2$ for i = 1, 6, and 7. The mean and variance of e2 depend on the implementation of the FFT. Assume that the FFT is implemented by the decimation-in-frequency method with a one-bit right shift before each FFT butterfly calculation. Then m2 = -0.5 and $E[e_2^2 + e_2'^2] = 2.06$, which are consistent with [7]. Therefore, $\sigma_2^2 = 1.56$ and u2 [see (U)] is 2%+' 1 u2 = --12 + 1.56.
ACKNOWLEDGMENT
We appreciate the valuable suggestions by one of the anonymous reviewers.
