Abstract-A novel Concurrent Error Detection (CED) technique for the complex Fast Fourier Transform (FFT) implemented in SRAM-based Field Programmable Gate Arrays (FPGA) is presented in this paper. This technique compares one of the inputs to a linear combination of the outputs, avoiding complex multiplications. Hence its computational complexity and resource usage are lower than those of the Parseval Sum of Squares (SoS) or other CED techniques applied to this algorithm. In order to test its performance, bit-flips have been injected in the FPGA configuration bits. The technique achieves high error detection rates with fewer unnecessary reconfigurations than SoS, which makes it suitable for applications in which circuit size and availability are critical.
I. INTRODUCTION
In recent years, the importance of soft errors in electronics has greatly increased. This is due to the progressive shrinking of the elements of these devices, which reduces the amount of charge needed to produce a failure while increasing the number of components exposed to their effects [1]. These errors are caused by Single Event Upsets (SEU) produced by impacts on the device from, among others, protons, heavy ions, energetic neutrons or alpha particles [2].
SRAM-based Field Programmable Gate Arrays (FPGA) are affected by this type of error differently from traditional Application Specific Integrated Circuits (ASIC). Since these devices are programmable, their memories hold two kinds of bits: user bits and configuration bits. When an SEU occurs, it changes the value of one of these bits. When it affects a configuration bit, the mapped design changes permanently, which may cause malfunctions until the device is reconfigured [3]. Essential bits are those that, if upset, change the circuit design [4]. Therefore, only errors on these bits will affect the circuit performance.
Considering that most of the bits found in FPGA memories are configuration bits [3], these errors are of high importance.
Manuscript received April 12, 2016 . This work is supported by the Spanish Ministry of Economy and Competitiveness under Grant ESP2014-54505-C2-1-R and by the Natural Science Funds of China (Grant No. 61501321).
R. González, P. Reviriego and J. A. Maestro are with Universidad Antonio de Nebrija, Madrid 28040, Spain (e-mail: rgonzalezt@nebrija.es; previrie@nebrija.es; jmaestro@nebrija.es).
Z. Gao is with the School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China (e-mail: zgao@tju.edu.cn).
Fig. 1. DMR technique
Once an error has been detected, the FPGA must be reconfigured in order to correct it. Because of this, Concurrent Error Detection (CED) techniques are a useful way to mitigate the effect of SEUs, whereas Error Correction Techniques are less interesting as the upset results in a permanent error that can only be removed through reconfiguration [5]. This also makes false error detections more important than in the case of ASICs, since they result in unnecessary reconfigurations of the device, which waste time and energy and leave the device unavailable in the meantime.
A simple and common approach to deal with SEUs is the introduction of redundancy in the design. When working with FPGA devices, Dual Modular Redundancy (DMR), which duplicates the design and compares both outputs, is an effective way to detect errors. Once detected, they can be corrected by reconfiguring the device (an example of this technique applied to an FFT module is shown in Fig. 1). This technique, however, results in a high cost in circuit area, since it requires at least 100% area overhead. Another way to mitigate the effects of SEUs is the use of circuit-specific techniques that make use of the circuit properties to detect these errors [6].
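The DMR principle described above can be sketched in software as follows. This is only an illustrative model, not the hardware design of Fig. 1: the names `dmr_run` and `faulty_fft` are hypothetical, and NumPy's `np.fft.fft` stands in for the FFT module.

```python
import numpy as np

def dmr_run(module_a, module_b, x):
    """Run two copies of the same design and flag any output mismatch."""
    y_a = module_a(x)
    y_b = module_b(x)
    # The DMR checker is only a comparator: any difference raises the flag.
    error_flag = not np.array_equal(y_a, y_b)
    return y_a, error_flag

def faulty_fft(x):
    """Hypothetical upset copy: one output bin is corrupted."""
    y = np.fft.fft(x)
    y[3] += 1.0
    return y
```

Calling `dmr_run(np.fft.fft, faulty_fft, x)` raises the flag, while two healthy copies do not. The cost is that the whole module is computed twice, which is the 100% overhead mentioned above.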
The Fast Fourier Transform (FFT) is an algorithm widely used in signal processing applications [7]. When trying to detect errors, the Parseval Sum of Squares (SoS) check is one of the most common approaches [6]. When used in ASICs, this technique detects most of the errors produced by SEUs. Many alternative techniques have been proposed to detect errors produced in FFT modules ([8], [9], [10]), all of them with lower computational complexity than that of Parseval SoS, although requiring at least real multiplications.
However, since soft errors affect FPGAs in a different way, previous results regarding the performance of error protection techniques in ASICs may not be applicable [3]. Hence the interest in testing techniques which may not be as effective in ASICs, but could be suitable for these devices.
978-1-5090-4366-8/16/$31.00 ©2016 IEEE
In this paper, a new technique to detect errors produced in the FFT algorithm is presented and its performance is compared to that of the DMR and Parseval SoS approaches. In order to test the effectiveness of these techniques, they have been programmed in an FPGA device and errors have been injected in their configuration bits. In Section II, a brief introduction to FFT algorithms is presented. Section III explains both the Parseval SoS and the proposed technique. The tests on these techniques and their results are explained in Section IV. Conclusions are presented in Section V.
II. RADIX-4 FFT ALGORITHM
The algorithm used to transform a digital signal sampled in the time domain into its representation in the frequency domain is known as the Discrete Fourier Transform (DFT) [7]. Similarly, the Inverse Discrete Fourier Transform (IDFT) calculates the coefficients of the signal in the time domain from its representation in the frequency domain.
For a sampled signal x(n), its N-point DFT, with n = 0, 1, 2 ... N − 1, is defined by the equations:
Whereas the IDFT of a frequency domain signal X(k) with k = 0, 1, 2 ... N − 1 is defined by:
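For reference, both definitions can be written in the common textbook convention, which is assumed here to correspond to equations (1) and (2): the forward transform is unscaled and the factor 1/N sits on the inverse. The twiddle-factor symbol W_N is notation introduced for this sketch.

```latex
X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1, \qquad W_N = e^{-j 2\pi / N}

x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-nk}, \qquad n = 0, 1, \ldots, N-1
```

If an implementation places the scaling elsewhere (for example 1/N on the forward transform), the relations derived from the IDFT keep the same structure and only the constant factor changes.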
An efficient way to calculate the DFT coefficients which reduces its computational complexity is the Fast Fourier Transform (FFT). This algorithm can be implemented in many ways, radix-4 being one of the most popular [11]. This technique can be applied to signals sampled with 4^m points, with m = 1, 2, 3 ... (although it can be used with shorter signals by zero-padding), and requires the initial sum of the DFT to be split into four sums:
Once the sum has been split, its four terms are expressed as follows:
Since X(4m), X(4m + 1), X(4m + 2) and X(4m + 3) are N/4-point FFTs, this process may be repeated in order to obtain sixteen N/16-point FFTs, and so on until only 4-point FFTs have to be calculated. This way, the frequency domain coefficients can be calculated using (3/4) N log_4(N) complex multiplications instead of the N^2 required by the standard DFT [11]. In this paper, a 1024-point radix-4 FFT based on a publicly available design found in [12] is used for testing purposes.
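The recursive splitting described above can be sketched in Python. This is a minimal decimation-in-frequency illustration under the assumption of an unscaled forward transform. It is not the design of [12], and the names `fft_radix4` and `complex_mults_radix4` are hypothetical.

```python
import numpy as np

def fft_radix4(x):
    """Recursive radix-4 decimation-in-frequency FFT (length must be 4**m)."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    if n == 1:
        return x.copy()
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    q = n // 4
    a, b, c, d = x[:q], x[q:2 * q], x[2 * q:3 * q], x[3 * q:]
    idx = np.arange(q)
    out = np.empty(n, dtype=complex)
    for r in range(4):
        # Butterfly: only multiplications by 1, -1, j and -j (trivial in hardware).
        branch = a + (-1j) ** r * b + (-1) ** r * c + (1j) ** r * d
        # Twiddle factors W_N^(r*n); for r = 0 they are all equal to 1.
        twiddle = np.exp(-2j * np.pi * r * idx / n)
        # The N/4-point FFT of this branch yields the outputs X(4m + r).
        out[r::4] = fft_radix4(branch * twiddle)
    return out

def complex_mults_radix4(n):
    """(3/4) * N * log4(N): three of the four branches need twiddles per stage."""
    stages, m = 0, n
    while m > 1:
        m //= 4
        stages += 1
    return 3 * n * stages // 4
```

For N = 1024 the count is 3840 complex multiplications, against 1024^2 = 1048576 for the direct DFT. The output of `fft_radix4` can be compared with `np.fft.fft` on any input whose length is a power of 4.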
III. ERROR DETECTION TECHNIQUES

A. Parseval SoS
The Parseval theorem states that the energy or sum of squares (SoS) of the inputs of the FFT algorithm is equal to the SoS of the outputs of the algorithm except for a scaling factor [13]. This property can be used as an error detection technique with smaller overhead than that required by DMR, comparing the energy of the inputs to that of the outputs in order to discern whether or not an error has occurred (see Fig. 2). Thus, taking x(n) = I_in(n) + jQ_in(n) and X(k) = I_out(k) + jQ_out(k) results in the expression:
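A software sketch of the Parseval SoS check follows. It assumes an unscaled forward FFT, so that the scaling factor is 1/N, i.e. the sum over n of I_in(n)^2 + Q_in(n)^2 equals 1/N times the sum over k of I_out(k)^2 + Q_out(k)^2. The function name and the tolerance are hypothetical, and a fixed-point hardware checker would use its own threshold.

```python
import numpy as np

def parseval_sos_check(x, X, rel_tol=1e-9):
    """Return True if input and output energies disagree (error flagged)."""
    x = np.asarray(x, dtype=complex)
    X = np.asarray(X, dtype=complex)
    n = x.size
    # 2N real squarings on the input side ...
    energy_in = np.sum(x.real ** 2 + x.imag ** 2)
    # ... 2N on the output side, plus one multiplication for the scaling.
    energy_out = np.sum(X.real ** 2 + X.imag ** 2) / n
    return abs(energy_in - energy_out) > rel_tol * max(energy_in, 1.0)
```

The check passes for `X = np.fft.fft(x)` and flags an error once any output bin changes the total energy. The squarings are what make this check expensive in hardware.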
B. Proposed Technique
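Table I. Number of operations required by each technique
Operation          Parseval SoS    N/4
Sums               4N              2N
Multiplications    4N + 1          0
In order to develop a new CED technique for the FFT algorithm that requires less computational complexity, another of its properties will be exploited. Noticing that each of the inputs relates to a sum of the outputs multiplied by the twiddle factors associated to that point (see equation 2), a relationship that allows errors to be detected can be established. Although the value of any point can be calculated using the Goertzel algorithm [13], picking the right inputs for the comparison reduces the number of required operations. Taking n = 0 and substituting in the IDFT algorithm (equation 2) results in the equation
$$x(0) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)$$
which relates the first input to the sum of the outputs. Similarly, taking n = N/2 turns every twiddle factor into e^{jπk} = (−1)^k, resulting in
$$x(N/2) = \frac{1}{N} \sum_{k=0}^{N-1} (-1)^{k} X(k)$$
which establishes a relationship between the input N/2 and the alternate sum and subtraction of the outputs. These approaches, however, do not achieve high error detection rates because errors resulting in positive or negative offsets in the outputs can compensate each other. After testing both of them, their error detection rate was revealed to be around 50% of the errors produced, and therefore they were discarded. Taking n = N/4 instead results in the equation
$$x(N/4) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j\pi k/2} = \frac{1}{N} \sum_{k=0}^{N-1} j^{k} X(k)$$
which, substituting x(n) = I_in(n) + jQ_in(n) and X(k) = I_out(k) + jQ_out(k), results in the following expressions:
$$I_{in}(N/4) = \frac{1}{N} \sum_{m=0}^{N/4-1} \left[ I_{out}(4m) - Q_{out}(4m+1) - I_{out}(4m+2) + Q_{out}(4m+3) \right]$$
$$Q_{in}(N/4) = \frac{1}{N} \sum_{m=0}^{N/4-1} \left[ Q_{out}(4m) + I_{out}(4m+1) - Q_{out}(4m+2) - I_{out}(4m+3) \right]$$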
Thus, storing the (N/4)-th input and comparing its value to those obtained by applying these expressions to the outputs will be used as an error detection technique (see Fig. 3). This technique is similar to Parseval SoS in that neither of them is affected by errors compensating each other (since the N/4 technique uses both the real and the imaginary parts of the outputs, the error would have to affect both of them in the same way for this to happen). However, the N/4 technique requires fewer resources than Parseval SoS, since no multiplications and fewer sums are needed. Table I summarizes the number of operations required by each technique.
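The comparison can be sketched in Python as follows. With the usual IDFT (unscaled forward transform, 1/N on the inverse, assumed here), setting n = N/4 turns every twiddle factor into j^k, so N·x(N/4) equals the sum of j^k X(k), and multiplying by j^k only swaps and negates real and imaginary parts. The function names, the tolerance and the contrasting n = 0 check are hypothetical illustrations, not the authors' HDL.

```python
import numpy as np

def n4_check(x_quarter, X, rel_tol=1e-9):
    """Return True if N * x(N/4) disagrees with the signed output sums."""
    X = np.asarray(X, dtype=complex)
    n = X.size
    i_out, q_out = X.real, X.imag
    # Real part of sum(j**k * X(k)): +I, -Q, -I, +Q for k mod 4 = 0, 1, 2, 3.
    re_sum = (np.sum(i_out[0::4]) - np.sum(q_out[1::4])
              - np.sum(i_out[2::4]) + np.sum(q_out[3::4]))
    # Imaginary part: +Q, +I, -Q, -I for k mod 4 = 0, 1, 2, 3.
    im_sum = (np.sum(q_out[0::4]) + np.sum(i_out[1::4])
              - np.sum(q_out[2::4]) - np.sum(i_out[3::4]))
    # Scaling by N is a bit shift in hardware when N is a power of two.
    expected = n * complex(x_quarter)
    scale = max(abs(expected), 1.0)
    return bool(abs(re_sum - expected.real) > rel_tol * scale
                or abs(im_sum - expected.imag) > rel_tol * scale)

def n0_check(x_first, X, rel_tol=1e-9):
    """Discarded variant: compare N * x(0) with the plain sum of the outputs."""
    X = np.asarray(X, dtype=complex)
    expected = X.size * complex(x_first)
    return bool(abs(np.sum(X) - expected) > rel_tol * max(abs(expected), 1.0))
```

As an example of compensation, adding 0.5 to X(1) and subtracting 0.5 from X(2) leaves the plain sum unchanged, so `n0_check` misses it, whereas `n4_check` flags it because the two offsets land in different signed sums.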
IV. TEST AND RESULTS
All three error detection techniques were implemented using a Xilinx Zynq-7000 All Programmable System on Chip (AP SoC), which integrates a dual-core ARM Cortex-A9 based processing system (PS) and programmable logic (PL) in a single device. A Soft Error Mitigation (SEM) IP module [14] was loaded with these circuits in order to emulate the presence of SEUs by injecting bit-flips in their configuration bits, correcting them after the error output was obtained to avoid error accumulation [5] . Errors have been injected in every essential bit that affects the area occupied by the FFT and CED circuits in order to obtain reliable data regarding the effect of SEUs.
Once the bit-flip has been injected, the circuit begins its operation and a predetermined set of inputs is introduced. Three different signals were used as inputs during the test: a negative slope linear function, a positive slope linear function and a linear combination of sine and cosine functions. Once the algorithm finishes, the outputs are compared against the golden reference to determine if an actual error has been produced, thus classifying the error output of the circuit in four different cases (see Fig. 4):
• Masked Error: No error was produced nor detected.
• False Negative: An error was produced but not detected.
• False Positive: An error was detected but not produced.
• Detected Error: An error was produced and detected.
The three techniques have been implemented using Hardware Description Languages (HDL). Once loaded in the FPGA, the process is controlled by a PC through a serial port interface which allows both sending instructions to the SEM IP and retrieving results once the process is finished (see Fig. 4).
Table III lists the results obtained during the test. The number of bit-flips required to characterize each technique depends on the number of essential bits found in the design, which is proportional to its resource usage. Hence the differences in the number of bit-flips performed. All three techniques have achieved low percentages of false negatives in the tests and thus can detect most errors. Table IV concentrates on the non-masked errors and summarizes the relation between detected and undetected errors as well as between true positives (that is, error detections caused by actual errors in the outputs) and false positives. With a relatively small loss of detection capabilities, a great advantage in resource usage can be achieved. This suggests that in applications in which error detection approaching 100% is not needed the N/4 technique would be more suitable to protect an FFT module against soft errors with a small resource overhead. Requiring fewer resources also makes the CED more effective, since it means fewer essential bits that can be upset, thus leading to fewer malfunctions (see Table III).
The aforementioned undesirable consequences of false positives in FPGAs must also be taken into consideration. As seen in Table III, both the Parseval SoS and N/4 techniques present a higher number of false positives than DMR due to the relative sizes and complexities of their error detection processes, the DMR checker being only a comparator. False positives result in an undesired reconfiguration of the entire FPGA, which means wasting programming time and energy. Hence SoS, with more than 45% of its error detections being false, would be less efficient than N/4 in terms of speed and energy saving (see Table IV).
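The four-way classification used in this section, and the two ratios discussed for the non-masked errors, can be sketched as follows. The function names are hypothetical and the counts in the usage note are made-up illustration values, not the measured results of Tables III and IV.

```python
from collections import Counter

def classify(error_produced, error_detected):
    """Map one injection outcome to one of the four cases listed above."""
    if error_produced and error_detected:
        return "detected"
    if error_produced:
        return "false_negative"
    if error_detected:
        return "false_positive"
    return "masked"

def summarize(outcomes):
    """Return (detection rate, share of detections that are false positives)."""
    counts = Counter(classify(p, d) for p, d in outcomes)
    produced = counts["detected"] + counts["false_negative"]
    flagged = counts["detected"] + counts["false_positive"]
    detection_rate = counts["detected"] / produced if produced else 0.0
    false_positive_share = counts["false_positive"] / flagged if flagged else 0.0
    return detection_rate, false_positive_share
```

With a hypothetical campaign of 6 detected errors, 2 false negatives, 4 false positives and 8 masked errors, `summarize` returns a detection rate of 0.75 and a false-positive share of 0.4. The second ratio is the one that determines how many reconfigurations are unnecessary.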
V. CONCLUSIONS
A novel Concurrent Error Detection technique for FFT implementations in SRAM-based FPGA devices has been proposed. It has been tested by injecting upsets in the FPGA configuration bits. This technique, with less resource overhead than that of the DMR or Parseval SoS approaches, achieves high error detection rates, while being less susceptible to SEUs producing errors. Moreover, since the number of false positives produced with this technique is smaller than that of Parseval SoS, the number of unnecessary reconfigurations of the device will be lower, resulting in a smaller waste of time and energy.
These results show that techniques that were ineffective as CED in ASICs may be effective when implemented in FPGAs. This is due to SEUs producing permanent changes in the circuit design, which results in errors that would otherwise be unnoticed propagating towards the outputs. They also highlight that the size of the CED is more important than in the case of ASICs, since bigger circuits lead to a higher number of false positives that require reconfigurations of the FPGA.
