Memory scrubbing is used to mitigate single event upsets (SEUs) on susceptible devices. In the case of FPGAs, configuration memory scrubbing is generally used in conjunction with triple modular redundancy (TMR) to increase the reliability of FPGA systems in spaceborne applications. Reported solutions require a subsystem that can read and write the configuration memory and retrieve a golden bitstream from a "safe storage" for scrubbing whenever an error is detected. An alternative to this solution is to implement error detection and correction codes. These codes usually require redundant bits to be embedded within the data being protected, which is particularly difficult in the case of a configuration bitstream, since it is created automatically by CAD tools. In this paper we present an alternative error correction and detection implementation that overcomes such difficulties and allows fast local scrubbing without the need to store a golden bitstream somewhere safe. This implementation is a processing peripheral tied to a scrubber, labeled Femto, that is currently being implemented in a RHBD S-ASIC.
I. Introduction
FPGA-based reconfigurable systems have grown considerably in popularity for space-based applications owing to their flexibility and their ability to multiplex different hardware configurations in real time. Wider use of COTS FPGAs, especially in aerospace applications, is limited by their susceptibility to Single Event Upsets (SEUs).
The traditional approach to Single Event Upset (SEU) mitigation on commercial parts is triple modular redundancy (TMR). Although proven effective 1 , this method adds logic overhead and incurs a penalty in power consumption and processing speed. As shown in Ref. 2 , it is also vulnerable to multiple-bit upsets, which are becoming more frequent as geometries shrink in modern devices. An alternative approach, called "scrubbing", relies on simply reloading the configuration memory frames at defined time intervals. This approach is possible on FPGA devices that support partial reconfiguration, such as Xilinx's Virtex II, Spartan 3, Virtex 2-Pro, Virtex 4 and Virtex 5. Scrubbing protects against the accumulation of upsets in the configuration memory and, in combination with TMR, improves the overall system's reliability. Different scrubbing system designs can be devised by varying the configuration memory access port used, the position of the scrubber with respect to the system being scrubbed, and the strategies used to monitor and scrub the configuration memory. The suitability of any of these design variations for a given application is driven by factors such as mission criticality and hardware and power limitations. The ideal scrubbing solution would be flexible enough to be deployed in every possible scenario.
The scrubber labeled Femto 3 is a programmable state machine that supports a reduced set of 14-bit instructions, which provide the flexibility to program and execute different scrubbing strategies. The scrubber currently supports Virtex 4 devices, but its programmability permits its extension to future devices with compatible configuration memory access ports. Femto's architecture is modular, which provides flexibility for its implementation and deployment in different hardware setups, including legacy setups that were not originally designed to support scrubbing. The architecture allows different processing units to be attached to Femto as peripherals. In this paper we present an implementation of an error detection and correction code for Femto. This implementation differs from previously reported solutions 4, 5 in that it does not require the storage of a golden bitstream for scrubbing.
II. Background
The basic functionality of a scrubber is to access the configuration memory and overwrite it periodically with a golden copy of the original bitstream to correct any possible bit flip. This method is called "blind scrubbing". A more advanced feature is to read back the configuration memory and detect errors, so that sections of the configuration memory are overwritten selectively instead of blindly. Blind scrubbing is the most common scrubber implementation reported in the literature. Its main advantage is simplicity, since overwriting the configuration memory without discrimination is a relatively simple task. This method has two main drawbacks: a significant overhead in power consumption (writing the configuration memory consumes measurable power 6 ) and in performance. The latter is possibly the more important. Since no discrimination is made as to which sections of the configuration memory are scrubbed, some upset bits may remain in the wrong state for a long time while waiting for the scrubber to complete its pass over the whole configuration memory. In some instances, blind scrubbing will also prolong a system's down time while the memory space is scrubbed.
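The difference between the two strategies can be sketched with a toy model. This is not Femto code: the configuration memory is modeled as a Python list, the "frames" are arbitrary 16-bit values, and the function names `blind_scrub` and `readback_scrub` are hypothetical. The comparison against the golden list is only a stand-in for a real error check; the sketch shows that blind scrubbing writes every frame, while readback scrubbing writes only the frames found to be in error.

```python
def blind_scrub(memory, golden):
    """Overwrite every frame with its golden copy; return the number of frame writes."""
    writes = 0
    for i, frame in enumerate(golden):
        memory[i] = frame
        writes += 1
    return writes

def readback_scrub(memory, golden):
    """Read back each frame and rewrite only those that differ from the golden copy."""
    writes = 0
    for i, frame in enumerate(golden):
        if memory[i] != frame:      # stand-in for an error detection step
            memory[i] = frame
            writes += 1
    return writes

golden = [0xA5A5, 0x0F0F, 0x3C3C, 0xFFFF]   # hypothetical 16-bit "frames"
upset = list(golden)
upset[2] ^= 0x0010                          # single bit flip in frame 2

mem_a, mem_b = list(upset), list(upset)
print("blind writes:", blind_scrub(mem_a, golden))
print("readback writes:", readback_scrub(mem_b, golden))
```

In this model both strategies restore the memory, but the number of write operations, and hence the write power, scales with the memory size in one case and with the number of corrupted frames in the other.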
Reading back the configuration memory to detect errors before scrubbing is more complex but more efficient. Selective scrubbing saves power and allows the designer to implement scrubbing strategies that provide higher reliability to critical sections of a system and lower reliability to less important sections. Additionally, reading back the configuration memory to find errors provides valuable upset statistics for spaceborne applications.
In order to detect errors in the contents of any memory, some kind of parity bits or error correction and detection codes have to be embedded within the memory contents. In the case of Xilinx's FPGAs, SECDED (Hamming code) parity bits are used. Xilinx provides a primitive (FRAME_ECC) in some of their devices (Virtex 4 and Virtex 5) that examines the data read back from the configuration memory and uses the SECDED bits to signal and correct errors in the bitstream. This primitive significantly simplified the design of a scrubber with read back capabilities. Unfortunately Xilinx's support for this feature has been suspended.
As a result, the scrubber must now read back the configuration memory, calculate some kind of error detection code and compare it against a golden code calculated and stored previously. It is the need to initially calculate and store golden error detection codes that significantly complicates the scrubber design. What is more, a "safe storage" must be provided to hold golden copies of the bitstream for scrubbing. The need for this extra storage would be eliminated if the user were allowed to insert parity or error correction bits within the memory content, which would allow error detection and correction without storing data locally in the scrubber.
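A minimal sketch of this detect-by-comparison step follows. It assumes a CRC-16/CCITT polynomial as provided by Python's `binascii.crc_hqx`; the polynomial actually used by the scrubber is not stated here, and the frame contents and the helper name `find_corrupted` are hypothetical.

```python
import binascii

def crc16(frame: bytes) -> int:
    """CRC-16/CCITT from the standard library; the polynomial is an assumption."""
    return binascii.crc_hqx(frame, 0)

# Hypothetical golden frames, known to be good at configuration time.
golden_frames = [bytes([i] * 8) for i in range(4)]
golden_crcs = [crc16(f) for f in golden_frames]   # small table of reference codes

def find_corrupted(readback_frames, reference_crcs):
    """Return the indices of frames whose CRC no longer matches the reference."""
    return [i for i, f in enumerate(readback_frames)
            if crc16(f) != reference_crcs[i]]

readback = list(golden_frames)
flipped = bytearray(readback[1])
flipped[3] ^= 0x04                  # simulate a single event upset in frame 1
readback[1] = bytes(flipped)

print(find_corrupted(readback, golden_crcs))
```

The CRC table only locates the corrupted frame. Repairing it still requires a golden copy of that frame, which is the storage requirement discussed above.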
Alternatively, one can devise an error correction and detection code that allows a corrupted bit to be reconstructed from multiple error correction bits that do not need to be stored within the data being protected. This is the solution that we present in this paper.
III. Femto's architecture
A generic Femto block diagram is shown in Figure 1 . It shows the basic architecture of a Femto-based system. Besides the interface to an instruction memory that holds Femto's program, a double-address bus interface is provided to attach peripherals to Femto. A basic system is composed of Femto and at least two peripherals: one to calculate and store error detection and correction codes, and a second to interface with the memory being scrubbed. The double-address bus allows Femto to arbitrate data from one peripheral to the other simultaneously. JTAG, SelectMap and ICAP interfaces are currently available for Femto. To calculate error detection codes, a CRC16 peripheral is currently available, and a peripheral for the binary BCH 7 error correcting code described in this paper is currently being developed.
IV. Error detection and correction
A representative example of the scrubber's operation is shown in Figure 2 . The most important operations in this diagram are error detection and error correction. Correcting an error once it has been detected is an involved operation. The designer has essentially two alternatives (also depicted in Figure 5 ). The first is to use the redundant or parity bits of the error detection and correction code to infer the location of the error within a frame and correct it. The second, for cases in which the redundant bits allow only error detection and not error correction, is to fetch a golden version of the frame from a safe storage and use it to overwrite the corrupted frame.
Fetching a golden frame for scrubbing has the disadvantage of requiring a local and safe (not susceptible to SEU) storage, which increases the overall system's complexity. However, the complexity can be eased by using the storage already present on FPGA hardware platforms to hold the initial bitstream. PROMs are generally used for this purpose.
Commonly used codes that allow error detection and correction require the insertion of redundant bits within the information being protected. As mentioned earlier, SECDED (Hamming code) parity bits are embedded by default within the bitstream generated for Xilinx FPGA devices. In theory, these redundant bits allow a scrubber to detect up to two errors and correct up to one error within a frame. However, Xilinx has not released enough formal documentation for us to take advantage of this feature.
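The "correct one, detect two" behavior can be illustrated with a textbook extended Hamming (8,4) code. This is a generic sketch only: it does not reproduce the parity layout of a Xilinx configuration frame, and the function names are hypothetical.

```python
def secded_encode(data4: int) -> list:
    """Encode 4 data bits into an 8-bit extended Hamming word.
    Index 0 holds the overall parity; indices 1..7 are Hamming(7,4) positions."""
    word = [0] * 8
    for pos, bit in zip((3, 5, 6, 7), range(4)):     # data bit positions
        word[pos] = (data4 >> bit) & 1
    for p in (1, 2, 4):                               # parity bit positions
        word[p] = sum(word[i] for i in range(1, 8) if i & p) % 2
    word[0] = sum(word) % 2                           # overall parity bit
    return word

def secded_decode(word):
    """Return (status, word), where status is 'ok', 'corrected' or 'double'."""
    syndrome = 0
    for i in range(1, 8):
        if word[i]:
            syndrome ^= i                 # XOR of the positions of the set bits
    overall = sum(word) % 2
    if syndrome == 0 and overall == 0:
        return "ok", list(word)
    if overall == 1:                      # odd parity: treat as a single error
        fixed = list(word)
        fixed[syndrome] ^= 1              # syndrome 0 points at the overall parity bit
        return "corrected", fixed
    return "double", list(word)           # even parity, nonzero syndrome: detect only

good = secded_encode(0b1011)
one_flip = list(good); one_flip[5] ^= 1
two_flips = list(good); two_flips[2] ^= 1; two_flips[6] ^= 1

print(secded_decode(one_flip)[0])
print(secded_decode(two_flips)[0])
```

A single flipped bit is located by the syndrome and repaired, while two flipped bits leave the overall parity even and can only be reported.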
We now introduce an alternative solution that allows the errors in a frame to be corrected without accessing a storage for golden frames. This feature allows Femto to achieve very fast, focused scrubbing at lower power. The proposed solution does add complexity and latency to the scrubber, but the benefits of fast per-frame scrubbing, with no need to access the PROM or other storage, and the resulting lower-power operation far outweigh these drawbacks.
In the discussion below we use binary BCH [16] error correcting codes. However, the methodology does not preclude the use of non-binary, symbol-based codes such as Reed-Solomon codes. BCH codes allow the code to be designed to correct a specified number of errors 't'. A BCH code is specified as (n,k,t) by the code size 'n', the number of information bits 'k' and the number of bits that can be corrected 't'. The code size is $n = 2^m - 1$, so in practice we specify 'm'. The number of information bits 'k' then follows from 'n' and 't': a binary BCH code needs at most $mt$ check bits, so $k \ge n - mt$. Because 'n' is restricted to values of the form $2^m - 1$, 'k' will generally not equal the frame size k', and 'm' must be chosen so that $k \ge k'$. This means that for each frame 'j' of length k' with k' < k, we have to add pad bits (these could be zeroes or a random sequence of '1's and '0's) to the frame before encoding it with the BCH coder, which produces 'n' bits: the 'k' information bits plus (n-k) check bits. We will address this issue in later paragraphs.
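The parameter selection described above can be worked through numerically. The sketch below assumes the usual bound for binary BCH codes of at most m·t check bits and treats it as an equality, which matches the (1023, 983, 4) code with 40 check bits reported in this paper. The function name `bch_parameters` and the 900-bit frame size are hypothetical.

```python
def bch_parameters(frame_bits: int, t: int) -> dict:
    """Pick the smallest m such that a t-error-correcting binary BCH code of
    length n = 2^m - 1 has room for the frame, assuming n - k = m * t."""
    m = 2
    while (2**m - 1) - m * t < frame_bits:
        m += 1
    n = 2**m - 1
    k = n - m * t
    return {"m": m, "n": n, "k": k,
            "check_bits": n - k, "pad_bits": k - frame_bits}

print(bch_parameters(983, 4))   # frame exactly fills the information bits
print(bch_parameters(900, 4))   # shorter frame: the remainder is padding
```

For a 983-bit frame and t = 4, the search stops at m = 10, giving n = 1023, k = 983 and no padding. A 900-bit frame uses the same code with 83 pad bits.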
With the above observation in mind, consider Figure 3, which shows the initial setup in which both the BCH check bits and the CRC are calculated and stored for each golden frame. Note that the check bits and the CRC are small compared to the frame size. In the figure, we add the pad bits to the k'-bit frame to form 'k' information bits, which are encoded into an 'n'-bit BCH code word. We select the (n-k) check bits and store them in the check bit register for frame 'j'. We also calculate the CRC of the k'-bit frame and store it in the CRC register for frame 'j'. Thus, for each protected frame we have a check bit register and a CRC register, both calculated from the uncorrupted frame.
American Institute of Aeronautics and Astronautics
Figure 2 . Simplified data flow diagram of the scrubber's operation, including the two alternatives described in this section to acquire the golden frame for scrubbing (in the figure, N is the current frame index and M is the maximum frame index).
During operation, when the frames are subjected to radiation, the scrubber reads each frame, corrects any errors using the stored check bits and calculates the CRC of the corrected frame. If the CRC matches the reference CRC, the frame is replaced with the corrected frame (see Figure 4). If the CRC does not match, more than 't' bits were in error. In this case the frame has to be rewritten from the PROM, which entails a penalty in both power consumption and latency. Increasing the error correcting capability 't', at the cost of added complexity in the scrubber, reduces the probability of a mismatch between the corrected frame's CRC and the reference CRC. One feature of this approach is that whenever the CRCs do not match, the event can be logged together with the frame number. These data are invaluable for analyzing the impact of radiation on the system.
We have synthesized the BCH (1023, 983, 4) encoder and decoder using the ViASIC IBM RHBD 90nm 9LP process.
The gate count for the encoder is 282 gates and for the decoder is 9572 gates. The number of check bits is 40 bits in this case.
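The setup and correction flow described in this section can be sketched on a toy scale. The example below is an illustration under several assumptions, not the synthesized design: it uses the small binary BCH (15, 7, 2) code instead of (1023, 983, 4), a 5-bit frame with two zero pad bits, a CRC-16/CCITT from Python's `binascii.crc_hqx`, and an exhaustive search over error patterns in place of a hardware algebraic decoder. All function names are hypothetical.

```python
import binascii
from itertools import combinations

G = 0b111010001          # generator polynomial of the binary BCH(15, 7) code, t = 2
N, K, T = 15, 7, 2       # toy code; the paper's code is (1023, 983, 4)
K_PRIME = 5              # hypothetical frame size; K - K_PRIME zero pad bits

def check_bits(info: int) -> int:
    """Remainder of info(x) * x^(N-K) modulo g(x) over GF(2): the N-K check bits."""
    rem = info << (N - K)
    for shift in range(K - 1, -1, -1):
        if rem & (1 << (shift + N - K)):
            rem ^= G << shift
    return rem

def crc16(frame: int) -> int:
    """CRC of the one-byte toy frame; the polynomial choice is an assumption."""
    return binascii.crc_hqx(bytes([frame]), 0)

def bch_correct(frame: int, stored_check: int):
    """Search for at most T flipped frame bits that explain the check-bit mismatch.
    Only the K_PRIME frame bits are searched: pad and check bits are trusted."""
    syndrome = check_bits(frame) ^ stored_check
    for weight in range(T + 1):
        for positions in combinations(range(K_PRIME), weight):
            err = sum(1 << p for p in positions)
            if check_bits(err) == syndrome:
                return frame ^ err
    return None

def scrub(frame, stored_check, ref_crc, fetch_golden):
    """Correct locally when possible; otherwise fall back to the golden copy."""
    fixed = bch_correct(frame, stored_check)
    if fixed is not None and crc16(fixed) == ref_crc:
        return fixed, "local"
    return fetch_golden(), "prom"

golden = 0b10110                                # K_PRIME-bit frame, pad bits are zero
chk, ref = check_bits(golden), crc16(golden)    # initial setup, kept in the scrubber

print(scrub(golden ^ 0b00101, chk, ref, lambda: golden))   # two upsets
print(scrub(golden ^ 0b01101, chk, ref, lambda: golden))   # three upsets
```

With two upsets the frame is repaired from the stored check bits alone. With three upsets, which exceeds t for this toy code, no valid pattern is found and the sketch falls back to the golden copy, mirroring the PROM path in the text.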
The ViASIC Structured ASIC Fabric, based on the 90 nm IBM 9LP process, supports temporal latch technology [Mavis, Eaton] for the registers that store the pre-computed check bits and CRC, thus preventing upsets in these registers in a radiation environment.
BCH error correction has been suggested for memories to increase reliability 8 . Specific designs tailored to the number of expected errors can also be explored 9 . We can also explore non-binary BCH codes, which minimize the number of pad bits and may also reduce complexity. Nevertheless, binary BCH codes provide great flexibility in specifying 't', the number of bits that can be corrected.
We should point out that the application of BCH error correction to the scrubber differs from its use in a communication link. In a communication link, any of the 'n' bits of the code word can be hit by a bit error. In our proposal, only the k' frame bits are subject to bit errors. The check bits are not affected, since they can be stored in the radiation-hardened scrubber chip, and the pad bits are not affected, since they are deterministic and are generated in the scrubber chip.
V. Conclusion
We present an alternative solution for the traditional approach of storing golden copies of a bitstream to scrub an FPGA's configuration memory. The error correction and detection mechanism we propose for this implementation is able to reconstruct a frame and has enough flexibility to account for the probability of multiple upsets in a single frame. A Femto scrubber equipped with this mechanism is currently being implemented in a radiation hardened S-ASIC 10 .
