Abstract-
I. INTRODUCTION
Field programmable gate arrays (FPGAs) have gained great interest in recent years as an enabling technology providing designers with previously unseen capability. The performance, capacity, and application of FPGAs have steadily increased over time to the point that the devices are not only capable of replacing the glue logic and management duties of ASICs, but can replace the functionality of entire processor boards. The trend of advanced reconfigurable blocks has evolved from simple programmable SRAM, to integrated hardware multipliers, to embedded hard-core processors. Xilinx's Virtex-II Pro and Virtex-4 FPGAs are two such families of devices that provide designers with a choice of one or two embedded PowerPC405 processors. These hard-core processors provide designers with a very powerful computing capability surrounded by programmable, reconfigurable logic gates to provide a virtually limitless array of interfaces to the rest of the spacecraft.
Some of the work described herein was performed by the Jet Propulsion Laboratory, California Institute of Technology under contract with the National Aeronautics and Space Administration (NASA) with funding from the NASA Electronics Parts Program (NEPP) and NASA ESR&T.
Commercial scaling trends continue to drive down feature size and reduce core voltage. Previous work [1] has shown that such sizing trends have reduced the static, saturated cross sections of the processor registers and cache. This paper presents results from two generations of PowerPC processors.
Much work has gone into defining the inherently complicated task of testing processors [2]-[4]. In the case of the Xilinx devices, processor testing is further complicated because the processors are embedded in upsettable, programmable logic. This programmable logic is both the device's greatest strength and, if not properly mitigated, its Achilles' heel. Proper mitigation techniques such as Triple Modular Redundancy (TMR) and partial reconfiguration must be correctly implemented in order to take advantage of the processing power [5]-[6].
If properly mitigated, the surrounding FPGA fabric provides a means to extract a great deal of processor information quickly and efficiently. With the fabric mitigated, any remaining system error can be attributed, with high probability, to the processors themselves, which cannot be TMR'ed.
Most microprocessor testing falls into one of two domains, static or dynamic, based upon the level of processor activity. Static testing is done with minimal or no processor activity. At an abstract level, static testing of a processor is done by writing to available registers and cache, irradiating while inactive (unclocked), and reading back the data after irradiation. Conversely, dynamic testing is done while clocking the processor, preferably at full speed, with known inputs while recording outputs for comparison against expected values during irradiation, usually using processor benchmarking software to evaluate overall performance and device characteristics. Although specific upsets cannot be segregated and counted during dynamic testing, output behaviors seen during irradiation are counted and categorized (e.g., "hangs"). Neither static nor generic dynamic testing will produce accurate space error rate predictions.
Static testing will produce a worst-case, conservative bound, which is defined in this paper. Dynamic testing provides insight into the level of conservatism, but will be completely application specific.
Upset Characterization and Test Methodology of the PowerPC405 Hard-Core Processor Embedded in Xilinx Field Programmable Gate Arrays
Gregory R. Allen, Member, IEEE, Gary M. Swift, Member, IEEE, and Greg Miller
II. DEVICE TESTED

A. Device Description
All Virtex-II Pro and Virtex-4 testing was performed using the Mil/Aero-grade devices, specifically the XQR2VP40-FF1152 and XQR4VFX60-FF1148, respectively. Both devices come in a flip-chip BGA package, in which the device circuitry is upside-down, facing the package's ball contacts. This does not cause great difficulty when irradiating with protons. However, because of the limited range of the majority of heavy ion beams available at the Texas A&M and Lawrence Berkeley National Laboratory accelerators, heavy ion testing required removing the package lid and thinning the substrate to less than 100 µm.
In addition to a myriad of user-configurable features (described below), both devices contain embedded IBM PowerPC 405 cores. The PPC405 is a 32-bit, integer-only microcontroller cousin of the PPC750, with which it shares its instruction set.
1) Virtex-II Pro
The 2VP40 device from the Xilinx Virtex-II Pro family includes two 300 MHz-capable IBM PPC405 processors. The device is fabricated in a 130 nm CMOS process with commercial devices on bulk CMOS or a thick epitaxial layer and the Mil/Aero devices on thin-epitaxial CMOS. The device has an operating core voltage of 1.5V. In addition to the pair of PPC cores and almost 5,000 configurable logic blocks, this FPGA has the following features available for inclusion in a design: 3.5Mb of user RAM, 192 hardware multipliers, 804 I/O's with 12 so-called RocketIO Transceivers capable of speeds up to 3 Gb/s, and 8 digital clock manager blocks.
2) Virtex-4 FX
The V4FX60 device from the Xilinx Virtex-4 family is fabricated in a 90 nm CMOS process, with an internal core voltage of 1.2V. The FX sub-family of Virtex-4 includes up to two IBM PowerPC405 processors specified to run at speeds up to 450 MHz. Commercial devices are fabricated on a thick epitaxial layer or on bulk CMOS, with the Mil/Aero devices on a thin epitaxial CMOS layer. The user resources of this device are comparable to those of the 2VP40, including approximately 4700 configurable logic blocks, 128 DSP slices, 4.2Mb of user RAM, 576 user I/O's, and 12 digital clock managers.
B. Storage Elements
An inventory of the upsettable storage elements and proposed mitigation techniques is shown in Table 1. The vast majority are outside the processor cores, in a) the configuration cells, b) design-level memory, or c) design-level flip-flops. Mitigation in the form of triplicated design functions and storage combined with active configuration scrubbing can potentially make an application very robust in spite of configuration SRAM upsets. The bits inside the processor cannot be protected in this way, so an upset in one of them is more likely to cause a system error or malfunction. Consider, for example, an upset in the program counter. Although relatively few in number, processor bits are therefore much more important per upset.
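The triplication-plus-scrubbing idea described above can be illustrated with a minimal software sketch. This is not the hardware TMR implementation used in the FPGA fabric; the function names `majority_vote` and `scrub` and the example word are hypothetical, chosen only to show how a bitwise 2-of-3 vote masks a single upset and how rewriting the voted value keeps upsets from accumulating.

```python
from typing import List


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies: List[int]) -> List[int]:
    """Rewrite all three copies with the voted value.

    A single upset copy is restored, so a later upset in a
    different copy can still be outvoted.
    """
    voted = majority_vote(copies[0], copies[1], copies[2])
    return [voted, voted, voted]


# Hypothetical example: one of three copies suffers a single bit flip.
golden = 0xDEADBEEF
copies = [golden, golden ^ (1 << 7), golden]

voted_value = majority_vote(*copies)   # the flipped bit is outvoted
copies = scrub(copies)                 # all three copies agree again
```

A single processor core offers no second and third copy of its internal state to vote against, which is why the text treats the processor bits as the remaining unmitigated exposure.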
III. TEST METHODOLOGY
Testing for microprocessor upsets and calculating space error rates for arbitrary flight software is not straightforward; thus, there have been many different test methodologies utilized or proposed [2] . At present, microprocessor testing mostly falls into two categories, dubbed static and dynamic. Static testing treats the processor like a memory device (albeit accessed with some difficulty); that is, a pattern is loaded and, after irradiation, inspected for upsets. In contrast, during dynamic testing, a benchmark program with fixed inputs and known outputs is executed repeatedly and the results of each iteration are categorized with regard to correctness and other behaviors, for example, not completing. Neither set of results translates tractably and correctly into in-flight error rate predictions, but static results yield a conservative upper bound while dynamic results give an idea of the degree of that conservatism, at least, for an artificial software instance or instances.
The static results presented here are really pseudo-static, meaning the processor is clocked at full speed but with its activity minimized. The processor sits in a very tight infinite loop for more than 99.9% of the irradiation time. A tiny fraction of the time is spent storing snapshots of the registers in a strip chart in external memory; additional details of this test method, a.k.a. "do little," can be found in [4]. The reason for pseudo-static testing (as opposed to strictly static) is to provide visibility into the evolution of the registers. For example, an upset in the internal state machine may trigger a processor event that writes to one or several registers, invalidating the upset data. Moreover, a device event such as a Single Event Functional Interrupt (see [7]-[8] for full definitions and rates) may cause a loss of communication to the FPGA, or potentially force a system reconfiguration. Events such as these emphasize the need to strip chart the registers periodically in order to gain confidence in the register data.
The focus during static testing is usually on a particular portion of the storage elements within the processor, as opposed to the system as a whole. Although the processor is a hard core, it is surrounded by soft, configurable Intellectual Property (IP) that is necessary to ensure correct functionality of the processor. Some of these components include a reset block, digital clock manager (DCM), processor local bus (PLB), general purpose I/O, and processor JTAG. Certain IP, such as the reset block and DCM, can be removed from the device under test (DUT) and placed into a service FPGA that exercises the DUT. IP that has timing-related constraints, such as the PLB, must remain in the DUT and be triplicated and scrubbed. However it is accomplished, the general rule is that any soft IP should be either removed or mitigated. Isolating and mitigating the system as much as possible reduces system errors and increases both the efficiency of data collection and the reliability of the resulting data.
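As an illustration of how a register strip chart might be post-processed, the sketch below tallies new bit flips from one snapshot to the next, separating zero-to-one from one-to-zero transitions. The data layout (a list of 32-bit register values per snapshot) and the names `count_flips` and `tally_strip_chart` are assumptions made for this example; the actual test software is described in [4].

```python
from typing import List, Sequence, Tuple

WORD_BITS = 32  # the PPC405 general purpose registers are 32 bits wide
WORD_MASK = (1 << WORD_BITS) - 1


def count_flips(before: int, after: int) -> Tuple[int, int]:
    """Return (zero_to_one, one_to_zero) bit flips between two register values."""
    diff = (before ^ after) & WORD_MASK
    zero_to_one = bin(diff & after).count("1")   # bit changed and is now 1
    one_to_zero = bin(diff & before).count("1")  # bit changed and was 1
    return zero_to_one, one_to_zero


def tally_strip_chart(pattern: Sequence[int],
                      snapshots: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Accumulate new flips snapshot by snapshot, starting from the loaded pattern.

    Comparing each snapshot with the previous one avoids counting a
    persistent upset again in every later snapshot.
    """
    previous: List[int] = list(pattern)
    ups = downs = 0
    for snapshot in snapshots:
        for before, after in zip(previous, snapshot):
            up, down = count_flips(before, after)
            ups += up
            downs += down
        previous = list(snapshot)
    return ups, downs


# Hypothetical run: four registers loaded with ones, four with zeros.
pattern = [0xFFFFFFFF] * 4 + [0x00000000] * 4
snap1 = list(pattern)                      # no upsets yet
snap2 = list(pattern)
snap2[0] ^= 1 << 3                         # a one flips to zero in register 0
snap2[5] ^= 1 << 0                         # a zero flips to one in register 5
snap3 = list(snap2)                        # upsets persist, nothing new

result = tally_strip_chart(pattern, [snap1, snap2, snap3])
```

A sudden change across many registers in a single snapshot, rather than isolated flips, would be the kind of processor or device event the strip chart is meant to expose.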
IV. STATIC TEST RESULTS
Heavy ion results for Virtex-II Pro are shown in Figs. 1 and 2, while proton results are shown in Figs. 3 and 4, for the general purpose registers (GPRs) and the data cache, respectively. Note that the statistical error bars for the cache data are much smaller than for the registers, almost as small as the plotting symbols; this is the result of over an order of magnitude more upsets in the cache data set. It is also interesting that, for the cache, the susceptibility of cells storing ones is indistinguishable from that of cells storing zeros, whereas register bits storing ones are almost an order of magnitude more susceptible than those storing zeros. Qualitatively, these observations are in agreement with previous work on the IBM PPC750FX [1].
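For readers unfamiliar with how such data points are formed, the following sketch computes a per-bit cross section as upsets divided by the product of fluence and number of bits, with counting-statistics error bars of approximately two sigma as described in the figure captions. The function name, the example counts and fluence, and the use of a 95% Poisson upper limit for zero observed upsets are assumptions for illustration, not the authors' stated procedure; beam fluence uncertainty is not included.

```python
import math
from typing import Tuple


def per_bit_cross_section(upsets: int, fluence: float,
                          bits: int) -> Tuple[float, float, float]:
    """Return (sigma, low, high) in cm^2/bit.

    upsets  : number of bit upsets counted in the run
    fluence : particles per cm^2 delivered during the run
    bits    : number of bits under test
    """
    scale = 1.0 / (fluence * bits)
    if upsets == 0:
        # No upsets observed: only an upper limit can be quoted.
        # ASSUMPTION: 95% Poisson upper limit, -ln(0.05), about 3 events.
        return 0.0, 0.0, -math.log(0.05) * scale
    err = 2.0 * math.sqrt(upsets)  # approximately two sigma on the count
    low = max(upsets - err, 0.0) * scale
    high = (upsets + err) * scale
    return upsets * scale, low, high


# Hypothetical run: 25 upsets in 32 registers x 32 bits at 1e7 ions/cm^2.
GPR_BITS = 32 * 32
sigma, low, high = per_bit_cross_section(25, 1.0e7, GPR_BITS)

# A run with no upsets yields only the top of an error bar.
sigma0, low0, high0 = per_bit_cross_section(0, 1.0e7, GPR_BITS)
```

Because the relative error scales as one over the square root of the count, a data set with over an order of magnitude more upsets, as for the cache, has error bars more than three times smaller.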
Several models for converting heavy ion data to predicted proton data were considered, and the best results were compared to the actual proton results, similar to the work in [9]. As seen in Fig. 5, the PROFIT model [10] and the Edmonds model [11] both underestimate the actual proton results for the data cache. While the PROFIT model has two adjustable parameters helping the fit, the Edmonds model has none and very rarely underpredicts; these data are exceptional. In fact, a similar comparison for the GPR results, shown in Fig. 5, yields closer agreement, and the model is somewhat conservative when compared to the actual data.
Heavy ion results for Virtex-4 FX general purpose registers are shown in Fig. 7. Note that the saturated cross sections and LET thresholds remained nearly identical to the Virtex-II Pro GPR results. It is also noteworthy that the asymmetry seen in the susceptibility of storing ones and zeros in the GPRs has flipped, i.e., zeros are more susceptible than ones. The bit susceptibility appears identical regardless of whether ones or zeros are stored. Note that the main contribution to the error bars is uncertainty in the beam fluence measurement.
Fig. 3. Proton results for the general purpose registers (GPRs). Points are offset slightly in energy for readability. Counting statistic error bars are shown (approximately two sigma); note that for observations without any GPR upsets (i.e., the three lower energies when zeros were the contents), only the tops of the error bars can be shown.
V. CONCLUSION
The pseudo-static results presented here for the PPC405 hard-core processor(s) embedded in the Virtex-II Pro and Virtex-4 FX families of Xilinx FPGAs show consistency with models and with earlier data on other PPC technology nodes. An upper bound on the error rates in geosynchronous orbit can be calculated by assuming all the register and cache bits are utilized with a duty cycle of 100%; this yields approximately two register upsets per PPC405-year and two cache upsets per month. Because reloading and rebooting should take at most a few seconds, this rate is low enough that, at least for non-critical flight applications, the PPC405 core's viability is clear: upsets are unlikely to be a significant operational intrusion.
In combination with higher level mitigation schemes such as running two instances in lockstep [14]-[16], even most critical applications may be robust enough.
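The operational impact of the upper-bound rates quoted above can be checked with simple arithmetic. The sketch below assumes, pessimistically, that every register or cache upset forces a reload and reboot, and takes "a few seconds" to mean 5 s; both are assumptions made for this example, and the variable and function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Upper-bound rates quoted in the text (geosynchronous orbit, 100% duty cycle).
register_upsets_per_year = 2.0        # about two per PPC405-year
cache_upsets_per_year = 2.0 * 12      # about two per month

# ASSUMPTION: "at most a few seconds" is taken as 5 s per reload and reboot.
reboot_seconds = 5.0


def downtime_fraction(upsets_per_year: float, recovery_seconds: float) -> float:
    """Fraction of time lost if every upset costs one recovery period."""
    return upsets_per_year * recovery_seconds / SECONDS_PER_YEAR


total_upsets_per_year = register_upsets_per_year + cache_upsets_per_year
fraction = downtime_fraction(total_upsets_per_year, reboot_seconds)

print(f"{total_upsets_per_year:.0f} upsets/year, "
      f"downtime fraction {fraction:.1e}")
```

Under these assumptions the processor would be unavailable for roughly two minutes per year, a fraction of a few parts per million, which is consistent with the conclusion that upsets are unlikely to be a significant operational intrusion.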
