The SpaceCube system is suitable for most mission applications, particularly those that are computationally and data intensive such as instrument science data processing. We will show how the SpaceCube hybrid processing architecture is used to meet data processing performance requirements that traditional flight processors cannot meet. Each mission has had varying degrees of data processing complexity, performance requirements, and external interfaces. We will show the methodology used to minimize the changes required to the physical hardware, FPGA designs, embedded software interfaces, and testing. This paper will summarize significant results as they apply to each mission application. In the HST-SM4 application we utilized the FPGA resources to accelerate portions of the image processing algorithms more than 25 times faster than a standard space processor in order to meet computational speed requirements. For the ISS radiation on-orbit demonstration, the main goal is to show that we can rely on the commercial FPGAs and processors in a space environment. We describe our FPGA and processor radiation mitigation strategies that have resulted in our eight PowerPCs being available and error free for more than 99.99% of the time over the period of four years. This positive data and proven reliability of the SpaceCube on ISS resulted in the Department of Defense (DoD) selecting SpaceCube, which is replacing an older and slower computer currently used on ISS, as the main avionics for two upcoming ISS experiment campaigns. This paper will show how we quickly reconfigured the SpaceCube system to meet the more stringent reliability requirements.

U.S. Government work not protected by U.S. copyright

SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC) [1]. The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of science data processing algorithms by distributing computational functions among the elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, event/feature detection, data mining and real time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA configuration, either through file transfers commanded from the ground or autonomously in response to detected events/features in the instrument data stream.
Background
The SpaceCube processing system was started at GSFC in 2006 with Internal Research and Development (IRAD) program funding [2] .
A series of internal prototype demonstrations to NASA officials showcased the system's computational power and its inherent reconfigurability advantages over typical space processors. NASA recognized the clear potential of the technology, and provided the funding needed to increase the technology readiness level (TRL) for space flight applications. Specifically, the Hubble Space Telescope Servicing Mission 4 management team infused SpaceCube as the main avionics for an experimental payload called Relative Navigation Sensors (RNS) [3]. The use of SpaceCube within the RNS system will be described in detail later in this paper.
The version of the SpaceCube that was initially developed in the 2006-2009 timeframe is known as SpaceCube v1.0. Follow-on versions have been developed [1]; however, the design and use of SpaceCube v1.0 will be the focus of this paper.
HYBRID FLIGHT COMPUTING
There is a growing need for higher performance processing systems for space.
Instrument precision and speed capabilities are rapidly evolving, which levies tougher electrical interfacing and data bandwidth requirements on the computing node of the system. In addition, on-board processing of the data products, in some cases in real-time, is now a common requirement.
On-board processing improves system efficiency and functionality in two areas. First, by allowing the spacecraft to preprocess data products on board, a smaller or compressed data volume per data set can be sent to ground, which increases the amount of time an instrument can be turned on and collecting data. It is typical for high data rate science instruments to constrain their data collection to 10-20% of the mission time to fit within the limited downlink bandwidth. This problem continues to grow as instrument capabilities increase. Second, it enables applications on board the spacecraft to make autonomous decisions based on the processed data products. This ability opens up a much more challenging range of mission objectives that can be targeted for space applications.
Typical space processing systems generally consist of a single radiation hardened processor such as the BAE RAD750, Aeroflex LEON3FT, BroadReach BRE440, or General Dynamics ColdFire, which deliver less than 300 DMIPS. These standard processing systems are very good at providing general services such as Command and Data Handling (C&DH), Guidance and Navigation Control (G&NC), and simple instrument control. These processing systems are not good candidates for applications that require implementing fast computations of complex algorithms on a high bandwidth or large volume data source.
Another common component found in typical space processing systems is the anti-fuse FPGA, which generally has very good radiation immunity. The corresponding circuit board and FPGA architectures are designed for a set of very specific mission requirements. However, these architectures are very hard to design and intrinsically expensive to change such that they are portable to multiple missions, dynamic functional requirements, or new post launch mission objectives or corrections.
A new approach is needed to meet the increasing challenges required by space processing systems. A hybrid computing system that combines multiple processors, reconfigurable FPGAs, and flexible interface options with a modular architecture is the solution that will bridge the gap between today's avionics requirements and yesterday's typical stand-alone sequential processing architecture.
A hybrid computing architecture is able to retain the function of a multi-purpose computer that runs typical C&DH and G&NC. However, in addition to these types of tasks, it has the advantage of supporting computationally complex tasks that require FPGA co-processors to handle math such as FFTs, matrix manipulation, or parallel floating point operations, as well as implementing an advanced interface such as Camera Link, SpaceWire, or gigabit Ethernet, or a custom interface.
The modularity of such a system allows for quick adaptation to changing avionics requirements. A modular system, for example, can support adding a bulk memory card or a custom electrical interface, or expanding the I/O bandwidth. A modular and reconfigurable system yields a high probability of using the same basic avionics package for different mission applications, or follow-on missions, even if interface and computing requirements are drastically different.
SpaceCube fits the need of a hybrid, reconfigurable, modular space processing system. This paper will show how cost and schedule can be reduced by reusing the same basic system for new missions.
Reuse of hardware architecture greatly reduces the amount of up front Non Recurring Engineering (NRE) costs and time associated with building a new system with new requirements from the ground up.
SPACECUBE v1.0 DESCRIPTION
The SpaceCube v1.0 system is a compact, modular, low power, reconfigurable multiprocessing platform for space flight applications demanding extreme computational capabilities and/or high data rate/volume interfaces. The SpaceCube v1.0 processing system is based on the Xilinx Virtex-4 FX60 FPGA that includes two embedded hard IP PowerPC405 processors.
This specific FPGA was the subject of radiation testing and characterization by many groups including, but not limited to, the Xilinx Radiation Test Consortium and the GSFC Radiation Effects and Analysis Group [10] [11] [12] [13] [14] [15] [16].
The SpaceCube design leverages this work to properly mitigate radiation effects within the system, as will be discussed later in this paper.
A. Modular Stacking Architecture
The SpaceCube v1.0 mechanical design uses a custom stacking architecture. The system is comprised of various slices that are stacked together using a 122-pin connector from IEH Corporation [7]. The system uses a dual redundant I2C bus for low data rate transfers between all cards in the stack. Each card is given a unique address on the bus. The base system requires a power slice and a processor slice. This architecture allows for adding mission unique cards, if necessary. Four rods are used to hold the box together once all slices have been mated together. Figure 1 depicts the SpaceCube v1.0 modular architecture. This version of the system required five slices (2 power, 2 processor, 1 I/O). Figure 2 shows a picture of the flight box that corresponds to the model in Figure 1. This configuration of the system weighs 7.5 lbs. and is 5-in. x 5-in. x 7-in. in size [9]. Each circuit board within the system is 4-in. x 4-in. in size. A mechanical tray holds the card in place and allows the stacking connector to protrude through the bottom of the slice. The card edges are bolted down to the respective slice enclosure. In addition to the structural mount, the card edge is also the thermal interface for each card. Figure 3 shows the flight processor card slice enclosure.
B. Power Slice Design
The power slice consists of two circuit boards. The Low Voltage Power Card (LVPC) has the typical EMI filter and DC/DC components found in space flight power supplies. The LVPC will accept 28V +/-8V and provide 5.0V, 3.3V, 2.5V, 1.5V, and +/-12V to the stacking connector. On power-up, 2.5V, 3.3V, and 5.0V are automatically turned on to support the main controller circuitry on the processor card. The LVPC supports switched services for 1.5V, 2.5V, 3.3V and +/-12V. The main controller on the processor card switches on these services by command to support the Xilinx FPGAs. The processor card has a custom point-of-load circuit that regulates the 1.2V required for the core voltage of the Xilinx devices. The second board, the DCC, supports various functions including collecting local voltage and temperature data, SpaceCube 10BASE-T Ethernet and 1553 interfaces, controlling processor reset and power loss warning signals, and switching load services on the LVPC. An Aeroflex FPGA is used to control these functions and to communicate with the processor card via the I2C bus.
The LVPC and DCC cards are stacked together inside of one enclosure slice. The LVPC assembly requires a heat sink to handle the power loss within the DC/DC bricks. Figure 4 shows the assembly of the power slice. On the left, the DCC sits at the bottom of the enclosure. Two side rails, which are seen along the edges of the chassis, are installed above the DCC. Next, the LVPC is mated to the DCC board and bolted to the side rails. The LVPC is shown on the right with the heat sink installed on its circuit card assembly.
C. Processor Card Design
The processor card features two Xilinx Virtex-4 FX60 devices in a back-to-back fashion (see Figure 5).

Aeroflex Service Design-Two back-to-back Aeroflex FPGAs provide power sequencing of the switched power rails via I2C commands, reset control, watchdog timers, a mission elapsed timer, scratch-pad RAM, Xilinx configuration, non-volatile storage access, and health and status monitoring. To provide all of these services as well as facilitate reconfiguration and reuse of the one-time programmable (OTP) Aeroflex FPGA, an embedded 8-bit soft-core microcontroller (SpaceRISC) was designed and used as an alternative to a complex state machine. This design decision has proven useful not only in enhancing the services provided but also in allowing debug and test code to be loaded into the OTP FPGA to better facilitate initial board testing as well as system integration and test activities.
The SpaceRISC is based on a standard commercial device that can only address 16KB of the 32KB SRAM. We turned this limitation into a benefit by developing a memory controller that could conditionally operate out of the top or bottom half of the memory while simultaneously providing a side channel for read/write access to the 'inactive' portion of RAM. This facilitates a complete reconfiguration of the microcontroller while the current flight application is still running.
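The bank-swapping idea can be illustrated with a small software model. The following is a minimal sketch, not the flight memory controller: the class `BankedSram`, its method names, and the point at which the halves are swapped are all assumptions made for illustration.

```python
BANK_SIZE = 16 * 1024  # the microcontroller can address only 16KB at a time


class BankedSram:
    """Toy model of a 32KB SRAM split into an active and an inactive half."""

    def __init__(self):
        self.mem = bytearray(2 * BANK_SIZE)
        self.active = 0  # 0 = bottom half, 1 = top half (hypothetical encoding)

    def cpu_read(self, addr):
        # Normal microcontroller accesses always land in the active half.
        return self.mem[self.active * BANK_SIZE + addr]

    def side_write(self, addr, data):
        # Side channel: load a new image into the inactive half only.
        if addr < 0 or addr + len(data) > BANK_SIZE:
            raise ValueError("write outside the inactive half")
        base = (1 - self.active) * BANK_SIZE
        self.mem[base + addr:base + addr + len(data)] = data

    def swap(self):
        # Make the freshly loaded half the active one (assumed to happen
        # at a controlled point such as a microcontroller reset).
        self.active = 1 - self.active
```

In this model the running application never sees the new image until `swap()` is called, which mirrors the idea of reconfiguring the microcontroller while the current application keeps executing.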
The first stage boot loader (FSB) for the SpaceRISC is stored in an on-board radiation hardened 16KB PROM. The SpaceRISC cannot execute code directly out of the PROM, so a hardware boot-loader IP core was created to copy data from the PROM to the external SRAM before bringing the SpaceRISC out of reset.
The SpaceRISC FSB will then search for the latest Run-Time Application (RT-App) to load and execute. The RT-App is stored in a Quad-Triple Modular Redundant manner; should the FSB not successfully load the RT-App it will fall back to the previous version. The RT-App is fully capable of performing this boot loader sequence from ground commands or alternative configuration files, which allows multiple variants of the RT-App to be stored in flash while still preserving the 'Gold' boot configuration.
As the RT-App starts to execute it will first check to see if the startup was due to a watchdog timer reset or a clean power up. In the event of a watchdog timeout (WDT) the RT-App will check the configuration table for a set of flags to determine the next course of action. The current flight configuration allows for a programmable threshold of WDTs and reverts to the 'Gold' application code should the count exceed the threshold. After a nominal proceed condition is met the RT-App needs to enable the Xilinx FPGAs by turning on the switched power rails and configuring the FPGAs. The bitstreams used for configuration are determined by a configuration file that is stored in flash. The RT-App then brings the PowerPC (PPC) processors out of reset and the PPCs start to execute their First Stage Bootloaders (PPC FSB). The PPC FSB will then request the second stage bootloader from the SpaceRISC; we have chosen to use UBoot as our second stage bootloader. UBoot will then request a boot script from the SpaceRISC that contains commands and the file addresses required to load the flight operating system and applications. The files are read from the SpaceRISC with a series of 'get file info' commands and 'flash read requests'. The 'get file info' commands take in a file ID that is translated to a flash address by the SpaceRISC. The SpaceRISC then reads the file header and sends it back to the PPC. This header contains information about the file address in flash, data CRC, length, and whether the image is mirrored across multiple devices. The PPC and SpaceRISC then perform a series of flash read request and response packets until all of the data is transmitted and UBoot can load the OS/Application. The simplified boot sequence is shown in Figure 7.
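The 'get file info' and 'flash read request' exchange described above can be sketched as a simple request/response loop. This is an illustrative model only: the class `FlashServer`, the function `load_file`, the 512-byte chunk size, and the use of CRC-32 from `zlib` are assumptions, since the packet format and CRC polynomial are not given here.

```python
import zlib

CHUNK = 512  # assumed payload size of one flash read response


class FlashServer:
    """Stands in for the SpaceRISC side of the exchange (hypothetical)."""

    def __init__(self, files):
        # files maps a file ID to its contents; flash addresses are
        # assigned sequentially as the files are stored.
        self.flash = bytearray()
        self.table = {}
        for file_id, data in files.items():
            self.table[file_id] = (len(self.flash), len(data))
            self.flash += data

    def get_file_info(self, file_id):
        # Translate the file ID to a flash address and return the header.
        addr, length = self.table[file_id]
        data = bytes(self.flash[addr:addr + length])
        return {"address": addr, "length": length, "crc": zlib.crc32(data)}

    def flash_read(self, addr, count):
        return bytes(self.flash[addr:addr + count])


def load_file(server, file_id):
    """Client side: fetch the header, then read chunks until complete."""
    info = server.get_file_info(file_id)
    data = bytearray()
    while len(data) < info["length"]:
        want = min(CHUNK, info["length"] - len(data))
        data += server.flash_read(info["address"] + len(data), want)
    if zlib.crc32(bytes(data)) != info["crc"]:
        raise ValueError("CRC mismatch on file %r" % file_id)
    return bytes(data)
```

For example, `load_file(FlashServer({1: b"boot script"}), 1)` returns the stored bytes after a single read request, while a larger image is fetched over several requests.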
Flash File Mitigation-NAND Flash technology is known to be susceptible to radiation Single Event Effects (SEE) including Single Event Upsets (SEU) and Single Event Functional Interrupts (SEFI). Each processor card flash module is composed of four independent dies. The SpaceRISC NAND Flash Controller has the capability of performing mirrored read and write operations, which store the same file in one or more dies. In addition, software in the SpaceRISC has the capability of adding Triple Modular Redundant (TMR) duplication of each file within each die.
For the most mission critical data, such as the SpaceRISC configuration tables and software, we utilized Quad redundant with byte level TMR (QTMR).
As the SpaceRISC reads a file it will read in three bytes at a time and perform a series of bit-wise AND/OR operations on the data set (1). The output of this operation is then byte-wise ANDed to the input data set (2). To mitigate the possibility of two bit flips in the same bit position resulting in a false positive output, the voted results are compared to the input values, and if any of the values do not match the voted output the system moves on to the next mirrored copy of the file at the current file offset. In the event that all copies indicate some kind of error, we use the voted output from the last test. This, coupled with four checksums per page and checksums on all files, helps to detect multiple bit flips in the NAND Flash before they are used by the system.
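The voting and fall-through behavior described above can be sketched as follows. This is an assumption-laden illustration rather than the flight code: the standard bit-wise majority function is used for the vote, the equality check stands in for the comparison against the inputs, and the names `vote_byte` and `read_qtmr_byte` are hypothetical.

```python
def vote_byte(a, b, c):
    """Bit-wise majority of three bytes (standard TMR voter)."""
    return (a & b) | (a & c) | (b & c)


def read_qtmr_byte(copies):
    """Read one byte position from a quad-mirrored, byte-level TMR file.

    copies is a list of (a, b, c) triples, one triple per mirrored copy.
    Returns the voted byte from the first copy whose three bytes all agree
    with the vote. If every copy shows a disagreement, the vote from the
    last copy examined is returned.
    """
    voted = None
    for a, b, c in copies:
        voted = vote_byte(a, b, c)
        if a == voted and b == voted and c == voted:
            return voted  # clean copy, no need to look further
    return voted
```

Note that a single flipped bit in one of the three bytes is already corrected by the vote, but the mismatch still causes the reader to try the next mirrored copy, which is what guards against two flips in the same bit position.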
Because QTMR is partially implemented in software, applying it to larger files, such as Xilinx bitstreams and the PowerPC OS and ramdisks, would extend the boot time. To minimize our boot time and thus increase our availability, larger images are stored in a Quad Mirrored fashion with DMA read and transmit assist. When a flash read request is received by the SpaceRISC, the request is validated, and a NACK packet is sent in response if any errors are detected. Otherwise, the SpaceRISC sets up a flash read response and transmits the packet header information. The hardware in the Aeroflex FPGA then reads data from the flash and places it directly into the transmit buffer while also calculating the data checksum, which is added to the header checksum to allow for general validation of the response packet.
The data transmitted also includes the Out-Of-Band (OOB) area of the NAND flash, which is used to store error correcting codes for the NAND Flash page. Software in the PowerPC will then check that the page is valid or request a new page from the next mirrored copy.
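The page-level fallback across mirrored copies can be sketched as below. This is a simplified illustration under stated assumptions: a 16-bit additive checksum stands in for the error correcting code stored with each page, and `page_checksum` and `read_page` are hypothetical names.

```python
def page_checksum(payload):
    # Stand-in for the per-page error correcting code; the real code
    # stored in the spare area of the flash is not specified here.
    return sum(payload) & 0xFFFF


def read_page(mirrors, page_index):
    """Return the first valid copy of a page from the mirrored images.

    mirrors is a list of copies; each copy is a list of
    (payload, stored_checksum) tuples, one per page.
    """
    for copy in mirrors:
        payload, stored = copy[page_index]
        if page_checksum(payload) == stored:
            return payload
    raise IOError("no valid copy of page %d" % page_index)
```

Because validation happens per page, a corrupted page in one die only costs one extra read from the next mirror rather than a reload of the whole image.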
Xilinx Configuration Scrubbing-The task of monitoring the programmable configuration bits within the Xilinx FPGA is typically handled by an outside controller. However, the Aeroflex FPGA designs and the processing load on the SpaceRISC were considered to be at full capacity. The SpaceCube v1.0 Xilinx FPGAs therefore contain an internal TMR self configuration scrubber that utilizes the ICAP and FRAME_ECC [12]. The Aeroflex FPGA is responsible for enabling this service. The scrubber core reports to the Aeroflex FPGAs whether it is actively scrubbing, whether it has detected and corrected an SEU, or whether it has found an uncorrectable error as a result of a Multiple Bit Upset.
D. FPGA Design and Software Design Methodology
FPGA development for the SpaceCube Xilinx FPGAs requires the standard Xilinx tool chains (ISE, EDK). We have developed a baseline FPGA design that includes the necessary framework for an embedded system using the PowerPCs on the SpaceCube. This baseline system is given to developers as a starting point for porting a new application to the SpaceCube environment. Similar FPGA designs have also been developed for the ML403 and ML410 Xilinx development boards. This allows for a cheaper development cycle for application engineers prior to targeting the SpaceCube system.
The PowerPCs within the Xilinx FPGAs on the SpaceCube currently support standalone code, Linux, VxWorks, and RTEMS operating systems (OS). The SpaceCube software team has modified an existing Linux OS and fine-tuned it to support the SpaceCube build environment (SpaceCube Linux).
The SpaceCube system is easy for application engineers to target and allows for a fast development cycle. We have supported more than 10 projects inside and outside of GSFC using this development approach. All cases have resulted in a seamless application port to the SpaceCube hardware system.
MISSION USE CASES
This section will present four examples of how the SpaceCube v1.0 system was adapted to support different missions. For each mission, we will describe the mission and its objectives, the corresponding SpaceCube hardware requirements and changes, FPGA and application descriptions, integration and testing, operations, and an assessment on overall development effort.
A. Relative Navigation Sensors
On May 11, 2009, STS-125 Space Shuttle Atlantis lifted off from Kennedy Space Center (KSC) with new instruments, gyroscopes, and flight computers for the Hubble Space Telescope. The HST Servicing Mission 4 (SM4) saw almost 37 hours of astronaut Extra-Vehicular Activity (EVA) time to install the instruments and hardware, and overcome many obstacles in servicing the observatory. Along for the ride on this mission, installed in the back of the shuttle payload bay on the Multi-Use Logistics Equipment (MULE) carrier, was a technology flight experiment called the Relative Navigation Sensors (RNS) system [1, 3, 4, 6, 9].
The RNS system, which was a driving technology requirement for the HST Robotic Servicing and De-orbit Mission (HRSDM), consists of three cameras, a GPS module, two redundant Mass Storage Modules (MSM) that each contain four hard drives, a Telemetry Module (TM), a SpaceCube v1.0 system as the payload's central avionics and a dual redundant ground terminal. The two main objectives of RNS were to record imagery of HST during rendezvous and deploy operations, and also to demonstrate the capability of providing real-time tracking and position estimation on HST with the SpaceCube processing system. The RNS flight hardware is pictured in Figure 8. The VIM was responsible for compressing images. The compressed images were stored during critical operations and sent to ground operators via the processor card 2 shuttle KU link. This SpaceCube, pictured in Figure 2 prior to RNS payload integration, was approximately 5-in. x 5-in. x 7-in. in size and required a nominal power of 37W (7-8W per processor card). A high level SpaceCube diagram is shown in Figure 9.
FPGA Design-Three of the four Xilinx FPGAs in the SpaceCube were 60-70% utilized. The fourth FPGA was for design contingency, but was never needed and remained unprogrammed. Two FPGA designs used one PowerPC and the ULTOR application FPGA design used both PowerPCs. The FPGA designs consisted of the required embedded system peripherals, internal card-to-card infrastructure, RNS interface peripherals such as the custom camera core, an internal triplicated self scrubbing configuration module, and hardware acceleration co-processing cores.
A major part of the RNS experiment on HST-SM4 was the GNFIR pose estimation application. One of the two Xilinx FPGAs in processor card 1 hosted the GNFIR application.
In order to meet real-time processing requirements, the GNFIR system had to operate at 3 Hz. Initially, GNFIR was developed exclusively in software and run on the embedded PowerPC405 processor in the Xilinx. However, the performance using the processor alone was insufficient and GNFIR could only operate at 0.125 Hz. In order to improve the application performance, the Floating Point Unit (FPU) FPGA IP core was added to the PowerPC, which resulted in a 4x speedup to 0.5 Hz. We developed the custom Edge core in FPGA to accelerate some of the more compute intensive operations in GNFIR, which resulted in an additional 6x speedup that enabled the application to operate at the required 3 Hz [4]. The Edge core provides an FPGA implementation of the edge detection, gradient direction computations, and centroid computation on the camera image data. The edge detection is performed using a Sobel operator and computes the gradient vector at each image pixel by performing a convolution of two 3x3 filter kernels in the horizontal and vertical dimensions with the image. The gradient magnitude is then computed, and the edge data is scaled by a factor selected via a command register. The gradient direction is calculated using a CORDIC arctangent module. The centroid of each input image was also computed. Each pixel of the input image that is above the threshold in intensity is considered a significant pixel, and the centroid is the average coordinate location of all the significant pixels in the image. The desired threshold value was configurable through a command register by the software driver. The Edge core processing engine is fully pipelined and can produce an edge/angle pair at the same rate as the camera data pixels are supplied to it. This data was buffered in a read FIFO in the FPGA core that was connected to the PowerPC's Processor Local Bus (PLB). This allowed the processor to transfer the data to memory at a high rate using Direct Memory Access (DMA).
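The operations performed by the Edge core can be illustrated in software. The sketch below is a plain Python reference model, not the pipelined FPGA implementation: it uses `math.atan2` where the hardware uses a CORDIC arctangent, omits the command-register scaling, and the function names `sobel` and `centroid` are chosen here for illustration.

```python
import math

# Standard Sobel kernels for the horizontal and vertical gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image):
    """Return (magnitude, angle) maps for a 2-D list of pixel values.

    Border pixels are left at zero. The kernels are applied without
    flipping (correlation form), which only affects the gradient sign.
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = math.hypot(gx, gy)   # gradient magnitude
            ang[y][x] = math.atan2(gy, gx)   # gradient direction (radians)
    return mag, ang


def centroid(image, threshold):
    """Average (x, y) of all pixels brighter than threshold, or None."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

A software model like this is also useful as a golden reference when verifying a hardware implementation against the same input frames.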
The reconfigurable nature of the radiation tolerant Xilinx FPGAs in the SpaceCube v1.0 allowed the new Edge core to be developed, integrated, and tested in a matter of months. Figure 10 shows the high-level diagram of the embedded system design used to implement the GNFIR application. GNFIR ran under the SpaceCube Linux OS. The ULTOR application FPGA design was proprietary and was designed by Advanced Optical Systems. The ULTOR and GNFIR applications continuously exchanged HST position estimate data. This was implemented to help speed up the process of acquiring and locking onto the image in order to enter tracking mode. The ULTOR application PowerPC ran under the VxWorks OS [4].
The C&DH FPGA design included an Automatic Gain Control (AGC) algorithm with a supporting FPGA core to dynamically adjust the camera brightness of the image as lighting conditions changed. A custom interface core was necessary to extract streaming data from the GPS receiver. The C&DH design also included a Floating-Point Unit (FPU) core, UARTs for communication with the MSMs and TM, and a KU core to stream continuous data to the shuttle's KU transponder system. The C&DH PowerPC ran under the SpaceCube Linux OS.
RNS Testing-Preparing the RNS payload for flight was a considerable task due to the number of instruments, interfaces, and configurable operation modes, along with the challenge of obtaining high confidence that the system would track a school bus sized satellite in the space environment. This involved a series of test campaigns at four NASA centers. Three trips were made to the Flight Robotics Laboratory (FRL) at the Marshall Space Flight Center (MSFC). Testing at MSFC involved a full RNS engineering-level system, a full-scale aft bulkhead, and a tenth-scale mockup of HST. Full system integration and operation capabilities including image recording, position estimation and tracking, and AGC were incrementally tested during subsequent trips. As issues arose during testing at the FRL, full advantage was taken of the SpaceCube's reconfigurability to fix problems quickly. The command and telemetry capability and KU downlink were tested at Houston's Johnson Space Center (JSC). Numerous tests at JSC included shuttle interface testing, ground terminal verification, and mission operations simulations with the entire shuttle ops team. All typical payload integration and environmental testing of the flight system was conducted at GSFC. The most notable test at GSFC took place prior to delivery to Cape Canaveral. With the RNS flight payload integrated onto the MULE carrier, a crane was used to maneuver the full-scale HST bulkhead to test the close proximity rendezvous and deploy sequences (Figure 11). Final testing and shuttle integration took place at KSC. Two final software updates to the SpaceCube were conducted to fix minor bugs that were found during ongoing testing at GSFC on the RNS engineering development units (EDUs). To support all of the different tests, two SpaceCube EDUs in addition to the SpaceCube flight box were built for the project.
The reconfigurability of the SpaceCube FPGAs and software was absolutely necessary in addressing the many issues that arose during application development, interface integration, and operation sequence testing. RNS would have missed schedule deadlines if the avionics did not have the ability to quickly adapt to required changes. However, after environmental testing, the FPGA designs were locked down.
Operations-Payload operations were conducted by the RNS team from the Payload Operations Control Center within Houston's Mission Control facility. RNS was successful in achieving all of its on-orbit objectives. The GNFIR position and attitude estimation algorithm successfully tracked HST for 21 minutes during rendezvous using the long range camera between ranges of 50 to 100 meters, and also tracked HST during deploy for 16 minutes using the short range camera at a 2 to 3 meter range [4]. RNS recorded a total of 6+ hours of HST imagery (approximately 750GB of data) during rendezvous and deploy. GNFIR feature tracking during rendezvous and deploy is shown in Figure 12. Figure 13 shows a split screen of an image of HST during rendezvous from the long range camera and the real-time GNFIR solution computed on the SpaceCube at that given time.
RNS also recovered 100,000+ compressed images over the course of the mission using the video compression capability in the SpaceCube. An example of a compressed image that was downloaded during mission operations is shown in Figure 14. The RNS system infrastructure did not allow for on-orbit FPGA reconfiguration of the SpaceCube, but did allow for software parameters to be updated in the processor card's flash. The AGC parameters in flash were successfully updated to tune the algorithm for the deploy operations. In total, SpaceCube was powered for 60 hours during this mission.
Two SEUs were detected and successfully repaired by the scrubber. During a routine KU image downlink, the C&DH PowerPC experienced an SEE, at which point it stopped functioning.
The PowerPC WatchDog Timer in the Aeroflex FPGA successfully detected that the heartbeat had stopped and reprogrammed the FPGA, at which point the KU data dump was resumed.
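The heartbeat-based recovery described above follows a common watchdog pattern, sketched below. This is a generic illustration, not the Aeroflex FPGA logic: the tick-based timing, the class name `Watchdog`, and the recovery counter are assumptions.

```python
class Watchdog:
    """Counts down between heartbeats and flags a recovery on expiry."""

    def __init__(self, timeout_ticks):
        self.timeout = timeout_ticks
        self.remaining = timeout_ticks
        self.recoveries = 0  # how many times recovery was triggered

    def kick(self):
        # Heartbeat from the monitored processor: restart the countdown.
        self.remaining = self.timeout

    def tick(self):
        """Advance one time step; return True if recovery was triggered."""
        self.remaining -= 1
        if self.remaining <= 0:
            # In the system described above, this is the point at which
            # the FPGA would be reprogrammed and the processor restarted.
            self.recoveries += 1
            self.remaining = self.timeout
            return True
        return False
```

The recovery count can then be compared against a programmable threshold, in the spirit of the 'Gold' fallback described for the SpaceRISC boot sequence.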
Development Effort-The manpower and schedule needed to deliver the RNS flight box were significantly greater than for subsequent missions using the v1.0 system. This development cycle accounted for all of the NRE required when building a new hardware system with supporting FPGA and software. This includes all electrical engineering design, mechanical design, thermal design, radiation and parts engineering, systems design, anti-fuse FPGA design, Xilinx framework core development (PowerPC, SDRAM, scrubber, etc.), and software development for the SpaceRISC and PowerPCs. This phase of the development took two years and required the equivalent of approximately 20 people/year, or 40 man-years. Next, implementing RNS specific applications involved PowerPC software development, Xilinx FPGA core development, intra-box infrastructure testing, independent box verification, environmental testing, and post-delivery support. The RNS implementation phase for SpaceCube was a simultaneous effort that lasted three years and required the equivalent of approximately 10 people/year, or 30 man-years.
B. MISSE-7
A SpaceCube system was launched to the ISS in November 2009 as part of the Materials International Space Station Experiment 7 (MISSE-7) [16, 22]. MISSE-7 is installed on the ISS Express Logistics Carrier (ELC), specifically ELC-2. The main objectives of the MISSE-7 SpaceCube were to (1) demonstrate reliable use of the commercial devices, in this case Xilinx FPGAs and embedded PowerPCs, for a long duration in the space environment, (2) demonstrate continuous and reliable execution of computation-intensive science data applications utilizing SpaceCube's Radiation Hardened by Software (RHBS) technology, and (3) demonstrate the ability to reconfigure the FPGA and software with new design files sent from ground.
MISSE-7 SpaceCube System-The flight spare hardware from RNS was used to develop this payload. The MISSE-7 payload exchanges telemetry and commands with ISS through the Communication Interface Box (CIB) over an RS485 bus with individual experiment hardware enables. Two processor/power slice pairs were configured as independent experiments with separate command and telemetry interfaces. A new MISSE-7 interface slice was required within the SpaceCube modular stack to fulfill all hardware requirements and to support the two independent SpaceCube experiments. The flight box that was delivered was the same physical size as RNS, but required only 28W of power (14W per processor/power slice pair). A high-level diagram of MISSE-7 is shown in Figure 15.
FPGA/Software Applications-There is significant re-use of the FPGA and software design from the RNS mission. The initial FPGA design contained framework cores from RNS along with new cores specific to the MISSE-7 experiment. We tested preliminary versions of our RHBS methodologies.
One methodology involves running identical applications on two PowerPCs in separate FPGAs. Mirrored C&DH applications on both FPGAs coordinate through the SpaceRISC to execute incoming commands and respond to telemetry requests. Each C&DH app receives and processes incoming commands and requests for telemetry from the CIB and then transmits the parameters to the SpaceRISC. Once the SpaceRISC receives a set of parameters from one C&DH app, it sets a timer to wait for parameters from the second C&DH app. The SpaceRISC validates the received parameters. If valid parameters were received from both C&DH apps, it grants one of the applications the right to process the parameters based on a round-robin approach. If only one valid set of parameters was received within the timeout window, it grants the right to process to the app with the valid parameters. If none of the parameters are valid, then no rights to execution are given.
This RHBS technique allows the C&DH system to generate telemetry and process commands free from error. It also allows the system to operate when one FPGA is down.
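The arbitration rules described above can be illustrated with a short sketch. This is not the SpaceRISC flight code; the class name, the `is_valid` callback, and the use of `None` to model an app that missed the timeout window are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Arbiter:
    """Toy model of arbitration between two mirrored C&DH apps (hypothetical)."""
    is_valid: Callable[[bytes], bool]  # assumed validity check on a parameter set
    last_granted: int = 1              # so that app 0 wins the first round-robin grant

    def grant(self, params_a: Optional[bytes], params_b: Optional[bytes]) -> Optional[int]:
        """Return the index (0 or 1) of the app allowed to process, or None.

        A parameter set of None models an app that sent nothing before the
        timeout window closed.
        """
        valid = [p is not None and self.is_valid(p) for p in (params_a, params_b)]
        if all(valid):
            # Both sets are valid: alternate between the two apps (round robin).
            self.last_granted = 1 - self.last_granted
            return self.last_granted
        if valid[0]:
            return 0
        if valid[1]:
            return 1
        return None  # no valid parameters: no rights to execution are given


arbiter = Arbiter(is_valid=lambda p: len(p) > 0)
grants = [arbiter.grant(b"cmd", b"cmd") for _ in range(3)]
```

In this sketch a single surviving app is always granted the right to process, which mirrors the statement that the system keeps operating when one FPGA is down.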
Along with the C&DH app, the RHBS demonstration experiment continuously runs a Lunar Lander task using data stored in SDRAM. The Lunar Lander application performs part of the calculations needed for an autonomously controlled vehicle to land safely on the surface of the moon while avoiding unsafe terrain in the landing area. The results are sent to the SpaceRISC inside the Aeroflex FPGA. The SpaceRISC software, which had to be modified to support MISSE-7, defines timing windows within which each processor must send incremental results. Different modes are supported for rolling back a processor task if it fails to send data or if the processors fall out of sync. The SpaceRISC can also completely ignore a processor string. A high-level diagram of how the FPGA embedded system is configured is shown in Figure 16 [5].
New FPGA cores were developed to support the MISSE-7 experiment. The hardware acceleration Sobel Edge core from RNS was slightly modified to support the Lander application. The CIB UART core was designed to handle all CIB communication and significantly assist software in robust packet handling. The spare PowerPCs in each FPGA were utilized to run continuous tasks. An identical task was run in a MicroBlaze processor. Each processing string had command and telemetry capability to the main C&DH application running in the primary PowerPC. Both of these secondary processing systems were clocked by separate redundant DCM structures. Each redundant DCM consists of two DCMs that periodically (approximately every minute) switch control of driving the clock net. Additional logic is in place to detect a DCM string failure, switch over to the redundant DCM, and reset the failed DCM string. A block diagram of the FPGA designs is shown in Figure 17.
The SDRAM memory controller corrects the data before it is presented to the PLB, but does not automatically write the corrected value back to memory. This means that over a long duration, multiple upsets could accumulate.
However, when an error is detected in the data, an interrupt is sent to the PowerPC and the memory controller fills a FIFO with the memory address where the corrupted data is. This allows the processor to scrub the memory during idle periods by performing a read operation at the addresses buffered in the FIFO and then writing the data back.
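A minimal simulation of this scrub-on-idle scheme is sketched below. The `EdacMemory` class is a hypothetical stand-in for the memory controller: it keeps a reference copy of the data to mimic error correction rather than implementing a real EDAC code, and the interrupt is represented only by the FIFO entry.

```python
from collections import deque


class EdacMemory:
    """Toy memory whose reads return corrected data without repairing the cell."""

    def __init__(self, words):
        self.good = list(words)      # reference copy, stands in for EDAC correction
        self.stored = list(words)    # what the memory cells currently hold
        self.error_fifo = deque()    # addresses logged by the controller

    def upset(self, addr, bit):
        self.stored[addr] ^= 1 << bit        # inject a single-bit upset

    def read(self, addr):
        if self.stored[addr] != self.good[addr]:
            self.error_fifo.append(addr)     # log the address (an interrupt would fire here)
        return self.good[addr]               # corrected value goes onto the bus

    def write(self, addr, value):
        self.good[addr] = value
        self.stored[addr] = value


def scrub_during_idle(mem):
    """Read every logged address and write the corrected value back."""
    pending = sorted(set(mem.error_fifo))
    mem.error_fifo.clear()
    for addr in pending:
        value = mem.read(addr)       # corrected by the controller on the way out
        mem.write(addr, value)       # the write-back repairs the stored word
    mem.error_fifo.clear()           # drop entries logged by the scrub reads themselves
    return pending


mem = EdacMemory([0x11, 0x22, 0x33])
mem.upset(1, 3)
first_read = mem.read(1)             # correct data, but the cell is still corrupted
still_corrupt = mem.stored[1] != 0x22
repaired = scrub_during_idle(mem)
```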
FPGA and Software On-Orbit Reprogrammability-Another driving requirement was to support the ability of ground operators to upload new FPGA configuration files, PowerPC software, and SpaceRISC software files, then execute a command sequence to reprogram the system. Due to the extremely low communication bandwidth to MISSE-7 and the polling schedule of the CIB, this is an extremely tedious and time-intensive process. All files are first compressed on the ground using the GZIP utility.
New software and FPGA configuration files are uplinked to the SpaceCube using ground commands that write the new files to the SpaceCube's onboard Flash memory in 512-byte chunks. A flash write of 512 bytes consists of a series of six ground commands. The SpaceCube C&DH app strips out the data in each of the six commands and buffers it in SDRAM, using a CRC in the sixth command of the series to validate the 512-byte chunk of data. Once the CRC is validated, it sends a Flash write command containing the chunk of data to the SpaceRISC. It then performs an automatic Flash read command and sends the data to ground in the next telemetry packet so that the flash write can be verified. If one of the six commands is received out of order, the C&DH app resets its buffer, reports the error in telemetry, and waits for a new series of packets to arrive.
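The chunk reassembly logic can be sketched as follows. The text does not give the command layout or the CRC polynomial, so the sequence numbering, the even split of 512 bytes across six commands, and the use of CRC-32 (`zlib.crc32`) are all assumptions chosen to keep the example runnable.

```python
import zlib

CHUNK_SIZE = 512
COMMANDS_PER_CHUNK = 6


class ChunkAssembler:
    """Buffers six commands and releases a chunk only if the CRC matches."""

    def __init__(self):
        self.buffer = bytearray()
        self.expected_seq = 0
        self.errors = 0              # would be reported in telemetry

    def _reset(self):
        self.buffer.clear()
        self.expected_seq = 0

    def on_command(self, seq, data, crc=None):
        """Feed one command; return the validated 512-byte chunk or None."""
        if seq != self.expected_seq:             # out of order: discard the series
            self.errors += 1
            self._reset()
            return None
        self.buffer += data
        self.expected_seq += 1
        if self.expected_seq < COMMANDS_PER_CHUNK:
            return None
        chunk = bytes(self.buffer)
        self._reset()
        if len(chunk) == CHUNK_SIZE and crc == zlib.crc32(chunk):
            return chunk                         # ready for the Flash write
        self.errors += 1
        return None


def split_chunk(chunk):
    """Ground-side helper (hypothetical): make six (seq, data, crc) commands."""
    step = -(-CHUNK_SIZE // COMMANDS_PER_CHUNK)  # ceiling division
    parts = [chunk[i:i + step] for i in range(0, CHUNK_SIZE, step)]
    crc = zlib.crc32(chunk)
    last = COMMANDS_PER_CHUNK - 1
    return [(i, p, crc if i == last else None) for i, p in enumerate(parts)]


chunk = bytes(range(256)) * 2
assembler = ChunkAssembler()
outputs = [assembler.on_command(*cmd) for cmd in split_chunk(chunk)]
```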
Because hundreds to thousands of commands are needed to uplink new configuration files, an automatic command generation feature was built into the MISSE-7 ground software application. The ISS Experiment Control Center (IECC) was developed in house at GSFC and is based on the GSFC Instrument Remote Control framework [8]. To uplink a file to the SpaceCube, the IECC user provides a file name, a SpaceCube flash address, and a file offset if the user is in the middle of a file uplink.
The IECC parses the file, calculating the size of the file and the number of packets needed to transfer the file. It then generates a flash block erase command to clear the flash location that will be written to. The IECC then starts generating the six-command series needed to uplink one 512-byte chunk at a time. The IECC only transmits the next command in the series once it receives telemetry that the last was received and validated by the C&DH app. It will automatically retransmit packets to account for dropped packets and Loss of Signal to ISS. When the series of six packets completes, it waits for the readback telemetry of the 512-byte chunk and validates against it. If the readback is invalid, it logs the issue and pauses the process for debugging by the user. If the readback is valid, it generates the next series of six commands. It continues this process until the whole file is uplinked.
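The command count for a given file follows directly from the 512-byte chunk size and the six commands per chunk. The sketch below works through that arithmetic; the function name is hypothetical, the erase command and any retransmissions are not counted, and the resume offset is assumed to fall on a chunk boundary.

```python
import math

CHUNK_SIZE = 512          # bytes written per flash write
COMMANDS_PER_CHUNK = 6    # ground commands per flash write


def uplink_plan(file_size, file_offset=0):
    """Return (chunks, commands) still needed to uplink a file.

    file_offset models resuming in the middle of an uplink and is assumed
    to be a multiple of the chunk size.
    """
    if file_offset % CHUNK_SIZE or not 0 <= file_offset <= file_size:
        raise ValueError("offset must be chunk-aligned and within the file")
    chunks = math.ceil((file_size - file_offset) / CHUNK_SIZE)
    return chunks, chunks * COMMANDS_PER_CHUNK


plan_small = uplink_plan(1024)          # two full chunks
plan_large = uplink_plan(1_000_000)     # a file of roughly 1 MB after compression
```

Under these assumptions a 1,000,000-byte file needs 1954 chunks and 11,724 commands, which is consistent with the "hundreds to thousands of commands" noted above.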
Once the support files and new compressed configuration and/or software files are uplinked, the user executes a series of commands to reprogram the FPGA and embedded processors.
The support files are encoded with the physical flash address location of the new FPGA and embedded processor configuration files along with other parameters needed to perform a reconfiguration. The uplinked files include a file containing an FPGA bitstream that works on both the bottom and top FPGAs.
To decompress the new compressed FPGA configuration file or other compressed support file, a special ground command is used. The user populates the flash decompress command with a source and destination flash address. When the C&DH app receives the command from the ground, it calculates the file size, verifies that the compressed file is not corrupt, and then writes the uncompressed data to the new flash location. Last, it reports the outcome of the decompression in its telemetry.
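The decompress command can be modeled with Python's standard `zlib` module, which reads the GZIP format and checks the CRC-32 stored in the GZIP trailer. This is only an illustrative sketch: a `bytearray` stands in for Flash, the function name and return value are hypothetical, and the flight software's actual decompression routine is not described in the text.

```python
import gzip
import zlib


def flash_decompress(flash, src, dst):
    """Decompress the gzip stream at flash[src:] into flash[dst:].

    Returns (compressed_size, uncompressed_size) on success, or None if the
    stream is corrupt, truncated, or would not fit at the destination.
    """
    inflater = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)  # 16 selects gzip framing
    try:
        data = inflater.decompress(bytes(flash[src:]))
    except zlib.error:
        return None                  # bad header, bad deflate data, or CRC mismatch
    if not inflater.eof:
        return None                  # stream ended before the gzip trailer
    if dst + len(data) > len(flash):
        return None
    # Bytes left over after the trailer tell us where the compressed file ends.
    compressed_size = len(flash) - src - len(inflater.unused_data)
    flash[dst:dst + len(data)] = data
    return compressed_size, len(data)


# Usage: a 4 KiB bytearray stands in for Flash memory.
flash = bytearray(4096)
image = gzip.compress(b"bitstream" * 50)
flash[0:len(image)] = image
result = flash_decompress(flash, src=0, dst=2048)
```

Only the source and destination addresses are passed in, as in the ground command; the compressed size is recovered from the stream itself.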
Below is the series of ground commands that are executed to initiate a SpaceCube reconfiguration. The SpaceRISC app will use the new FPGA files that are pointed to by the new RT_Config file. Once the FPGA is configured, the embedded processor boot loader will ask the SpaceRISC for the flash location of its software. The SpaceRISC will provide the new flash addresses given by the new Flash image table. This will result in the new PowerPC software files being loaded.
To accommodate and mitigate anomalies in the reconfiguration process, only one FPGA per SpaceCube system is reprogrammed at a time. If an anomaly occurs that prevents one FPGA from being reconfigured, the C&DH app will continue to operate nominally on the other FPGA, allowing the reconfiguration to be reverted to the 'Gold' configuration. A power cycle of the SpaceCube also results in a reversion to all 'Gold' configurations.
Operations-Primary MISSE-7 SpaceCube payload operations are performed at GSFC.
Operations are conducted through MSFC's Huntsville Operations Support Center (HOSC), which manages the telemetry and command links to ISS attached payloads.
Operations are conducted using two main application suites: the HOSC's Telescience Resource Kit (TReK) and GSFC's IECC. TReK serves as a gateway to the ISS payload data stream and provides telemetry and command streams from GSFC to the HOSC. The IECC sits on top of TReK as an advanced secondary payload telemetry and command processor. The IECC is built on GSFC's Instrument Remote Control (IRC) framework. The IECC has the following features:
- User-generated custom displays via XML
- Client/server capabilities to support end users
- Interactive and automated commanding
- Real-time event detection and geolocation
- Interactive event mapping and IRC plug-in scripting for real-time complex telemetry processing
The IECC displays real-time health and status telemetry, plotting critical temperatures. It has a feature that monitors SEU telemetry and autonomously time stamps and geotags the SEU events.
The SpaceCube on the MISSE-7 payload is shown in Figure 18.
Results-The MISSE-7 SpaceCube payload has been continuously operating for four years at the time of this paper's submission.
We have had only one anomaly on 12/9/12 at 4:59pm EST that required power cycling the payload. In this instance, one of the two SpaceCube experiments appeared to have stopped sending data and was not recoverable through reset and reconfigure commands.
Nominal operations were resumed after the power cycle. There is not enough data to determine if the CIB was involved or if it was solely a SpaceCube problem. No further issues have been observed.
We have not experienced a processor reset as a result of a watchdog timeout. Our data shows that the PowerPC processors have been up and running for more than 99.999% of the time. Further data analysis is needed to confirm 100%.
The overall average SEU rate that we have collected on the four FPGAs is 0.09 SEU/Day/FPGA. A 10-month sample of where SEUs have occurred is geotagged and depicted in Figure 19. Each color represents one of the four FPGAs. We have noticed a few scrubber runaway occurrences, in which the SEU count for a single FPGA starts incrementing at a fast rate. Such an episode lasts for a period of hours to days. We have not noticed any adverse effects on the underlying applications.
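To give a sense of scale for these figures, the following back-of-the-envelope calculation converts the reported rate and availability into totals over four years. The totals are not reported results; they assume a constant SEU rate, exactly four years of operation, and a 365.25-day year.

```python
SEU_RATE = 0.09          # SEU/day/FPGA, average reported in the text
NUM_FPGAS = 4
YEARS = 4                # approximate time on orbit at submission
AVAILABILITY = 0.99999   # lower bound reported for the PowerPCs

days = YEARS * 365.25

# Expected number of SEUs across all four FPGAs if the rate were constant.
expected_seus = SEU_RATE * NUM_FPGAS * days

# Largest cumulative outage that is still consistent with 99.999% uptime.
max_downtime_minutes = (1 - AVAILABILITY) * days * 24 * 60

print(f"expected SEUs over {YEARS} years: about {expected_seus:.0f}")
print(f"downtime budget at 99.999%: about {max_downtime_minutes:.0f} minutes")
```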
Updated SEE results will be presented at the conference.
The MISSE-7 SpaceCube was essential to making MISSE-7 a success. The SpaceCube was considered the most reliable experiment and thus was used as an indicator of the health of the MISSE-7 payload. During integration testing the SpaceCube also uncovered an anomaly with the CIB that helped characterize operational performance.
The MISSE-7 SpaceCube system continues to be a successful and valuable payload because it is a prime showcase of the reliability, flexibility, and high performance of SpaceCube technology.
The SpaceCube team was involved in the full life cycle of this payload, from requirements writing, to hardware design, hardware assembly, software development, environmental testing, integration, and post-launch operations. This payload has provided significant lessons learned that have laid a strong foundation for all work that followed it.
Development Effort-The development cycle for the MISSE-7 box was drastically shorter than that of the RNS box. This is mainly due to the minimal NRE required to build the hardware. The only new piece of hardware was the communication and power adapter slice. This phase of the development took 9 months and required the equivalent of approximately 3.5 people/year, or 3 man-years.
The application development phase took 1 year and required the equivalent of approximately 5 people/year, or 5 man-years. After payload delivery (2/2009), one FPGA update and three software updates were made to fix issues that were found during payload testing and to add enhanced features. This system was on a very short delivery schedule in order to meet payload integration milestones. Because SpaceCube is reconfigurable, we were able to meet the delivery deadline by delivering the system with all essential functions while continuing development for later upgrades.
C. DPP/Argon
The Satellite Servicing Capabilities Office (SSCO) at GSFC began efforts in 2009 to improve the agency's capability to robotically service satellites in space. Two simultaneous flight projects were spawned from this effort: (1) the Robotic Refueling Mission (RRM) and (2) DPP/Argon.
The Argon system shown in Figure 21 consists of two RNS cameras, a star tracker, a Visual Navigation System (VNS), an Inertial Measurement Unit (IMU), an infrared camera, a wireless Ethernet module, a Power Control Unit (PCU), a suite of situational awareness cameras, and the SpaceCube as the payload avionics and on-board processor. 1553, Ethernet, and wireless 802.11 Ethernet are the main communication channels. The main objectives were to demonstrate a robotic autonomous rendezvous and docking (AR&D) system that couples the functionality of a collection of cameras, sensors, computers, algorithms, and avionics to independently track an uncontrolled target at different ranges. Once the AR&D system has locked onto the target, Argon will safely guide the robot through precise rendezvous and docking maneuvers [19].
SpaceCube Hardware Changes-A few modifications were necessary to support the increased requirements of the Argon system. A new Video Compression Module (VCM) slice for the SpaceCube was built to handle the new interface requirements of the situational awareness cameras (NTSC). The VCM is Xilinx-based, which was a major upgrade in reconfigurability and functional potential compared to the Actel-based VIM slice on RNS. New DCC boards were made to fix timing parameters within the Ethernet circuit to guarantee functionality with ELC. Argon required significantly more processing power than RNS. As a result, all 8 PowerPCs were utilized running the SpaceCube Linux OS. A custom software bus was implemented that utilized the LVDM transceivers to communicate between cards via the internal stacking connector. During application development, Ethernet was used for quickly loading new FPGA configuration and software files to flash storage via the internal PowerPC bus architecture. For the four FPGAs, slice utilization ranged from 80-95% and BRAM utilization ranged from 60-90%. This configuration of the SpaceCube requires 43W of power.
Testing-Static and dynamic system testing occurred at GSFC's satellite servicing facilities, the Naval Research Laboratory (NRL), and Lockheed's Space Systems facility in Denver. Argon was successful in demonstrating its stated objectives with a flight-ready system. Figure 22 shows a picture of open- and closed-loop system testing of Argon. The Argon package is attached to the blue Fanuc robot arm on the far left. Argon tracks the motion of a non-cooperative, tumbling satellite, which is the gold mockup mounted on a motion-based Rotopod platform on the far right [18] [19].
Development Effort-The hardware required some NRE to build new DCCs, build a new VCM card, and slightly modify the processor card connectors. This phase of the development was programmatically slow, taking 18 months and requiring the equivalent of approximately 4 people/year, or 6 man-years. The application development phase was more involved in order to accommodate the additional interface requirements and demonstration objectives. It took 2 years and required the equivalent of approximately 9 people/year, or 18 man-years. The Argon system development was very dynamic as the internal architecture was in constant flux. The SpaceCube system was heavily leveraged for its ability to adapt to the changing requirements by reconfiguration of the FPGAs and software.
D. SpaceCube CIB on STP-H4
The DoD Space Test Program (STP), managed by the Air Force, was responsible for the payload processing of MISSE-7. They were impressed by the reliability and capability of the MISSE-7 SpaceCube during system integration testing and by its on-orbit performance. STP requested that Goddard deliver a SpaceCube v1.0 system to replace the legacy CIB system from MISSE-7 for a new ISS payload called STP Houston-4, or STP-H4. The SpaceCube CIB (SC_CIB) gives the STP-H4 payload the ability to offer experiments higher-bandwidth data connections, since SpaceCube supports an Ethernet interface compatible with the ISS High Rate Data Link (HRDL) via ELC avionics. STP-H4 is installed on ELC-1. The STP-H4 payload pallet is shown in Figure 23. The SpaceCube CIB is seen on the bottom right.
For STP-H4, the SpaceCube CIB supports six experiments via RS422 interfaces. One of the experiments is called ISS SpaceCube Experiment 2.0 (ISE 2.0), which is a GSFC experimental payload based on an Engineering Model of the SpaceCube v2.0 processing system [23].
SpaceCube Hardware Description-The SpaceCube CIB is a base system, which, as described in Section 3, is one processor slice and one power slice [8]. The hardware used for the SC_CIB is a true reflight of one of the processor and LVPC cards from the RNS SpaceCube flight box that flew on Shuttle Atlantis. The DCC board was taken from the flight box developed for the DPP/Argon campaign. A new DCC board was needed to support the Ethernet interface required on STP-H4. The SpaceCube CIB draws 15W of power.
FPGA/Software Application Description-The main objective of the SC_CIB application is to provide a C&DH application between ELC and the payload experiments. A custom C&DH application for the PowerPC was developed for STP-H4.
This application validates and forwards commands to the appropriate payload.
The C&DH application schedules high rate telemetry (HRT) and low rate telemetry (LRT) requests from all payloads in addition to the CIB itself. The main interface is 1553, which is used for commanding, LRT, and health and status data. To aid in the operation of the payload, the SC_CIB collects health and safety data every second from all attached payloads and from its own internal registers, such as temperatures, voltages, and command and telemetry packet counters. The HRT is sent to the Ethernet interface, which operates at a maximum theoretical bandwidth of 10 Mbps. The high-level interfaces of the SpaceCube CIB are depicted in Figure 24.
The SC_CIB FPGA design leveraged heritage cores and the overall embedded architecture from prior missions, which was crucial in allowing the fast application development cycle required to meet the ambitious delivery schedule. The interrupt controller, USART, and scrubber cores are from RNS. The SDRAM DECTED EDAC core is from MISSE-7. The 1553, Ethernet MAC, and Ethernet PHY cores are from Argon. Likewise for software, design heritage was a key component in signing up for the fast delivery schedule. The Linux OS framework from RNS was used with all supporting FPGA core drivers. The SpaceRISC updates from MISSE-7 were incorporated to enable on-orbit reconfiguration of the FPGA. The C&DH application incorporates the flash file support and compression/decompression software from MISSE-7 that is also required to support on-orbit FPGA reconfiguration and software updates. The 1553 and Ethernet drivers were reused from Argon.
Two new FPGA cores with supporting software drivers were developed to meet the CIB requirements. The TimeCore keeps an internal system time that is synchronized with the ISS broadcast time at a rate of 1Hz. This timestamp is included in all data packets sent to the attached payloads. The TimeCore is accurate to 1 byte of fine time, which is approximately 4ms.
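The 4ms figure is consistent with a one-byte fine-time field that counts binary fractions of a second, which is the assumption made in the short calculation below. The field layout and the helper name are illustrative and are not taken from the TimeCore design.

```python
FINE_TIME_BITS = 8   # assumed: one byte of fine time, counting 1/256ths of a second

# Smallest time step that one fine-time count can represent, in milliseconds.
resolution_ms = 1000 / 2 ** FINE_TIME_BITS


def to_seconds(coarse_seconds, fine_count):
    """Combine whole seconds with a one-byte fractional count (hypothetical layout)."""
    if not 0 <= fine_count < 2 ** FINE_TIME_BITS:
        raise ValueError("fine_count must fit in one byte")
    return coarse_seconds + fine_count / 2 ** FINE_TIME_BITS


half_second_stamp = to_seconds(10, 128)
```

Under this assumption one count is 1/256 of a second, or about 3.9ms, which rounds to the quoted value.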
The second core that was developed for the CIB is the Payload Interface Core, which is used to communicate with each attached payload via RS422. It validates incoming packets: it searches for the sync header, validates header fields, and checks for a valid CRC. It strips out payload data and presents packet statistics to the C&DH software via a series of flags. It also generates all packets transmitted to the payloads. The software writes the desired packet type to a register and, if the packet is a command, puts the command payload into a FIFO. The core then generates the packet header, fills in the payload data from the FIFO, and appends the calculated CRC. The core also manages payload response timeouts by setting timers after packet transmissions and notifying software if a timer expires before a valid response is received. This core utilizes the TimeCore to timestamp all the packets sent to the payloads. It latches the time as it creates the packet, reducing latency to only the packet transmit time. At a high level, it abstracts the payload interface to the software as a series of flags and payload data in FIFOs. This reduces the load on the software, allowing the system to quickly collect and transmit the data to the ISS.
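The receive-side checks performed by the Payload Interface Core can be illustrated in software. The actual packet format is not given in the text, so the sync word, the two-byte header (type and length), the set of valid types, and the CRC-16 computed with `binascii.crc_hqx` are all hypothetical choices for this sketch.

```python
import binascii
import struct

SYNC = b"\xEB\x90"        # hypothetical sync word
VALID_TYPES = {1, 2}      # hypothetical packet types
CRC_SEED = 0xFFFF


def build_packet(ptype, payload):
    """Generate header, copy in the payload, and append the calculated CRC."""
    body = SYNC + struct.pack(">BB", ptype, len(payload)) + payload
    return body + struct.pack(">H", binascii.crc_hqx(body, CRC_SEED))


def parse_stream(stream):
    """Return (packets, flags), where flags mimic the statistics shown to software."""
    flags = {"valid": 0, "bad_header": 0, "bad_crc": 0}
    packets = []
    pos = 0
    while True:
        pos = stream.find(SYNC, pos)              # search for the sync header
        if pos < 0 or pos + 4 > len(stream):
            break
        ptype, length = struct.unpack_from(">BB", stream, pos + 2)
        end = pos + 4 + length + 2
        if ptype not in VALID_TYPES or end > len(stream):
            flags["bad_header"] += 1
            pos += 1                              # resume the search past this sync
            continue
        (crc,) = struct.unpack_from(">H", stream, end - 2)
        if crc != binascii.crc_hqx(stream[pos:end - 2], CRC_SEED):
            flags["bad_crc"] += 1
            pos += 1
            continue
        packets.append((ptype, stream[pos + 4:end - 2]))  # strip out payload data
        flags["valid"] += 1
        pos = end
    return packets, flags


corrupted = bytearray(build_packet(2, b"world"))
corrupted[-1] ^= 0xFF                             # damage the CRC field
stream = b"\x00\x00" + build_packet(1, b"hello") + bytes(corrupted) + build_packet(1, b"ok")
packets, flags = parse_stream(stream)
```

Moving these checks into the FPGA core is what lets the software see only flags and clean payload data, as described above.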
Only one FPGA and one PowerPC were used to implement the CIB requirements.
The FPGA design utilized approximately 40% logic and 30% BRAM resources. To avoid the accumulation of SEUs that could cause potential issues, a design that only contains the configuration scrubber is implemented on the second FPGA.
CIB Testing and Integration-The SpaceCube CIB went through environmental testing at GSFC prior to delivery to STP-H4 in Houston, TX. GSFC continued to support the system-level tests and integration. A 1553 Remote Terminal address bug was uncovered during 1553 validation testing. The software patch to fix the bug was tested at GSFC, and the SpaceCube CIB in Houston was reprogrammed within 48 hours of discovering the issue. A second software patch was later applied to improve overall functionality as a result of ongoing testing at GSFC.
Following flawless system integration in Houston, the payload was sent to KSC. A risk reduction payload test was performed that included validating communication with the ELC Ethernet interface. After a minor issue in the harness was corrected, the SpaceCube CIB successfully streamed 1.2 GB of data at an effective rate of 1.5Mbps. Environmental testing on the system and final end-to-end tests occurred prior to shipment to Japan for launch vehicle integration.
The FPGA design was locked after environmental testing at GSFC and never required an update post-delivery. The option to reconfigure the FPGA on-orbit exists if necessary.
Operations-The STP-H4 payload was launched to ISS on the JAXA HTV-4 vehicle in August 2013. The payload was activated shortly after arrival.
The SpaceCube CIB's telemetry is being monitored with the IECC and it has been operating nominally.
All temperatures, voltages, and statuses are as expected. Telemetry and commands for all attached payloads are being transmitted without error.
The GSFC IECC has the capability to command and monitor the STP-H4 payload. The SC_CIB has successfully been sent commands to reset status and its internal counters. SEE results will be compared to those of the MISSE-7 SpaceCube, and presented at the conference.
Development Effort-The agreement with STP-H4 put the SpaceCube CIB on a strict 12-month delivery schedule. The hardware did not require any NRE, so the hardware build phase of the development was fast. It took only 11 months to build, test, and deliver the hardware. This phase required the equivalent of approximately 3 people/year, which is roughly 3 man-years. The application development phase required more people to implement the CIB-specific FPGA and software requirements. It took 12 months and required the equivalent of approximately 5 people/year, or 5 man-years. After delivery, the STP-H4 system integration required 2 people for 6 months, which increases the total application effort to 6 man-years. The hardware reuse, FPGA/software design heritage, and reconfigurable options of the SpaceCube allowed us to confidently deliver a product within the aggressive schedule requirement. The reconfigurability of the system was utilized after delivery to fix issues found during payload integration.
CONCLUSIONS
SpaceCube fits the need for a hybrid computing architecture for space. We have demonstrated reliable use in three separate missions including over four years of operation on the MISSE-7 payload.
The computing power of the SpaceCube system provides at least a 10x performance increase over traditional space processors. We have shown how we have solved extreme data-intensive and computation-intensive applications within RNS and Argon by leveraging a multi-processing platform coupled with reconfigurable FPGAs.
Traditional space processing systems cannot handle these advanced applications. The SpaceCube hybrid processing system enables break-through mission objectives such as AR&D and robotic servicing.
In addition, the SpaceCube is both reconfigurable and modular. We have shown how we have used these traits to quickly adapt to new missions and changing requirements. Each mission, aside from the SC_CIB, required a mission-unique I/O card to meet requirements.
On the MISSE-7 SpaceCube, we have proven the ability to reconfigure the system in space flight with new FPGA and software design files sent from ground. The flexibility of SpaceCube allowed for an ad-hoc collaborative effort to be utilized in developing the new versions of software and FPGA designs that were used to reprogram it in space.
Within this paper we have also highlighted the development effort to build each of the systems in Section 4. This data is summarized in Figure 25 and Figure 26 . The hardware NRE, FPGA NRE, and software NRE are significantly reduced after the RNS mission. Each of the follow-on missions only required engineering to build and test copies of the hardware, develop mission unique I/O cards, and integrate the new application requirements in FPGA and software. This reduction in NRE has a great benefit to program cost and schedule. The reuse and application of SpaceCube to different mission profiles is only possible due to its reconfigurability and modularity. As shown for the SC_CIB mission, schedule risk was reduced due to heritage design, hardware reuse, and the reconfigurable FPGAs and software features of the SpaceCube.
These combined features are what enabled our confidence in delivering a working system within 12 months.
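The effort figures quoted in Section 4 can be collected and totaled in a few lines. The durations and staffing levels below are taken from the text; the phase labels and the unrounded products are our own bookkeeping, so small differences from the rounded man-year values quoted per mission are expected.

```python
# mission -> list of (phase label, duration in months, equivalent people)
EFFORTS = {
    "RNS": [("hardware/framework NRE", 24, 20), ("applications", 36, 10)],
    "MISSE-7": [("hardware", 9, 3.5), ("applications", 12, 5)],
    "Argon": [("hardware", 18, 4), ("applications", 24, 9)],
    "STP-H4": [("hardware", 11, 3), ("applications", 12, 5), ("integration", 6, 2)],
}


def man_years(months, people):
    """Effort as (duration in years) x (equivalent people)."""
    return months / 12 * people


totals = {
    mission: sum(man_years(months, people) for _, months, people in phases)
    for mission, phases in EFFORTS.items()
}

for mission, total in totals.items():
    print(f"{mission:8s} {total:5.1f} man-years")
```

Computed this way, each follow-on mission comes to well under half of the RNS effort, which is the reduction in NRE that the conclusions describe.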
[Figure: Development Time]
The reduction of initial design NRE cost, the flexibility of the hybrid architecture, and the inherent low power and weight of the SpaceCube system are what make the SpaceCube attractive to missions requiring an advanced avionics package.
FUTURE WORK
GSFC is currently supporting two new programs that will use a SpaceCube v1.0 system for on-board payload processing. GSFC is delivering another SpaceCube CIB for STP-H5, which is a follow-on project to STP-H4. The hardware will be identical to STP-H4, but will require some software modifications to support new experiments, including file transfer. Also, SSCO will use a SpaceCube v1.0 system to control the third phase of RRM. The RRM-3 SpaceCube will require an I/O card to handle analog monitoring and control of the payload systems, and will also require an added Ethernet interface to communicate with the ISS wireless 802.11 network.
