Abstract-Los Alamos National Laboratory has designed and manufactured a single-board computer (SBC) for deployment in space-flight applications. The SBC is designed to meet the command- and data-handling requirements of missions that need true space-grade radiation hardness and fault tolerance, exceeding what is typical in CubeSat and SmallSat applications, but at substantially lower cost, lower power, and smaller form factor than the space-grade solutions available from the large aerospace manufacturers. The SBC leverages the MicroTCA standard, which makes it smaller than common 6U-sized solutions but still allows interoperability with a diverse ecosystem of commercial development equipment. The design uses QMLV or Class-S parts with total-ionizing-dose tolerance appropriate for deployment on long-term missions in MEO or GEO environments. The design uses a space-quality dual-core processor ASIC, a field-programmable gate array (FPGA), memories, and interfaces to meet the command- and data-handling requirements of medium-sized missions. Consuming only 5 W and measuring less than 7 inches x 6 inches, the design provides 9-gigabit/s class bidirectional SerDes links, 6 SpaceWire ports, redundant MIL-STD-1553B ports, 32 Mbytes of EDAC-protected SRAM, 2 GBytes of nonvolatile memory, and 200 MFLOPS of peak performance. Custom intellectual property (IP) has been developed for the FPGA to handle the interface to the nonvolatile memories and to provide error detection and correction for instrument boards elsewhere in the payload. The processor can run VxWorks™, RTEMS, or Linux. Cooling and mechanical hardness are achieved with a custom conduction-cooled frame that fits around the MicroTCA-form-factor printed circuit board.
INTRODUCTION
The Los Alamos Single-Board Computer (SBC) is a custom design intended to function as the payload processor for a set of instruments in medium Earth orbit (MEO) and geosynchronous Earth orbit (GEO) applications. Previous similar missions have used 6U-sized (160 mm x 233 mm) CompactPCI (cPCI) processors based on the BAE RAD750™ for the command- and data-handling function. Increased pressure from our customer for smaller size, lower weight, and lower power (size, weight, and power, or SWaP) led the Los Alamos design team to adopt the smaller form factor of MicroTCA and a lower-power computing architecture [1]. The design team was not able to find a standard product offering from industry for a space-grade, radiation-tolerant processor board that met our requirements. This paper describes the motivations, design, architecture, application, and capabilities of the Los Alamos Single-Board Computer. A block diagram of the SBC is shown in Figure 1.
MOTIVATION
The Intelligence and Space Research Division at Los Alamos National Laboratory (LANL) designs and implements instrument systems for use on satellites and in other space applications. These instrument systems often consist of a suite of sensor boards connected through a backplane to a processor board. The processor board is responsible for configuring the sensor boards, reading data from the sensor boards, and sending that data in a packaged format to the space-vehicle host. The processor board must also accept commands from the spacecraft and send state-of-health information to the spacecraft. The design team had a mandate to reduce SWaP for the entire instrument system, including the single-board computer described in this paper.
Size
Legacy designs were based on the 6U cPCI form factor [2]. The Los Alamos team leveraged the commercial cPCI standard so that prototypes could be developed rapidly using low-cost commercial cPCI hardware that was functionally interchangeable with mechanically ruggedized and radiation-hardened flight-grade modules. Leveraging the commercial cPCI standard and then hardening it for space-flight applications allowed the Los Alamos team to reuse intellectual property (IP), such as interfaces, from the commercial world. Although the Los Alamos team put considerable effort into converting the commercial cPCI standard (both mechanically and electrically) into something that can be flown in space, the head start that the standard gave us saved us and the customer time and money.
CompactPCI uses a synchronous parallel bus architecture that is limited to 66 MHz. While this is adequate for our current applications, the industry is replacing parallel bus architectures with high-speed serial protocols. Recent advances in technology are bringing multi-gigabit SerDes to space-grade components. For these reasons, the design team wanted to find a well-accepted commercial standard that was smaller than 6U cPCI and was based on a high-speed serial backplane fabric. MicroTCA was chosen because it is smaller than cPCI but not so small that a reasonable number of typical space-grade components will no longer fit on the printed circuit board. MicroTCA is shorthand for the Micro Telecommunications Computing Architecture and is widely used in the telecom industry. It is part of a family of standards under the Advanced Telecommunication Computing Architecture (ATCA) and is maintained by the PCI Industrial Manufacturers Group (PICMG). ATCA systems support the use of Advanced Mezzanine Cards (AMCs) that reside on carrier cards. MicroTCA is a compact adaptation of ATCA that does away with carrier cards and allows users to insert AMCs directly into backplanes. AMCs come in several form factors. We selected the "double-wide", "midsize" standard for our applications. The card pitch in both cPCI and MicroTCA-midsize is 0.8 inches, but the area of the MicroTCA card is 38 square inches as opposed to 57 square inches for 6U cPCI [3]. Using MicroTCA instead of cPCI allowed us to shrink the size of our boards by 33% while still allowing us to prototype and interoperate with low-cost commercial backplanes, enclosures, and companion boards. This arrangement also allowed us to use commercial diagnostic equipment and protocol analyzers to debug interfaces.
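The 33% figure follows directly from the two card areas quoted above. The short Python sketch below reproduces it using only those two numbers; the constant and function names are illustrative and do not come from the design documentation.

```python
# Card areas quoted in the text, in square inches.
CPCI_6U_AREA_IN2 = 57.0
MICROTCA_AREA_IN2 = 38.0


def area_reduction_percent(old_area: float, new_area: float) -> float:
    """Percentage by which new_area is smaller than old_area."""
    return 100.0 * (old_area - new_area) / old_area


reduction = area_reduction_percent(CPCI_6U_AREA_IN2, MICROTCA_AREA_IN2)
print(f"Board area reduction: {reduction:.1f}%")
```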
Since MicroTCA has a ruggedized standard (MicroTCA.3), the mechanical team was able to leverage mechanically hardened connectors and model their flight-grade conduction-cooled frames and enclosures on this pre-existing standard [4]. This reduced our time to delivery.
Weight
The smaller size of MicroTCA, when applied to the entire package of instruments and not just the payload processor, allowed us to reduce the weight of the overall payload by 45%.
Power
The legacy RAD750 design consumed 14 W, compared with 5 W for the new design. The RAD750 has higher computational throughput than the new design, but, at 200 MFLOPS, the new design has more than adequate capability for our application [5].
Cost
The SBC is a low-cost processing platform. The recurring engineering cost is approximately $130K. This includes flight-grade (QMLV and Class S) components, fabrication of the printed circuit board (PCB), and assembly. It also assumes that minimum order quantities (MOQs) are met. Lower costs can be achieved using lower-quality but still pin-compatible parts.
ARCHITECTURE
The SBC architecture is based on a processor application-specific integrated circuit (ASIC) that is supported by external static random-access memory (SRAM), non-volatile memory, a field-programmable gate array (FPGA), and board-level physical layer interfaces.
This SBC can communicate directly with a host space vehicle over both MIL-STD-1553B and SpaceWire interfaces. To meet MicroTCA standards, it must also support serial interfaces on its backplane connectors. The serial interfaces include I²C and multi-gigabit SerDes-based protocols. The design also supports discrete I/O interfaces that can be connected to other boards on the backplane or to the space vehicle. The SBC accepts space-vehicle clocks that can be used for system timing.
The processor ASIC executes flight software and is at the top of the command- and data-handling control hierarchy for this board. It boots from code stored in the external non-volatile memories, transfers flight software to SRAM, and then begins execution of flight software from SRAM.
The processor ASIC has built-in hardware support for SpaceWire, error detection and correction (EDAC) on external memories, and the MIL-STD-1553B encoding and decoding standards.
The FPGA provides a platform for custom hardware coprocessing functions to reduce the load on the processor ASIC. The FPGA also has built-in physical layer support for high-speed serial, LVDS, I²C, and JTAG (Joint Test Action Group) interfaces [6]. These capabilities are hard IP blocks instantiated in the FPGA that can be accessed by soft designs in the user fabric. In Figure 1, the blocks labeled IPMB-L and IPMB-0 are I²C links for management of other cards in the system via the Intelligent Platform Management Interface (IPMI) protocol. The "common options" block represents the optional high-speed serial protocols supported within the standard, such as Gigabit Ethernet.
The SBC has deep non-volatile memory that is designed to store configuration data for SRAM-based FPGAs hosted on other boards within the system. Due to the volatility of their programming memory, these FPGAs must be programmed after power up. The SBC has hardware, firmware, and non-volatile memory to support the direct configuration of these FPGAs and, during operation, performs automated error detection and correction for errors found in the FPGAs' SRAM programming memory due to radiation-induced single-event effects. Figure 2 illustrates the programming of SRAM-based FPGAs on sensor boards in the instrument system from local SBC flash memory.
The SBC is mapped onto the MicroTCA standard and can perform as a simple MicroTCA Carrier Hub (MCH) or as a standard AMC card [7] . A photo of the SBC is shown in Figure 3 .
DESIGN
Processor
The SBC is based on a dual-core LEON3 processor application-specific integrated circuit (ASIC). The ASIC is the Cobham Gaisler GR712RC™, which is space qualified, radiation hardened, and fault tolerant. Manufactured on 180-nm CMOS, the GR712RC is radiation tolerant to 300 krad. It has internal Advanced Microcontroller Bus Architecture (AMBA) buses with peripherals that support external memory, JTAG-based debugging, 6 SpaceWire ports, 10/100 Ethernet, and MIL-STD-1553B. The GR712RC has a floating-point unit for each of its two cores. We clock both cores at 100 MHz, which yields a theoretical maximum performance of 200 MFLOPS (million floating-point operations per second).
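The 200 MFLOPS figure can be reproduced with a one-line calculation. The sketch below assumes that each floating-point unit retires at most one operation per clock cycle; that rate is an assumption inferred from the quoted numbers rather than a figure stated in the text, and the function name is illustrative.

```python
def peak_mflops(cores: int, clock_mhz: float, flops_per_cycle: float = 1.0) -> float:
    """Theoretical peak in MFLOPS; flops_per_cycle = 1 is an assumed rate."""
    return cores * clock_mhz * flops_per_cycle


# Two cores clocked at 100 MHz, as described in the text.
gr712rc_peak = peak_mflops(cores=2, clock_mhz=100.0)
print(f"Theoretical peak: {gr712rc_peak:.0f} MFLOPS")
```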
The processor's internal and external memories are protected from single-event effects with error detection and correction (EDAC) circuitry.
Static Memory and Bus Interface
The highest-performance memories in this system are the two 20-Mbyte SRAMs. The SRAMs are organized with a bus that is 40 bits wide instead of 32, which is optimized for the error detection and correction (EDAC) hardware built into the processor ASIC. The processor reads or writes 32 bits of data plus another 7 EDAC bits in a single operation to the SRAMs. Since 7 EDAC bits are needed to cover a 32-bit data word for single-error correction and double-error detection (SECDED), for a total of 39 bits, one bit in every 40-bit word is unused. After the EDAC bits are accounted for, the usable SRAM memory space is 32 Mbytes of protected memory.
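To make the 32 + 7 bit arithmetic concrete, the following Python sketch implements a textbook extended Hamming code with 6 Hamming parity bits plus one overall parity bit. It is an illustration of the SECDED principle only: the text does not describe the code used inside the processor ASIC, so the bit layout, function names, and status strings here are assumptions.

```python
DATA_BITS = 32
PARITY_POSITIONS = (1, 2, 4, 8, 16, 32)  # Hamming parity bits sit at power-of-two positions
CODE_BITS = DATA_BITS + len(PARITY_POSITIONS)  # positions 1..38; position 0 holds overall parity
DATA_POSITIONS = [p for p in range(1, CODE_BITS + 1) if p not in PARITY_POSITIONS]


def encode(data: int) -> int:
    """Pack a 32-bit word into a 39-bit SECDED codeword."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            word |= 1 << pos
    for p in PARITY_POSITIONS:
        # Each parity bit makes the number of set bits in its group even.
        ones = sum((word >> pos) & 1 for pos in range(1, CODE_BITS + 1) if pos & p)
        if ones & 1:
            word |= 1 << p
    if bin(word).count("1") & 1:  # overall parity bit makes the whole word even
        word |= 1
    return word


def decode(word: int):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, CODE_BITS + 1):
        if (word >> pos) & 1:
            syndrome ^= pos
    odd_parity = bin(word).count("1") & 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity and syndrome <= CODE_BITS:
        word ^= 1 << syndrome  # single-bit error; syndrome 0 means the overall parity bit
        status = "corrected"
    else:
        return None, "uncorrectable"  # double-bit (or worse) error: detected, not fixed
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status


# Capacity check: two 20-Mbyte devices, 32 of every 40 stored bits hold data.
usable_mbytes = 2 * 20 * 32 // 40
```

Flipping any one of the 39 codeword bits is repaired by `decode`, while flipping two bits is reported as uncorrectable. The capacity line shows why two 20-Mbyte devices yield 32 Mbytes of protected memory.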
To reduce capacitive bus loading and improve system timing, a bus switch was added to the design between the processor ASIC and the two SRAM memories. The bus switch behaves as a set of low-delay FET switches that connects one memory IC or the other to the processor, but not both at the same time. This cuts the capacitive load on the processor bus by almost a factor of two. 
Flash Memory Interface
The SBC provides two radiation-tolerant 8-gigabit NAND flash parts from 3D-Plus [8]. The VHDL flash memory controller was developed to allow the processor to easily read and write single bytes (or multiple bytes) from a flash memory that would otherwise require a complex sequence of commands to erase, write, and read back entire 4-Kbyte blocks of memory at a time. This way, the processor interfaces with the flash memory as though it were an SRAM, and the hardware-intensive activity of managing entire block reads and writes is handled by the FPGA. When the processor makes a single-byte write, the FPGA reads the entire 4-Kbyte block into its memory, modifies the byte being written, erases the flash block being accessed, and then writes the entire block back to flash. This entire operation is hidden from the processor. In this manner, the FPGA acts as a hardware co-processor for flash interactions. Unlike other forms of non-volatile memory, NAND flash is shipped from the manufacturer with some blocks that do not work properly. Blocks have additional "spare" bits associated with them that can be used to mark them as bad blocks. Bad blocks can also develop during the lifetime of the memory. The existing IP sets a status bit that indicates to the processor whether a requested block is good or bad. Depending on the application, the design team will use a bad-block management scheme that either skips bad blocks or uses a more advanced block replacement method. This VHDL will be developed in the near future.
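The read-modify-erase-write sequence described above can be sketched as a small software model. This is not the VHDL controller: the class names, the return-value convention standing in for the status bit, and the toy NAND behavior (erase sets bytes to 0xFF, programming can only clear bits) are assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # bytes per block, per the text


class NandModel:
    """Toy NAND device: erase sets a block to 0xFF, programming only clears bits."""

    def __init__(self, num_blocks, bad_blocks=()):
        self.blocks = [bytearray(b"\xff" * BLOCK_SIZE) for _ in range(num_blocks)]
        self.bad = set(bad_blocks)  # hypothetical bad-block list

    def read_block(self, n):
        return bytes(self.blocks[n])

    def erase_block(self, n):
        self.blocks[n] = bytearray(b"\xff" * BLOCK_SIZE)

    def program_block(self, n, data):
        assert len(data) == BLOCK_SIZE
        for i, value in enumerate(data):
            self.blocks[n][i] &= value  # bits can go 1 -> 0 only


class ByteAddressableFlash:
    """Presents the NAND model to the processor as byte-addressable memory."""

    def __init__(self, nand):
        self.nand = nand

    def read_byte(self, addr):
        block, offset = divmod(addr, BLOCK_SIZE)
        return self.nand.read_block(block)[offset]

    def write_byte(self, addr, value):
        """Return True on success, False if the block is flagged bad."""
        block, offset = divmod(addr, BLOCK_SIZE)
        if block in self.nand.bad:
            return False  # stands in for the bad-block status bit
        buffer = bytearray(self.nand.read_block(block))  # 1. read whole block
        buffer[offset] = value & 0xFF                    # 2. modify one byte
        self.nand.erase_block(block)                     # 3. erase the block
        self.nand.program_block(block, bytes(buffer))    # 4. write it back
        return True


nand = NandModel(num_blocks=4, bad_blocks={3})
flash = ByteAddressableFlash(nand)
flash.write_byte(5000, 0x00)
flash.write_byte(5000, 0xA5)  # only possible because the block was erased first
```

The second write would fail on raw NAND without the erase step, because programming cannot turn a 0 bit back into a 1. A skip-style bad-block scheme could be layered on top by remapping a logical block number to the next block for which `write_byte` does not report a bad block.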
I²C Interfaces
Inter-IC (I²C) is a two-wire serial bus protocol that is commonly used for platform management [9]. I²C typically runs at 400 kbit/sec. In MicroTCA, I²C buses are required for initialization of power modules and AMC cards, to perform simple control, to request board information, and to request board state-of-health data. The I²C master is instantiated in the FPGA and is interfaced to the processor through the I/O space just like the flash memory controller. The SBC is also capable of acting as an I²C slave.
SpaceWire
As an extension to standard MicroTCA interfaces, the design team added 5 SpaceWire ports to the backplane and one to the front panel. Since SpaceWire uses LVDS (Low-Voltage Differential Signaling), it is natively compatible with MicroTCA backplane standards. These six ports can be connected directly to the SpaceWire endpoints that are built into the GR712RC processor via the FPGA's LVDS transceivers or through a SpaceWire router instantiated in the FPGA. The design team is presently porting existing SpaceWire router IP to the FPGA. We estimate logic utilization below 10% for this purpose.
SRAM-based FPGA Programming and Scrubbing Support
MicroTCA supports an IEEE 1149.1 (also known as JTAG) interface for each AMC board in the MicroTCA system. These JTAG interfaces connect from each AMC board to the MicroTCA Carrier Hub (MCH) and can be used to configure SRAM-based FPGAs on other cards in the system. In the space environment, these SRAM-based FPGAs must be scrubbed for bit errors in their configuration memory. Los Alamos National Laboratory has developed VHDL IP for the SBC's FPGA to configure SRAM-based FPGAs on other AMC cards and detect errors in their programming data. The IP abstracts the details of programming the FPGAs over JTAG from the processor and provides continuous, automated error detection for the SRAM-based FPGAs' programming data without intervention from the processor. When a bit error is detected in the configuration memory of an off-board SRAM-based FPGA, the processor will receive a maskable interrupt. The processor software can then make a decision on how to handle the error, such as using the JTAG interface to fix the portion of programming data affected by the error.
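The detect-then-interrupt flow described above can be sketched in software. The model below compares a checksum of each configuration frame read back from the device against a checksum of the stored golden copy and raises a maskable interrupt flag on a mismatch. The text does not state how the IP detects errors, so the CRC-32 comparison, the frame granularity, and all names are assumptions, and the JTAG transport is omitted entirely.

```python
import zlib


class ScrubberModel:
    """Software stand-in for the automated error-detection IP."""

    def __init__(self, golden_frames):
        # Checksums of the known-good configuration frames.
        self.golden_crcs = [zlib.crc32(frame) for frame in golden_frames]
        self.irq_masked = False
        self.irq_pending = False
        self.bad_frames = []

    def scan(self, readback_frames):
        """Compare each read-back frame with its golden checksum."""
        self.bad_frames = [
            index
            for index, frame in enumerate(readback_frames)
            if zlib.crc32(frame) != self.golden_crcs[index]
        ]
        if self.bad_frames and not self.irq_masked:
            self.irq_pending = True  # processor is notified
        return self.bad_frames


def repair(device_frames, golden_frames, scrubber):
    """Processor-side handler: rewrite only the frames flagged by the last scan."""
    for index in scrubber.bad_frames:
        device_frames[index] = bytearray(golden_frames[index])
    scrubber.irq_pending = False


golden = [bytes([i]) * 16 for i in range(4)]
device = [bytearray(frame) for frame in golden]
scrubber = ScrubberModel(golden)

device[2][5] ^= 0x01             # simulate a single-event upset in frame 2
flagged = scrubber.scan(device)  # detection, no processor involvement
irq_seen = scrubber.irq_pending
repair(device, golden, scrubber)
```

Rewriting only the flagged frames mirrors the option mentioned in the text of fixing just the portion of programming data affected by the error.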
Non-Volatile Memory
In addition to the flash storage, the SBC has two other types of non-volatile memory. The processor boot loader resides in a 32K x 8-bit radiation-hardened PROM. Program code is stored in an 8-Mbyte magneto-resistive RAM (MRAM) and is copied over to SRAM before execution begins. The MRAM and PROM are 8 bits wide and are buffered by the FPGA. This was done because the MRAM has a high capacitive load compared with the FPGA and also so that the FPGA can lock or unlock portions of the MRAM to protect program code from being overwritten.
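The write-protection role of the FPGA can be illustrated with a simple model in which the memory is divided into regions, each with its own lock flag. The text does not specify the lock granularity or the register interface, so the 64-Kbyte region size and every name below are hypothetical.

```python
MRAM_SIZE = 8 * 1024 * 1024  # 8 Mbytes, per the text
REGION_SIZE = 64 * 1024      # hypothetical lock granularity


class LockableMram:
    """Toy model of an MRAM whose writes are gated by per-region lock flags."""

    def __init__(self, size=MRAM_SIZE, region_size=REGION_SIZE):
        self.mem = bytearray(size)
        self.region_size = region_size
        self.locked = [False] * (size // region_size)

    def set_lock(self, region, locked):
        self.locked[region] = locked

    def write(self, addr, value):
        """Return True if the write was accepted, False if the region is locked."""
        if self.locked[addr // self.region_size]:
            return False
        self.mem[addr] = value & 0xFF
        return True

    def read(self, addr):
        return self.mem[addr]


mram = LockableMram()
mram.write(0x100, 0x42)           # store a byte of program code
mram.set_lock(0, True)            # protect the first region
blocked = mram.write(0x100, 0x00) # rejected, code is preserved
```

Reads are never gated in this model, so the boot-time copy from MRAM to SRAM is unaffected by the lock state.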
IMPLEMENTATION
The Mentor Graphics design chain was used for schematic capture, printed circuit board layout, and signal and power integrity modeling.
Starting with the schematics, the 4500+-net design was managed in a hierarchical fashion so that sub-designs, such as those for SRAM and FLASH memory, can be designed once and instantiated more than once on the printed circuit board. This methodology reduces design effort and errors by allowing the designer to draw and edit a schematic for a single SRAM, for example, just once and then have those edits reflected on other instances of the SRAM hierarchical block automatically. A top-down hierarchical block also makes the design easier to understand, navigate, check, and use for troubleshooting once the design is realized and on the bench. The design team used Mentor's DxDesigner™ to draw the schematics and Mentor's CES™ constraints entry tool to capture design constraints, such as matched length busses, differential pairs, and controlled and differential impedances.
Layout was completed on 14 layers with Mentor's Expedition™ layout tool. This tool is used to create a board "stack up", which determines the number of layers, the thickness of the laminate, and the weight of the copper. A combination of auto-routing and manual techniques was used to route the traces on the printed circuit board.
The Mentor HyperLynx™ signal and power integrity tools were used to model the design prior to manufacture. Signal integrity for critical signals, such as clocks and SRAM bus signals, was modeled using the post-layout geometry of the PCB. Power plane impedances were also simulated. This type of analysis allows the design team to identify and correct deficiencies before the PCB is fabricated, which improves the probability that the design will function as intended on the first try.
Mechanicals
The printed circuit board is manufactured to the IPC-6012-Class 3 standard and constructed from Isola FR408HR™ laminate, where the design team found a good trade between high-speed and high-temperature performance [10] [11] . Typical copper weights vary from 0.5 to 1 ounce.
As noted earlier, the mechanical engineering team leveraged the MicroTCA.3 conduction-cooled specification as a starting point for the mechanical design. The SBC is housed in a conduction-cooled frame manufactured from 6061-T6 aluminum and uses wedge locks from WaveTherm™. The design is qualified to mission-specific shock, vibration, and thermal requirements [12]. These requirements typically exceed those published in NASA's General Environmental Verification Standard [13]. A solid model of the conduction-cooled frame and cover is shown in Figure 4.
STATUS

Board Revisions
Two revisions of the SBC have been designed. The first revision was designed prior to the availability of the RTG4 FPGA and utilized a Microsemi RTAX™ FPGA. The low-delay bus switch was also unavailable, and active transceivers with a higher propagation delay were used instead. This first version is otherwise very similar to the final design and is being used as a software and firmware development platform as well as for functional integration testing against other modules in the instrument suite.
The final revision of the printed circuit board has been fabricated and testing is currently underway. Major components such as the GR712 processor, RTG4 FPGA, and memories are working properly. Capabilities such as SpaceWire, MIL-STD-1553B, and I²C interfaces have been demonstrated.
Figure 4. Conduction Cooled Enclosure
FPGA Fabric Utilization
The firmware developed for the original version has been ported to the RTG4 fabric. Fabric utilization numbers appear in Table 1 . With only 15% of the fabric utilized, there is room for additional capability to be added to the FPGA. 
Function            Fabric utilization
SpaceWire Router    10%
Total               15%
Processor Performance
While developing the SBC, Los Alamos has been able to perform some tests on floating-point performance for the GR712RC. A single GR712RC core will complete a 1024-point double-precision FFT in about 426 microseconds under VxWorks. This is approximately four times slower than a RAD750 executing the same FFT at the same clock rate. This is because the RAD750 has a 64-bit internal bus as opposed to the 32-bit internal bus on the GR712RC, so the GR712RC needs twice as many cycles to pull data from internal registers. Further, the RAD750 can perform a double-precision multiply-accumulate as a single operation, where the GR712RC requires two double-precision floating-point operations to perform the same calculation. These two factors combine for the 4x performance difference on a core-by-core comparison. This single-core performance gap closes to 2x for single-precision performance at the same clock rate because the GR712RC does not experience a penalty for single-precision floating-point data transfers over its internal data bus. With the RAD750 running at 200 MHz and the GR712RC running at 100 MHz, the chip-level floating-point peak performance gap remains 4x and 2x for double- and single-precision, respectively, assuming both cores of the GR712RC are fully utilized and the RAD750's single core is fully utilized.
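The chip-level ratios follow from the per-core ratios once clock rate and core count are factored in: the RAD750's 2x clock advantage is offset by the GR712RC's second core. The sketch below works through that arithmetic using only the figures quoted in the text; the function name is illustrative, and the model assumes ideal scaling across both GR712RC cores.

```python
def chip_level_ratio(per_core_ratio_same_clock, rad750_mhz, gr712rc_mhz, gr712rc_cores):
    """RAD750 advantage over the GR712RC at chip level, assuming ideal multi-core scaling."""
    clock_ratio = rad750_mhz / gr712rc_mhz
    return per_core_ratio_same_clock * clock_ratio / gr712rc_cores


# Per-core, same-clock factors from the text.
bus_width_penalty = 2.0          # 32-bit vs 64-bit internal bus
multiply_accumulate_penalty = 2.0  # two operations instead of one
double_per_core = bus_width_penalty * multiply_accumulate_penalty  # 4x
single_per_core = 2.0            # no bus penalty for single precision

double_chip = chip_level_ratio(double_per_core, 200.0, 100.0, 2)
single_chip = chip_level_ratio(single_per_core, 200.0, 100.0, 2)
print(f"Double precision: {double_chip:.0f}x, single precision: {single_chip:.0f}x")
```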
FUTURE WORK
The design team will continue testing and integration of the current revision of the SBC during 2016. To support future payloads and applications, the team is also evaluating the latest class of advanced quad-core LEON4 processors from Cobham for future SBC designs.
SUMMARY
The LANL SBC is a low-cost, low-power payload processor board that is suitable for use on medium-sized missions.
With its floating-point support and multi-core architecture, it can be used for command and data handling, processing of mission data, or both. Built-in support for MIL-STD-1553B and SpaceWire allows for easy interfacing with many commonly used space-vehicle buses.
