    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    This research addresses the design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied, and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 × 10^-6, three orders of magnitude better than non-fault-tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10^-6. The required hardware redundancy is approximately 15 times that of a non-fault-tolerant design. While larger than current fault-tolerant (FT) designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results.
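
As a rough illustration of why redundancy matters at these failure probabilities, the sketch below computes the yield of a design with no redundancy. It assumes (this is not stated in the abstract) that devices fail independently and that any single failed device makes the chip unusable. The function name and the device counts are hypothetical, chosen only to show the trend.

```python
import math

def nonredundant_yield(p_fail: float, n_devices: int) -> float:
    """Probability that all n_devices work, assuming independent failures
    and no redundancy (a single failed device is fatal)."""
    # (1 - p)^n, computed through log1p for accuracy when p is tiny.
    return math.exp(n_devices * math.log1p(-p_fail))

if __name__ == "__main__":
    # Hypothetical device counts; the abstract does not give design sizes.
    for n in (10**4, 10**6, 10**7):
        print(f"n = {n:>8}: yield = {nonredundant_yield(3e-6, n):.3e}")
```

Under these assumptions the yield falls off exponentially with device count, which is the gap that the redundancy described in the abstract is meant to close.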

    Heterogeneous Reconfigurable Fabrics for In-circuit Training and Evaluation of Neuromorphic Architectures

    A reconfigurable logic fabric built from heterogeneous device technologies is proposed, which leverages the cooperating advantages of distinct magnetic random access memory (MRAM)-based look-up tables (LUTs) to realize sequential logic circuits, along with conventional SRAM-based LUTs to realize combinational logic paths. The resulting Hybrid Spin/Charge FPGA (HSC-FPGA) using magnetic tunnel junction (MTJ) devices within this topology demonstrates commensurate reductions in area and power consumption over fabrics having LUTs constructed with either individual technology alone. Herein, a hierarchical top-down design approach is used to develop the HSC-FPGA, starting from the configurable logic block (CLB) and slice structures down to LUT circuits and the corresponding device fabrication paradigms. This facilitates a novel architectural approach to reduce leakage energy, minimize communication occurrence and energy cost by eliminating unnecessary data transfer, and support auto-tuning for resilience. Furthermore, HSC-FPGA enables new advantages of technology co-design, which trades off alternative mappings between emerging devices and transistors at runtime by allowing dynamic remapping to adaptively leverage the intrinsic computing features of each device technology. HSC-FPGA offers a platform for fine-grained Logic-In-Memory architectures and runtime adaptive hardware. Non-determinism, enabled by either low-voltage CMOS or probabilistic emerging devices, is an orthogonal dimension of fabric heterogeneity. It can be realized using probabilistic devices within a reconfigurable network to blend deterministic and probabilistic computational models. Herein, the probabilistic spin logic p-bit device is considered as a fabric element comprising a crossbar-structured weighted array. The programmability of the resistive network interconnecting p-bit devices can be achieved by modifying the resistive states of the array's weighted connections. Thus, the programmable weighted array forms a CLB-scale macro co-processing element with bitstream programmability. This provides field programmability for a wide range of classification problems and recognition tasks, enabling fluid mappings of probabilistic and deterministic computing approaches. In particular, a Deep Belief Network (DBN) is implemented in the field using recurrent layers of co-processing elements to form an n × m1 × m2 × ... × mi weighted array as a configurable hardware circuit with an n-input layer followed by i ≥ 1 hidden layers. As neuromorphic architectures using post-CMOS devices increase in capability and network size, the utility and benefits of reconfigurable fabrics of neuromorphic modules can be anticipated to continue to accelerate.
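
To make the idea of a weighted array of p-bits concrete, the sketch below uses a behavioral model that is common in the probabilistic spin logic literature: each p-bit outputs +1 or -1, with the probability of +1 set by the hyperbolic tangent of its summed, weighted input. This is an assumption for illustration, not necessarily the device model used in this work; the function names, the weight layout, and the use of a software random number generator are all hypothetical.

```python
import math
import random

def pbit(input_current: float, rng: random.Random) -> int:
    """Behavioral p-bit: returns +1 with probability (1 + tanh(I)) / 2, else -1."""
    return 1 if math.tanh(input_current) + rng.uniform(-1.0, 1.0) > 0 else -1

def layer_update(weights, bias, state, rng):
    """Sample one layer of p-bits from the previous layer's states.

    weights[i][j] stands in for the programmable (resistive) connection
    from input j to p-bit i; bias[i] is a per-p-bit offset.
    """
    out = []
    for row, b in zip(weights, bias):
        current = sum(w * s for w, s in zip(row, state)) + b
        out.append(pbit(current, rng))
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 3-input, 2-output layer.
    w = [[0.5, -1.0, 0.25], [2.0, 2.0, 2.0]]
    print(layer_update(w, [0.0, 0.0], [1, -1, 1], rng))
```

Reprogramming the layer in this model amounts to changing the entries of `weights`, which mirrors how the abstract describes programmability through the resistive states of the array's connections.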

    The Fifth NASA Symposium on VLSI Design

    The fifth annual NASA Symposium on VLSI Design had 13 sessions, including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design.

    The effects of ionising radiation on implantable MOS electronic devices

    Space exploration and the rapid growth of the satellite communications industry have promoted substantial research into the effects of ionising radiation on modern electronic technology. The enabling electronics and computer processing have seen a commensurate growth in the use of radiation for diagnostic and therapeutic purposes in medicine. Numerous studies exist in both these fields, but an analysis combining the fields of study to ascertain the effects of radiation on medically implantable electronics is lacking. A review of significant ground level radiation sources is presented, with particular emphasis on the medical environment. Mechanisms of permanent and transient ionising radiation damage to Metal Oxide Semiconductors are summarised. Three significant sources of radiation are classified as having the ability to damage or alter the behaviour of implantable electronics: secondary neutron cosmic radiation, alpha particle radiation from the device packaging, and therapeutic doses of high energy radiation. With respect to cosmic radiation, the most sensitive circuit structure within a typical microcomputer architecture is the Random Access Memory (RAM). A theoretical model which predicts the susceptibility of a RAM cell to single event upsets from secondary cosmic ray neutrons is presented. A previously unreported method for calculating the collection efficiency term in the upset model has been derived, along with an extension of the model to enable estimation of multiple bit upset rates. An Implantable Cardioverter Defibrillator is used as a case example to demonstrate model applicability and to test it against clinical experience. The model correlates well with clinical experience and is consistent with the expected geographical variations of the secondary cosmic ray neutron flux. This is the first clinical data set obtained indicating the effects of cosmic radiation on implantable devices. Importantly, it may be used to predict the susceptibility of future implantable device designs to cosmic radiation. The model is also used as a basis for developing radiation hardened circuit techniques and system design. A review of methods to radiation harden electronics to single event upsets is used to recommend methods applicable to the low power/small area constraints of implantable systems.

    Characterizing Single Event Upsets within the lpGBT-based End-of-Substructure Card

    The CERN ATLAS particle physics experiment is currently undergoing a significant system upgrade (ATLAS Phase II upgrade). As a result of the upgrade, the experiment's Inner Tracker (ITk) and the front-end electronics of the ITk are being redesigned to handle increased data rates and a higher radiation environment. Within the ITk, the End Of Substructure (EoS) card is a new custom designed digital board that will provide the data, command, and power interface between on- and off-detector electronics. Each EoS card makes use of one or two custom CERN designed low power Gigabit Transceiver (lpGBT) ASICs that have been created to support high bandwidth optical links in high radiation environments throughout CERN experiments. An estimated 1552 EoS cards will be installed in the ITk, each representing a potential point of failure. Given the complexity and quantity of new hardware designs involved, and that the EoS cards will not be accessible or serviceable after the upgrade has been completed, there is a need for rigorous quality assurance (QA) and quality control (QC) testing. This thesis therefore describes an independent test setup commissioned by the author at the University of Cape Town (UCT) Physics Department for characterising aspects of the EoS card's operation under representative radiation conditions. Specifically, the radiation environment of the ITk poses a challenge to electronics, as energetic particles can deposit their energy within the circuit material, resulting in an erroneous change in logic known as a Single Event Upset (SEU). The lpGBT is a radiation tolerant ASIC and employs digital signal processing (DSP) and triple modular redundancy (TMR) techniques to mitigate the effects of SEUs on transmitted data. This thesis presents an experimental setup which tests the hypothesis that the DSP stages are susceptible to data corruption caused by SEUs. In addition, the setup attempts to characterize the susceptibility of the scrambler, encoder, and interleaver stages within the lpGBT to SEUs. This experiment is carried out by actively irradiating an EoS card with a neutron source (energy spectrum of up to 11 MeV), while emulating each stage on a non-irradiated off-board FPGA. Additionally, in support of this experiment, the existing firmware and LabVIEW automation software developed at DESY are extended. Results from this thesis indicate that the DSP stages within the lpGBT are susceptible to data corruption caused by SEUs. It was also shown that the susceptibility of the experiment itself did not affect the measured SEU rates. Finally, preliminary results suggest that the susceptibility of the DSP stages within the lpGBT can be characterized, as the Bit Error Rate (BER) increases depending on the number of active stages.
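
As a minimal sketch of the quantity reported above, a bit error rate can be computed by comparing the data received from the irradiated device against a reference stream, such as one produced by an emulated stage. The function name and the byte-string framing are hypothetical; this is not the thesis's firmware or analysis code.

```python
def bit_error_rate(received: bytes, reference: bytes) -> float:
    """Fraction of bits that differ between two equal-length byte streams."""
    if len(received) != len(reference):
        raise ValueError("streams must have equal length")
    if not received:
        raise ValueError("streams must be non-empty")
    # XOR marks differing bits; count the set bits in each byte.
    errors = sum(bin(r ^ e).count("1") for r, e in zip(received, reference))
    return errors / (8 * len(received))

if __name__ == "__main__":
    # One flipped bit out of sixteen.
    print(bit_error_rate(b"\x00\xff", b"\x01\xff"))
```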