
    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    This research addresses the design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied, and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 × 10⁻⁶, three orders of magnitude better than non-fault-tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10⁻⁶. The required hardware redundancy is approximately 15 times that of a non-fault-tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results.
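
    As a rough illustration of the yield-versus-redundancy trade-off described above, the sketch below computes the yield of a majority-voted redundant block under a simple binomial, independent-failure model. This is a minimal sketch with illustrative parameters; it is not the paper's NAND multiplexing model or its hardware cost model.

    ```python
    from math import comb

    def module_yield(p_fail: float, devices: int) -> float:
        """Probability that a module built from `devices` devices is fault-free."""
        return (1.0 - p_fail) ** devices

    def redundant_yield(p_fail: float, devices: int, copies: int, needed: int) -> float:
        """Yield of a block that works if at least `needed` of `copies`
        independent module copies are fault-free (simple binomial model)."""
        y = module_yield(p_fail, devices)
        return sum(comb(copies, k) * y**k * (1 - y)**(copies - k)
                   for k in range(needed, copies + 1))

    # Illustrative numbers only: a 100,000-device module at p_fail = 3e-6,
    # triplicated so that any 2 of 3 fault-free copies suffice.
    print(f"single module yield: {module_yield(3e-6, 100_000):.3f}")
    print(f"2-of-3 voted yield : {redundant_yield(3e-6, 100_000, 3, 2):.3f}")
    ```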

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in the detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Reconfigurable Slack (FaDReS). Here, an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme, which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, the use of adaptive techniques is seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, a Motion Estimation (ME) engine, a Finite Impulse Response (FIR) filter, a Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks, in addition to MCNC benchmark circuits. A significant reduction in power consumption is achieved, ranging from 83% for low motion-activity scenes to 12.5% for high motion-activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32 dB. The diagnosability, reconfiguration latency, and resource overhead of each approach are analyzed. Compared to previous alternatives, PURE maintains a PSNR within 4.02 dB to 6.67 dB of the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria.
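
    The bounded-reconfiguration fault isolation idea can be illustrated as a divide-and-conquer search over partially reconfigurable regions (PRRs): relocate half of the candidate regions onto known-good slack resources and watch whether the health metric recovers. The sketch below is a minimal model of that general idea, not the FaDReS implementation; the `relocate` and `health` callbacks are hypothetical stand-ins for FPGA partial reconfiguration and the runtime health metric.

    ```python
    def isolate_faulty_prr(prrs, relocate, health, threshold):
        """Locate the faulty PRR in at most ceil(log2(n)) reconfigurations.

        relocate(subset): remap the given PRRs onto known-good slack resources
                          (restoring any previously relocated PRRs).
        health():         runtime health metric; >= threshold means healthy.
        """
        lo, hi = 0, len(prrs) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            relocate(prrs[lo:mid + 1])      # move the first half onto slack
            if health() >= threshold:      # metric recovered: fault was there
                hi = mid
            else:                          # fault is in the untouched half
                lo = mid + 1
        return lo

    # Toy usage: PRR 5 of 8 is faulty; relocating a subset "repairs" it.
    faulty, moved = {5}, set()
    relocate = lambda subset: (moved.clear(), moved.update(subset))
    health = lambda: 1.0 if not (faulty - moved) else 0.0
    print(isolate_faulty_prr(list(range(8)), relocate, health, 0.5))  # -> 5
    ```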

    Method of Testing and Diagnosing Field Programmable Gate Arrays

    A method of testing field programmable gate arrays (FPGAs) includes establishing a first group of programmable logic blocks as test pattern generators or output response analyzers and a second group of programmable logic blocks as blocks under test. This is followed by generating test patterns and comparing the outputs of two blocks under test with one output response analyzer. Next, the results of a plurality of output response analyzers are combined utilizing an iterative comparator in order to produce a pass/fail indication. The method also includes the step of reconfiguring each block under test so that each block under test is tested in all possible modes of operation. Finally, the programming of the groups of programmable logic blocks is reversed so that each programmable logic block is configured at least once as a block under test.
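
    The data flow of this scheme can be modeled in a few lines: each output response analyzer (ORA) flags any disagreement between two blocks under test (BUTs) driven by identical test patterns, and an iterative comparator merges all ORA flags into a single pass/fail indication. The sketch below is a hypothetical software model of that flow, not the patented circuit.

    ```python
    # Software model of the BUT/ORA/iterative-comparator flow (illustrative only).

    def ora(but_a, but_b, patterns):
        """Output response analyzer: raise a fail flag if the BUTs ever disagree."""
        return any(but_a(p) != but_b(p) for p in patterns)

    def iterative_comparator(ora_flags):
        """Combine the results of all ORAs into one pass/fail indication."""
        return "FAIL" if any(ora_flags) else "PASS"

    patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]    # exhaustive 2-input patterns
    xor_block = lambda p: p[0] ^ p[1]              # fault-free block under test
    bad_block = lambda p: p[0] | p[1]              # faulty block: differs on (1, 1)

    flags = [ora(xor_block, xor_block, patterns),  # matched pair -> no flag
             ora(xor_block, bad_block, patterns)]  # mismatch -> flag raised
    print(iterative_comparator(flags))             # -> FAIL
    ```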

    From FPGA to ASIC: A RISC-V processor experience

    This work documents a correct design flow using these tools on the Lagarto RISC-V processor, along with the RTL design considerations that must be taken into account when moving from a design for FPGA to a design for ASIC.

    Reduced Galloping Column Algorithm For Memory Testing

    Memory testing is significantly important nowadays, especially in SoC design, due to the rapid growth in memory density and design complexity within smaller chip areas and low-power designs. Test time in memory testing is thus a key challenge for accelerating time to market, achieving high yield, and keeping test cost low in high-volume manufacturing. Test time reduction is important in industry, as test cost is directly related to the validation time of each product on the tester. Many algorithms are used for memory testing, including the galloping column (GalCol) algorithm. The GalCol test is important for detecting unique coupling and transition faults. However, the existing GalCol algorithm takes a huge amount of test time due to its complexity. To overcome this test time issue, reduced GalCol algorithms with a solid data background are proposed. The reduced GalCol algorithms have test behavior similar to the original GalCol algorithm, with the major difference being the number of gallops of the target cells. The galloping of target cells is reduced to the first and last 8, 16, and 32 cells for every base cell. This project progressed in two stages: software development using INTEL software and a Synopsys tool, and test implementation in the INTEL production flow. These algorithms were verified on 15 units of 64KB L2 SRAM memory. Test time reduction and consistent pass/fail results were achieved with the reduced GalCol algorithm tests. The GalCol X8 algorithm obtains the highest test time reduction, about 79.5% at 600MHz and 75.7% at 1.6GHz, with pass/fail results consistent with the original GalCol algorithm in the HVM test flow.
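
    The reduction can be sketched directly: instead of galloping through every target cell for each base cell, only the first and last N target cells are exercised. The code below is a simplified flat-memory model written to illustrate that idea; it is not INTEL's production GalCol implementation, and the interpretation of "first and last" cells is an assumption.

    ```python
    def reduced_galcol(mem, n, background=0):
        """Run a reduced galloping pattern over a flat memory model `mem`
        (a list of 0/1 cells); return True if every read matches expectation."""
        for i in range(len(mem)):
            mem[i] = background                      # write solid data background
        ok = True
        for base in range(len(mem)):
            mem[base] = 1 - background               # complement the base cell
            targets = [t for t in range(len(mem)) if t != base]
            for t in targets[:n] + targets[-n:]:     # gallop first/last N only
                ok &= (mem[t] == background)         # read target cell
                ok &= (mem[base] == 1 - background)  # ping-pong read of base
            mem[base] = background                   # restore the base cell
        return ok

    print(reduced_galcol([0] * 64, n=8))             # -> True on fault-free model
    ```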

    Dynamic partial reconfiguration for pipelined digital systems: A case study using a color space conversion engine

    In digital hardware design, reconfigurable devices such as Field Programmable Gate Arrays (FPGAs) allow for a unique feature called partial reconfiguration (PR). This refers to the reprogramming of a subset of the reconfigurable logic during active operation. PR allows multiple hardware blocks to be consolidated into a single partition, which can be reprogrammed at run-time as desired. This may reduce the logic circuit (and silicon area) requirements and greatly extend functionality. Furthermore, dynamic partial reconfiguration (DPR) refers to PR that does not halt the system during reprogramming. This allows configuration to overlap with normal processing, potentially achieving better system performance than a static (halting) PR implementation. This work investigated the advantages and trade-offs of DPR as applied to an existing color space conversion (CSC) engine provided by Hewlett-Packard (HP). Two versions were created: a single-pipeline engine, which can only overlap tasks in specific sequences, and a dual-pipeline engine, which can overlap any consecutive tasks. These were implemented in a Virtex-6 FPGA. Data communication occurs over the PCI Express (PCIe) interface. Test results show improvements in execution speed and resource utilization, though some are minor due to intrinsic characteristics of the CSC engine pipeline. The dual-pipeline version outperformed the single-pipeline version in most test cases. Therefore, future work will focus on multiple-pipeline architectures.
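
    The performance argument for DPR over static PR comes down to overlap: while the engine processes task i, the partition for task i+1 can already be reconfiguring. The sketch below is a back-of-the-envelope timing model with made-up numbers; it does not model HP's CSC engine or its PCIe data path.

    ```python
    def static_pr_time(tasks):
        """Total time when every reconfiguration halts processing."""
        return sum(reconf + proc for reconf, proc in tasks)

    def dpr_time(tasks):
        """Total time when reconfiguring task i+1 overlaps processing task i."""
        total = tasks[0][0]                        # first reconfiguration is exposed
        for i, (_, proc) in enumerate(tasks):
            next_reconf = tasks[i + 1][0] if i + 1 < len(tasks) else 0
            total += max(proc, next_reconf)        # overlap hides the shorter phase
        return total

    # (reconfigure_ms, process_ms) per task; illustrative numbers only
    tasks = [(5, 20), (5, 20), (5, 20), (5, 20)]
    print(static_pr_time(tasks), dpr_time(tasks))  # -> 100 85
    ```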