13 research outputs found

    Single Event Effects Assessment of UltraScale+ MPSoC Systems under Atmospheric Radiation

    Get PDF
    The AMD UltraScale+ XCZU9EG device is a Multi-Processor System-on-Chip (MPSoC) with embedded Programmable Logic (PL) that excels in many Edge (e.g., automotive or avionics) and Cloud (e.g., data centres) terrestrial applications. However, it incorporates a large amount of SRAM cells, making the device vulnerable to Neutron-induced Single Event Upsets (NSEUs) or otherwise soft errors. Semiconductor vendors incorporate soft error mitigation mechanisms to recover memory upsets (i.e., faults) before they propagate to the application output and become an error. But how effective are the MPSoC's mitigation schemes? Can they effectively recover upsets in high altitude or large scale applications under different workloads? This article answers the above research questions through a solid study that entails accelerated neutron radiation testing and dependability analysis. We test the device on a broad range of workloads, like multi-threaded software used for pose estimation and weather prediction or a software/hardware (SW/HW) co-design image classification application running on the AMD Deep Learning Processing Unit (DPU). Assuming a one-node MPSoC system in New York City (NYC) at 40k feet, all tested software applications achieve a Mean Time To Failure (MTTF) greater than 148 months, which shows that upsets are effectively recovered in the processing system of the MPSoC. However, the SW/HW co-design (i.e., DPU) in the same one-node system at 40k feet has an MTTF = 4 months due to the high failure rate of its PL accelerator, which emphasises that some MPSoC workloads may require additional NSEU mitigation schemes. Nevertheless, we show that the MTTF of the DPU can increase to 87 months without any overhead if one disregards the failure rate of tolerable errors since they do not affect the correctness of the classification output.Comment: This manuscript is under review at IEEE Transactions on Reliabilit

    Fast Tuning of the PID Controller in An HVAC System Using the Big Bang–Big Crunch Algorithm and FPGA Technology

    No full text
    This article presents a novel technique for the fast tuning of the parameters of the proportional–integral–derivative (PID) controller of a second-order heat, ventilation, and air conditioning (HVAC) system. The HVAC systems vary greatly in size, control functions and the amount of consumed energy. The optimal design and power efficiency of an HVAC system depend on how fast the integrated controller, e.g., PID controller, is adapted in the changes of the environmental conditions. In this paper, to achieve high tuning speed, we rely on a fast convergence evolution algorithm, called Big Bang–Big Crunch (BB–BC). The BB–BC algorithm is implemented, along with the PID controller, in an FPGA device, in order to further accelerate of the optimization process. The FPGA-in-the-loop (FIL) technique is used to connect the FPGA board (i.e., the PID and BB–BC subsystems) with the plant (i.e., MATLAB/Simulink models of HVAC) in order to emulate and evaluate the entire system. The experimental results demonstrate the efficiency of the proposed technique in terms of optimization accuracy and convergence speed compared with other optimization approaches for the tuning of the PID parameters: sw implementation of the BB–BC, genetic algorithm (GA), and particle swarm optimization (PSO)

    Exploiting Thread-Level Parallelism in Functional Self-Testing of CMT Processors

    No full text
    Major microprocessor vendors have integrated functional software-based self-testing in their manufacturing test flows during the last decade. Functional self-testing is performed by test programs that the processor executes at-speed from on-chip memory. Multiprocessors and multithreaded architectures are constantly becoming the typical general-purpose computing paradigm, and thus the various existing uniprocessor functional self-testing schemes must be adopted and adjusted to meet the testing requirements of complex multiprocessors. A major challenge in porting a functional self-testing approach from the uniprocessor to the multiprocessor case is to take advantage of the inherent execution parallelism offered by the multiple cores and the multiple threads in order to reduce test execution time. In this paper, we study the application of functional self-testing to chip multithreaded (CMT) processors. We propose a method that exploits thread-level parallelism (TLP) to speed up the execution of self-test routines in every physical core of a multiprocessor chip. The proposed method effectively splits the self-test routines into shorter ones, assigns the new routines to the hardware threads of the core and schedules their execution in order to minimize the core idle intervals due to cache misses or long latency operations and maximize the utilization of core computing resources. We demonstrate our method in the open-source CMT multiprocessor model, Sun’s OpenSPARC T1, which contains eight CPU cores, each one supporting four hardware threads. Our experimental results show a self-test execution speedup of more than three times compared to the single thread execution

    Systematic Software-Based Self-Test for Pipelined Processors

    No full text
    Software-based self-test (SBST) has recently emerged as an effective methodology for the manufacturing test of processors and other components in Systems-on-Chip (SoCs). By moving test related functions from external resources to the SoC’s interior, in the form of test programs that the on-chip processor executes, SBST eliminates the need for high-cost testers, and enables high-quality at-speed testing. Thus far, SBST approaches have focused almost exclusively on the functional (directly programmer visible) components of the processor. In this paper, we analyze the challenges involved in testing an important component of modern processors, namely, the pipelining logic, and propose a systematic SBST methodology to address them. We first demonstrate that SBST programs that only target the functional components of the processor are insufficient to test the pipeline logic, resulting in a significant loss of fault coverage. We further identify the testability hotspots in the pipeline logic. Finally, we develop a systematic SBST methodology that enhances existing SBST programs to comprehensively test the pipeline logic. The proposed methodology is complementary to previous SBST techniques that target functional components (their results can form the input to our methodology), and can reuse the test development effort behind existing SBST programs. We applied the methodology to two complex, fully pipelined processors. Results show that our methodology provides fault coverage improvements of up to 15% (12 % on average) for the entire processor, and fault coverage improvements of 22 % for the pipeline logic, compared to a conventional SBST approach

    Systematic software-based self-test for pipelined processors

    No full text
    Software-based self-test (SBST) has recently emerged as an effective methodology for the manufacturing test of processors and other components in Systems-on-Chip (SoCs). By moving test related functions from external resources to the SoC’s interior, in the form of test programs that the on-chip processor executes, SBST eliminates the need for high-cost testers, and enables high-quality at-speed testing. Thus far, SBST approaches have focused almost exclusively on the functional (directly programmer visible) components of the processor. In this paper, we analyze the challenges involved in testing an important component of modern processors, namely, the pipelining logic, and propose a systematic SBST methodology to address them. We first demonstrate that SBST programs that only target the functional components of the processor are insufficient to test the pipeline logic, resulting in a significant loss of fault coverage. We further identify the testability hotspots in the pipeline logic. Finally, we develop a systematic SBST methodology that enhances existing SBST programs to comprehensively test the pipeline logic. The proposed methodology is complementary to previous SBST techniques that target functional components (their results can form the input to our methodology), and can reuse the test development effort behind existing SBST programs. We applied the methodology to two complex, fully pipelined processors. Results show that our methodology provides fault coverage improvements of up to 15% (12% on average) for the entire processor, and fault coverage improvements of 22% for the pipeline logic, compared to a conventional SBST approach

    Analyzing the Resilience to SEUs of an Image Data Compression Core in a COTS SRAM FPGA

    No full text
    In this paper, we evaluate the error resilience of an image data compression IP core, an FPGA-based accelerator of the CCSDS 121.0-B-2 algorithm used to compress the ESA PROBA-3 ASPIICS Coronagraph System Payload image data. We have enhanced a fault injection platform previously proposed for the SEU evaluation of FPGA soft processor cores to interface with the target image data compression IP core and calculate the required for failure analysis image quality metrics. Through an extensive fault injection campaign, we analyze the vulnerability of the image compression core against Single Event Upsets (SEU) in a SRAM FPGA configuration memory. The soft errors are classified and evaluated depending on their effects in the operation of the compression core and the quality of the reconstructed images based on the structural similarity index metric (SSIM). The experimental fault injection results demonstrate error resiliency inherent to the image compression algorithm implementation that can be exploited to tradeoff an acceptable lossless compression performance degradation or a negligible effect on compression fidelity for significant savings in FPGA resource utilization (23% LUTs and 17% FFs) using a selective protection of the compression core modules
    corecore