602 research outputs found

    Design techniques for xilinx virtex FPGA configuration memory scrubbers

    Get PDF
    SRAM-based FPGAs are in-field reconfigurable an unlimited number of times. This characteristic, together with their high performance and high logic density, proves to be very convenient for a number of ground and space level applications. One drawback of this technology is that it is susceptible to ionizing radiation, and this sensitivity increases with technology scaling. This is a first order concern for applications in harsh radiation environments, and starts to be a concern for high reliability ground applications. Several techniques exist for coping with radiation effects at user application. In order to be effective they need to be complemented with configuration memory scrubbing, which allows error mitigation and prevents failures due to error accumulation. Depending on the radiation environment and on the system dependability requirements, the configuration scrubber design can become more or less complex. This paper classifies and presents current and novel design methodologies and architectures for SRAM-based FPGAs, and in particular for Xilinx Virtex-4QV/5QV, configuration memory scrubbers

    FPGA-based Image Analysis System for Cotton Classing

    Get PDF
    The design and implementation of an FPGA (field-programmable gate array) based image analysis system was undertaken to replace an older system whose components have become obsolete. Video from an analog camera is digitized by a video decoder. The data from the video decoder is stored in memory and then processed using an FPGA. The results are then transmitted over a universal serial bus (USB) to a host personal computer for additional processing. The system also controls the timing of a flash to correctly capture the images; it measures color and reflectance and is used to classify the quality of raw cotton by determining the concentration of impurities (e.g. leaves or trash). The original system is first described and the need for upgrading presented. The goals of the new system are then specified and its implementation presented along with the design space tradeoffs that were considered. Finally, the results obtained from using the new system are presented to demonstrate its effectiveness

    Sustainable Fault-handling Of Reconfigurable Logic Using Throughput-driven Assessment

    Get PDF
    A sustainable Evolvable Hardware (EH) system is developed for SRAM-based reconfigurable Field Programmable Gate Arrays (FPGAs) using outlier detection and group testing-based assessment principles. The fault diagnosis methods presented herein leverage throughput-driven, relative fitness assessment to maintain resource viability autonomously. Group testing-based techniques are developed for adaptive input-driven fault isolation in FPGAs, without the need for exhaustive testing or coding-based evaluation. The techniques maintain the device operational, and when possible generate validated outputs throughout the repair process. Adaptive fault isolation methods based on discrepancy-enabled pair-wise comparisons are developed. By observing the discrepancy characteristics of multiple Concurrent Error Detection (CED) configurations, a method for robust detection of faults is developed based on pairwise parallel evaluation using Discrepancy Mirror logic. The results from the analytical FPGA model are demonstrated via a self-healing, self-organizing evolvable hardware system. Reconfigurability of the SRAM-based FPGA is leveraged to identify logic resource faults which are successively excluded by group testing using alternate device configurations. This simplifies the system architect\u27s role to definition of functionality using a high-level Hardware Description Language (HDL) and system-level performance versus availability operating point. System availability, throughput, and mean time to isolate faults are monitored and maintained using an Observer-Controller model. Results are demonstrated using a Data Encryption Standard (DES) core that occupies approximately 305 FPGA slices on a Xilinx Virtex-II Pro FPGA. With a single simulated stuck-at-fault, the system identifies a completely validated replacement configuration within three to five positive tests. The approach demonstrates a readily-implemented yet robust organic hardware application framework featuring a high degree of autonomous self-control

    Fault Tolerant Electronic System Design

    Get PDF
    Due to technology scaling, which means reduced transistor size, higher density, lower voltage and more aggressive clock frequency, VLSI devices may become more sensitive against soft errors. Especially for those devices used in safety- and mission-critical applications, dependability and reliability are becoming increasingly important constraints during the development of system on/around them. Other phenomena (e.g., aging and wear-out effects) also have negative impacts on reliability of modern circuits. Recent researches show that even at sea level, radiation particles can still induce soft errors in electronic systems. On one hand, processor-based system are commonly used in a wide variety of applications, including safety-critical and high availability missions, e.g., in the automotive, biomedical and aerospace domains. In these fields, an error may produce catastrophic consequences. Thus, dependability is a primary target that must be achieved taking into account tight constraints in terms of cost, performance, power and time to market. With standards and regulations (e.g., ISO-26262, DO-254, IEC-61508) clearly specify the targets to be achieved and the methods to prove their achievement, techniques working at system level are particularly attracting. On the other hand, Field Programmable Gate Array (FPGA) devices are becoming more and more attractive, also in safety- and mission-critical applications due to the high performance, low power consumption and the flexibility for reconfiguration they provide. Two types of FPGAs are commonly used, based on their configuration memory cell technology, i.e., SRAM-based and Flash-based FPGA. For SRAM-based FPGAs, the SRAM cells of the configuration memory highly susceptible to radiation induced effects which can leads to system failure; and for Flash-based FPGAs, even though their non-volatile configuration memory cells are almost immune to Single Event Upsets induced by energetic particles, the floating gate switches and the logic cells in the configuration tiles can still suffer from Single Event Effects when hit by an highly charged particle. So analysis and mitigation techniques for Single Event Effects on FPGAs are becoming increasingly important in the design flow especially when reliability is one of the main requirements

    DYRE: a DYnamic REconfigurable solution to increase GPGPU's reliability

    Get PDF
    General-purpose graphics processing units (GPGPUs) are extensively used in high-performance computing. However, it is well known that these devices’ reliability may be limited by the rising of faults at the hardware level. This work introduces a flexible solution to detect and mitigate permanent faults affecting the execution units in these parallel devices. The proposed solution is based on adding some spare modules to perform two in-field operations: detecting and mitigating faults. The solution takes advantage of the regularity of the execution units in the device to avoid significant design changes and reduce the overhead. The proposed solution was evaluated in terms of reliability improvement and area, performance, and power overhead costs. For this purpose, we resorted to a micro-architectural open-source GPGPU model (FlexGripPlus). Experimental results show that the proposed solution can extend the reliability by up to 57%, with overhead costs lower than 2% and 8% in area and power, respectively

    Innovative Techniques for Testing and Diagnosing SoCs

    Get PDF
    We rely upon the continued functioning of many electronic devices for our everyday welfare, usually embedding integrated circuits that are becoming even cheaper and smaller with improved features. Nowadays, microelectronics can integrate a working computer with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC). SoCs are also employed on automotive safety-critical applications, but need to be tested thoroughly to comply with reliability standards, in particular the ISO26262 functional safety for road vehicles. The goal of this PhD. thesis is to improve SoC reliability by proposing innovative techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals, and GPUs. The proposed approaches in the sequence appearing in this thesis are described as follows: 1. Embedded Memory Diagnosis: Memories are dense and complex circuits which are susceptible to design and manufacturing errors. Hence, it is important to understand the fault occurrence in the memory array. In practice, the logical and physical array representation differs due to an optimized design which adds enhancements to the device, namely scrambling. This part proposes an accurate memory diagnosis by showing the efforts of a software tool able to analyze test results, unscramble the memory array, map failing syndromes to cell locations, elaborate cumulative analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing syndromes were analyzed as case studies gathered on an industrial automotive 32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually, and results were confirmed by real photos taken from a microscope. 2. Functional Test Pattern Generation: The key for a successful test is the pattern applied to the device. They can be structural or functional; the former usually benefits from embedded test modules targeting manufacturing errors and is only effective before shipping the component to the client. The latter, on the other hand, can be applied during mission minimally impacting on performance but is penalized due to high generation time. However, functional test patterns may benefit for having different goals in functional mission mode. Part III of this PhD thesis proposes three different functional test pattern generation methods for CPU cores embedded in SoCs, targeting different test purposes, described as follows: a. Functional Stress Patterns: Are suitable for optimizing functional stress during I Operational-life Tests and Burn-in Screening for an optimal device reliability characterization b. Functional Power Hungry Patterns: Are suitable for determining functional peak power for strictly limiting the power of structural patterns during manufacturing tests, thus reducing premature device over-kill while delivering high test coverage c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns with functional ones, allowing its execution periodically during mission. In addition, an external hardware communicating with a devised SBST was proposed. It helps increasing in 3% the fault coverage by testing critical Hardly Functionally Testable Faults not covered by conventional SBST patterns. An automatic functional test pattern generation exploiting an evolutionary algorithm maximizing metrics related to stress, power, and fault coverage was employed in the above-mentioned approaches to quickly generate the desired patterns. The approaches were evaluated on two industrial cases developed by STMicroelectronics; 8051-based and a 32-bit Power Architecture SoCs. Results show that generation time was reduced upto 75% in comparison to older methodologies while increasing significantly the desired metrics. 3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices are suitable for generating structural patterns, testing and activating mitigation techniques, and validating robust hardware and software applications. GPGPUs are known for fast parallel computation used in high performance computing and advanced driver assistance where reliability is the key point. Moreover, GPGPU manufacturers do not provide design description code due to content secrecy. Therefore, commercial fault injectors using the GPGPU model is unfeasible, making radiation tests the only resource available, but are costly. In the last part of this thesis, we propose a software implemented fault injector able to inject bit-flip in memory elements of a real GPGPU. It exploits a software debugger tool and combines the C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in program variables. The goal is to validate robust parallel algorithms by studying fault propagation or activating redundancy mechanisms they possibly embed. The effectiveness of the tool was evaluated on two robust applications: redundant parallel matrix multiplication and floating point Fast Fourier Transform

    FPGA ARCHITECTURE AND VERIFICATION OF BUILT IN SELF-TEST (BIST) FOR 32-BIT ADDER/SUBTRACTER USING DE0-NANO FPGA AND ANALOG DISCOVERY 2 HARDWARE

    Get PDF
    The integrated circuit (IC) is an integral part of everyday modern technology, and its application is very attractive to hardware and software design engineers because of its versatility, integration, power consumption, cost, and board area reduction. IC is available in various types such as Field Programming Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), System on Chip (SoC) architecture, Digital Signal Processing (DSP), microcontrollers (μC), and many more. With technology demand focused on faster, low power consumption, efficient IC application, design engineers are facing tremendous challenges in developing and testing integrated circuits that guaranty functionality, high fault coverage, and reliability as the transistor technology is shrinking to the point where manufacturing defects of ICs are affecting yield which associates with the increased cost of the part. The competitive IC market is pressuring manufactures of ICs to develop and market IC in a relatively quick turnaround which in return requires design and verification engineers to develop an integrated self-test structure that would ensure fault-free and the quality product is delivered on the market. 70-80% of IC design is spent on verification and testing to ensure high quality and reliability for the enduser. To test complex and sophisticated IC designs, the verification engineers must produce laborious and costly test fixtures which affect the cost of the part on the competitive market. To avoid increasing the part cost due to yield and test time to the end-user and to keep up with the competitive market many IC design engineers are deviating from complex external test fixture approach and are focusing on integrating Built-in Self-Test (BIST) or Design for Test (DFT) techniques onto IC’s which would reduce time to market but still guarantee high coverage for the product. Understanding the BIST, the architecture, as well as the application of IC, must be understood before developing IC. The architecture of FPGA is elaborated in this paper followed by several BIST techniques and applications of those BIST relative to FPGA, SoC, analog to digital (ADC), or digital to analog converters (DAC) that are integrated on IC. Paper is concluded with verification of BIST for the 32-bit adder/subtracter designed in Quartus II software using the Analog Discovery 2 module as stimulus and DE0-NANO FPGA board for verification

    An autonomous and intelligent system for rotating machinery diagnostics

    Get PDF
    Rotating machinery diagnostics (RMD) is a process of evaluating the condition of their components by acquiring a number of measurements and extracting condition related information using signal processing algorithms. A reliable RMD system is fundamental for condition based maintenance programmes to reduce maintenance cost and risk. It must be able to detect any abnormalities at early stages to allow preventing severe performance degradation, avoid economic losses and/or catastrophic failures. A conventional RMD system consists of sensing elements (transducers) and data acquisition system with a compliant software package. Such system is bulky and costly in practical deployment. The recent advancement in micro-scaled electronics have enabled wide spectrum of system design and capabilities at embedded scale. Micro electromechanical system (MEMS) based sensing technologies offer significant savings in terms of system’s price and size. Microcontroller units with embedded computation and sensing interface have enabled system-on-chip design of RMD system within a single sensing node. This research aims at exploiting this growth of microelectronics science to develop a remote and intelligent system to aid maintenance procedures. System’s operation is independent from central processing platform or operator’s analysis. Features include on-board time domain based statistical parameters calculations, frequency domain analysis techniques and a time controlled monitoring tasks within the limitations of its energy budget. A working prototype is developed to test the concept of the research. Two experimental testbeds are used to validate the performance of developed system: DC motor with rotor unbalance and 1.1kW induction motor with phase imbalance. By establishing a classification model with several training samples, the developed system achieved an accuracy of 93% in detecting quantified seeded faults while consumes minimum power at 16.8mW. The performance of developed system demonstrates its strong potential for full industry deployment and compliance

    Development and certification of mixed-criticality embedded systems based on probabilistic timing analysis

    Get PDF
    An increasing variety of emerging systems relentlessly replaces or augments the functionality of mechanical subsystems with embedded electronics. For quantity, complexity, and use, the safety of such subsystems is an increasingly important matter. Accordingly, those systems are subject to safety certification to demonstrate system's safety by rigorous development processes and hardware/software constraints. The massive augment in embedded processors' complexity renders the arduous certification task significantly harder to achieve. The focus of this thesis is to address the certification challenges in multicore architectures: despite their potential to integrate several applications on a single platform, their inherent complexity imperils their timing predictability and certification. Recently, the Measurement-Based Probabilistic Timing Analysis (MBPTA) technique emerged as an alternative to deal with hardware/software complexity. The innovation that MBPTA brings about is, however, a major step from current certification procedures and standards. The particular contributions of this Thesis include: (i) the definition of certification arguments for mixed-criticality integration upon multicore processors. In particular we propose a set of safety mechanisms and procedures as required to comply with functional safety standards. For timing predictability, (ii) we present a quantitative approach to assess the likelihood of execution-time exceedance events with respect to the risk reduction requirements on safety standards. To this end, we build upon the MBPTA approach and we present the design of a safety-related source of randomization (SoR), that plays a key role in the platform-level randomization needed by MBPTA. And (iii) we evaluate current certification guidance with respect to emerging high performance design trends like caches. Overall, this Thesis pushes the certification limits in the use of multicore and MBPTA technology in Critical Real-Time Embedded Systems (CRTES) and paves the way towards their adoption in industry.Una creciente variedad de sistemas emergentes reemplazan o aumentan la funcionalidad de subsistemas mecánicos con componentes electrónicos embebidos. El aumento en la cantidad y complejidad de dichos subsistemas electrónicos así como su cometido, hacen de su seguridad una cuestión de creciente importancia. Tanto es así que la comercialización de estos sistemas críticos está sujeta a rigurosos procesos de certificación donde se garantiza la seguridad del sistema mediante estrictas restricciones en el proceso de desarrollo y diseño de su hardware y software. Esta tesis trata de abordar los nuevos retos y dificultades dadas por la introducción de procesadores multi-núcleo en dichos sistemas críticos: aunque su mayor rendimiento despierta el interés de la industria para integrar múltiples aplicaciones en una sola plataforma, suponen una mayor complejidad. Su arquitectura desafía su análisis temporal mediante los métodos tradicionales y, asimismo, su certificación es cada vez más compleja y costosa. Con el fin de lidiar con estas limitaciones, recientemente se ha desarrollado una novedosa técnica de análisis temporal probabilístico basado en medidas (MBPTA). La innovación de esta técnica, sin embargo, supone un gran cambio cultural respecto a los estándares y procedimientos tradicionales de certificación. En esta línea, las contribuciones de esta tesis están agrupadas en tres ejes principales: (i) definición de argumentos de seguridad para la certificación de aplicaciones de criticidad-mixta sobre plataformas multi-núcleo. Se definen, en particular, mecanismos de seguridad, técnicas de diagnóstico y reacción de faltas acorde con el estándar IEC 61508 sobre una arquitectura multi-núcleo de referencia. Respecto al análisis temporal, (ii) presentamos la cuantificación de la probabilidad de exceder un límite temporal y su relación con los requisitos de reducción de riesgos derivados de los estándares de seguridad funcional. Con este fin, nos basamos en la técnica MBPTA y presentamos el diseño de una fuente de números aleatorios segura; un componente clave para conseguir las propiedades aleatorias requeridas por MBPTA a nivel de plataforma. Por último, (iii) extrapolamos las guías actuales para la certificación de arquitecturas multi-núcleo a una solución comercial de 8 núcleos y las evaluamos con respecto a las tendencias emergentes de diseño de alto rendimiento (caches). Con estas contribuciones, esta tesis trata de abordar los retos que el uso de procesadores multi-núcleo y MBPTA implican en el proceso de certificación de sistemas críticos de tiempo real y facilita, de esta forma, su adopción por la industria.Postprint (published version
    • …
    corecore