441 research outputs found
Agent Based Test and Repair of Distributed Systems
This article demonstrates how to use intelligent agents for testing and repairing a distributed system, whose elements may or may not have embedded BIST (Built-In Self-Test) and BISR (Built-In Self-Repair) facilities. Agents are software modules that perform monitoring, diagnosis and repair of the faults. They form together a society whose members communicate, set goals and solve tasks. An experimental solution is presented, and future developments of the proposed approach are explore
Flash-memories in Space Applications: Trends and Challenges
Nowadays space applications are provided with a processing power absolutely overcoming the one available just a few years ago. Typical mission-critical space system applications include also the issue of solid-state recorder(s). Flash-memories are nonvolatile, shock-resistant and power-economic, but in turn have different drawbacks. A solid-state recorder for space applications should satisfy many different constraints especially because of the issues related to radiations: proper countermeasures are needed, together with EDAC and testing techniques in order to improve the dependability of the whole system. Different and quite often contrasting dimensions need to be explored during the design of a flash-memory based solid- state recorder. In particular, we shall explore the most important flash-memory design dimensions and trade-offs to tackle during the design of flash-based hard disks for space application
Infrastructures and Algorithms for Testable and Dependable Systems-on-a-Chip
Every new node of semiconductor technologies provides further miniaturization and higher performances, increasing the number of advanced functions that electronic products can offer. Silicon area is now so cheap that industries can integrate in a single chip usually referred to as System-on-Chip (SoC), all the components and functions that historically were placed on a hardware board. Although adding such advanced functionality can benefit users, the manufacturing process is becoming finer and denser, making chips more susceptible to defects. Todayâs very deep-submicron semiconductor technologies (0.13 micron and below) have reached susceptibility levels that put conventional semiconductor manufacturing at an impasse. Being able to rapidly develop, manufacture, test, diagnose and verify such complex new chips and products is crucial for the continued success of our economy at-large. This trend is expected to continue at least for the next ten years making possible the design and production of 100 million transistor chips.
To speed up the research, the National Technology Roadmap for Semiconductors identified in 1997 a number of major hurdles to be overcome. Some of these hurdles are related to test and dependability.
Test is one of the most critical tasks in the semiconductor production process where Integrated Circuits (ICs) are tested several times starting from the wafer probing to the end of production test. Test is not only necessary to assure fault free devices but it also plays a key role in analyzing defects in the manufacturing process. This last point has high relevance since increasing time-to-market pressure on semiconductor fabrication often forces foundries to start volume production on a given semiconductor technology node before reaching the defect densities, and hence yield levels, traditionally obtained at that stage. The feedback derived from test is the only way to analyze and isolate many of the defects in todayâs processes and to increase processâs yield.
With the increasing need of high quality electronic products, at each new physical assembly level, such as board and system assembly, test is used for debugging, diagnosing and repairing the sub-assemblies in their new environment. Similarly, the increasing reliability, availability and serviceability requirements, lead the users of high-end products performing periodic tests in the field throughout the full life cycle.
To allow advancements in each one of the above scaling trends, fundamental changes are expected to emerge in different Integrated Circuits (ICs) realization disciplines such as IC design, packaging and silicon process. These changes have a direct impact on test methods, tools and equipment. Conventional test equipment and methodologies will be inadequate to assure high quality levels. On chip specialized block dedicated to test, usually referred to as Infrastructure IP (Intellectual Property), need to be developed and included in the new complex designs to assure that new chips will be adequately tested, diagnosed, measured, debugged and even sometimes repaired.
In this thesis, some of the scaling trends in designing new complex SoCs will be analyzed one at a time, observing their implications on test and identifying the key hurdles/challenges to be addressed. The goal of the remaining of the thesis is the presentation of possible solutions. It is not sufficient to address just one of the challenges; all must be met at the same time to fulfill the market requirements
Innovative Techniques for Testing and Diagnosing SoCs
We rely upon the continued functioning of many electronic devices for our everyday welfare,
usually embedding integrated circuits that are becoming even cheaper and smaller
with improved features. Nowadays, microelectronics can integrate a working computer
with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC).
SoCs are also employed on automotive safety-critical applications, but need to be tested
thoroughly to comply with reliability standards, in particular the ISO26262 functional
safety for road vehicles.
The goal of this PhD. thesis is to improve SoC reliability by proposing innovative
techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals,
and GPUs. The proposed approaches in the sequence appearing in this thesis are described
as follows:
1. Embedded Memory Diagnosis: Memories are dense and complex circuits which
are susceptible to design and manufacturing errors. Hence, it is important to understand
the fault occurrence in the memory array. In practice, the logical and physical
array representation differs due to an optimized design which adds enhancements to
the device, namely scrambling. This part proposes an accurate memory diagnosis
by showing the efforts of a software tool able to analyze test results, unscramble
the memory array, map failing syndromes to cell locations, elaborate cumulative
analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing
syndromes were analyzed as case studies gathered on an industrial automotive
32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually,
and results were confirmed by real photos taken from a microscope.
2. Functional Test Pattern Generation: The key for a successful test is the pattern applied
to the device. They can be structural or functional; the former usually benefits
from embedded test modules targeting manufacturing errors and is only effective
before shipping the component to the client. The latter, on the other hand, can be
applied during mission minimally impacting on performance but is penalized due
to high generation time. However, functional test patterns may benefit for having
different goals in functional mission mode. Part III of this PhD thesis proposes
three different functional test pattern generation methods for CPU cores embedded
in SoCs, targeting different test purposes, described as follows:
a. Functional Stress Patterns: Are suitable for optimizing functional stress during
I
Operational-life Tests and Burn-in Screening for an optimal device reliability
characterization
b. Functional Power Hungry Patterns: Are suitable for determining functional
peak power for strictly limiting the power of structural patterns during manufacturing
tests, thus reducing premature device over-kill while delivering high test
coverage
c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns
with functional ones, allowing its execution periodically during mission.
In addition, an external hardware communicating with a devised SBST was proposed.
It helps increasing in 3% the fault coverage by testing critical Hardly
Functionally Testable Faults not covered by conventional SBST patterns.
An automatic functional test pattern generation exploiting an evolutionary algorithm
maximizing metrics related to stress, power, and fault coverage was employed
in the above-mentioned approaches to quickly generate the desired patterns. The
approaches were evaluated on two industrial cases developed by STMicroelectronics;
8051-based and a 32-bit Power Architecture SoCs. Results show that generation
time was reduced upto 75% in comparison to older methodologies while
increasing significantly the desired metrics.
3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices
are suitable for generating structural patterns, testing and activating mitigation techniques,
and validating robust hardware and software applications. GPGPUs are
known for fast parallel computation used in high performance computing and advanced
driver assistance where reliability is the key point. Moreover, GPGPU manufacturers
do not provide design description code due to content secrecy. Therefore,
commercial fault injectors using the GPGPU model is unfeasible, making radiation
tests the only resource available, but are costly. In the last part of this thesis, we
propose a software implemented fault injector able to inject bit-flip in memory elements
of a real GPGPU. It exploits a software debugger tool and combines the
C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in
program variables. The goal is to validate robust parallel algorithms by studying
fault propagation or activating redundancy mechanisms they possibly embed. The
effectiveness of the tool was evaluated on two robust applications: redundant parallel
matrix multiplication and floating point Fast Fourier Transform
EDACs and test integration strategies for NAND flash memories
Mission-critical applications usually presents several critical issues: the required level of dependability of the whole mission always implies to address different and contrasting dimensions and to evaluate the tradeoffs among them. A mass-memory device is always needed in all mission-critical applications: NAND flash-memories could be used for this goal. Error Detection And Correction (EDAC) techniques are needed to improve dependability of flash-memory devices. However also testing strategies need to be explored in order to provide highly dependable systems. Integrating these two main aspects results in providing a fault-tolerant mass-memory device, but no systematic approach has so far been proposed to consider them as a whole. As a consequence a novel strategy integrating a particular code-based design environment with newly selected testing strategies is presented in this pape
Memory built-in self-repair and correction for improving yield: a review
Nanometer memories are highly prone to defects due to dense structure, necessitating memory built-in self-repair as a must-have feature to improve yield. Todayâs system-on-chips contain memories occupying an area as high as 90% of the chip area. Shrinking technology uses stricter design rules for memories, making them more prone to manufacturing defects. Further, using 3D-stacked memories makes the system vulnerable to newer defects such as those coming from through-silicon-vias (TSV) and micro bumps. The increased memory size is also resulting in an increase in soft errors during system operation. Multiple memory repair techniques based on redundancy and correction codes have been presented to recover from such defects and prevent system failures. This paper reviews recently published memory repair methodologies, including various built-in self-repair (BISR) architectures, repair analysis algorithms, in-system repair, and soft repair handling using error correcting codes (ECC). It provides a classification of these techniques based on method and usage. Finally, it reviews evaluation methods used to determine the effectiveness of the repair algorithms. The paper aims to present a survey of these methodologies and prepare a platform for developing repair methods for upcoming-generation memories
What is the Path to Fast Fault Simulation?
Motivated by the recent advances in fast fault simulation techniques for large combinational circuits, a panel discussion has been organized for the 1988 International Test Conference. This paper is a collective account of the position statements offered by the panelists
Design and Verification of a Dual Port RAM Using UVM Methodology
Data-intensive applications such as Deep Learning, Big Data, and Computer Vision have resulted in more demand for on-chip memory storage. Hence, state of the art Systems on Chips (SOCs) have a memory that occupies somewhere between 50% to 90 % of the die space. Extensive Research is being done in the field of memory technology to improve the efficiency of memory packaging. This effort has not always been successful because densely packed memory structures can experience defects during the fabrication process. Thus, it is critical to test the embedded memory modules once they are taped out. Along with testing, functional verification of a module makes sure that the design works the way it has been intended to perform. This paper proposes a built-in self-test (BIST) to validate a Dual Port Static RAM module and a complete layered test bench to verify the moduleâs operation functionally. The BIST has been designed using a finite state machine and has been targeted against most of the general SRAM faults in a given linear time constraint of O(23n). The layered test bench has been designed using Universal Verification Methodology (UVM), a standardized class library which has increased the re-usability and automation to the existing design verification language, SystemVerilog
Agent Based Test and Repair of Distributed Systems
This article demonstrates how to use intelligent agents for testing and repairing a distributed system, whose elements may or may not have embedded BIST (Built-In Self-Test) and BISR (Built-In Self-Repair) facilities.
Agents are software modules that perform monitoring, diagnosis and repair of the faults. They form together a society whose members communicate, set goals and solve tasks.
An experimental solution is presented, and future developments of the proposed approach are explored
Design-for-delay-testability techniques for high-speed digital circuits
The importance of delay faults is enhanced by the ever increasing clock rates and decreasing geometry sizes of nowadays' circuits. This thesis focuses on the development of Design-for-Delay-Testability (DfDT) techniques for high-speed circuits and embedded cores. The rising costs of IC testing and in particular the costs of Automatic Test Equipment are major concerns for the semiconductor industry. To reverse the trend of rising testing costs, DfDT is\ud
getting more and more important
- âŠ