A framework for system dependability validation under the influence of intrinsic parameters fluctuation
This paper presents a framework for analyzing and evaluating the effects of cell failures induced by intrinsic parameters fluctuation (IPF) on system dependability. The evaluation method generates an actual cell failure model and reproduces realistic hardware-software interaction conditions, so that the actual error pattern can be captured. The case study is the impact of cell failures in the L1 data cache of a general-purpose microprocessor. The failure models are generated from the individual and combined impact of IPF sources in nanometer-scale Ultra-Thin-Body Silicon-on-Insulator (UTB-SOI) transistors on 6T-SRAM cell stability. A novel fault injection mechanism propagates errors dynamically at system level by modifying the data of cache transactions according to the error(s) incurred. By applying a representative system workload, using a well-selected suite of real benchmark programs, this study demonstrates that the framework: 1) provides an accurate, user-visible description of how cell failures induced by IPF sources at the lower levels of abstraction manifest at the higher levels of abstraction, 2) links the individual and combined impact of IPF sources to the corresponding system-level implications, giving system designers a tool for including IPF impacts in the design plan, 3) allows a detailed simulation of a system-level environment in the presence of IPF-induced cell failures within an acceptable period of time by using a look-up file technique, and thus offers a foundation for system dependability studies that require vast statistical models, 4) offers highly credible evaluation results because it is based on the actual error pattern incurred in the system, and 5) helps improve system reliability by offering valuable insights for choosing an optimal fault tolerance technique for an L1 cache with a high failure rate.
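The look-up-based fault injection described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the table layout, the `FAULT_MAP` and `inject_on_read` names, and the choice to model a failed cell as a bit flip on read are hypothetical and not taken from the paper.

```python
# Hypothetical look-up table: (set index, way, word offset) -> bit mask of failed cells.
# In a real flow this table would be loaded from a look-up file produced by
# device-level simulation; the entries below are made up for illustration.
FAULT_MAP = {
    (3, 0, 1): 0b0000_0100,  # one failed cell in this word
    (7, 1, 0): 0b1000_0001,  # two failed cells in this word
}


def inject_on_read(set_idx, way, word, data, fault_map=FAULT_MAP):
    """Return the data word as the CPU would see it after cell failures.

    Assumption: a failed cell returns the inverted bit on read (a bit flip).
    Words with no entry in the table are returned unchanged.
    """
    mask = fault_map.get((set_idx, way, word), 0)
    return data ^ mask
```

A cache model would call `inject_on_read` on every read transaction, so the workload sees the corrupted value only when it actually touches a faulty word.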
Impact of intrinsic parameter fluctuation on the fault tolerance of L1 data cache
As semiconductor process technology continues to scale deeper into the nanometer region, intrinsic parameter fluctuations will aggressively affect the performance and reliability of future microprocessors and System-on-Chip (SoC) applications. These systems require large SRAM arrays that occupy an increasing fraction of the chip real estate. To investigate the impact of various sources of intrinsic parameter fluctuation (IPF) from a system point of view, a framework that bridges architecture-level and device-level simulation is used for data caches built from transistors at the 25 nm, 18 nm and 13 nm technology nodes. This study found that IPF has no significant impact on data cache memory systems built with 25 nm transistors, while increasing the memory cell ratio (β) to two overcomes the IPF impact at 18 nm. However, the 13 nm data cache could not operate even with a higher cell ratio. Common cache memory fault detection and correction techniques, such as ECC and redundancy, can only partially remove the transaction errors caused by these fluctuation sources.
Fault tolerance of L1 data cache memory induced by intrinsic parameters fluctuation in sub 10nm UTB-SOI MOSFETs
Models at higher levels of abstraction (system level) that can incorporate effects at lower levels of abstraction (process/transistor) are currently in demand. This thesis addresses the issues involved in enabling a computer system simulation model in the presence of cell failures in the L1 data cache caused by Intrinsic Parameters Fluctuation (IPF). These time-independent, transistor-level sources of variation are random in nature, which makes it difficult for the designer to account for and overcome IPF impact in the design plan. Such a computer model is vital for credibly analyzing and evaluating the effectiveness of L1 cache fault tolerance techniques in controlling the implications of IPF cell failures on microprocessor reliability and yield. The objectives of this thesis are (i) to devise a framework for simulating a system-level environment in the presence of L1 data cache cell failures caused by IPF, (ii) to introduce an evaluation method for deducing the effectiveness of L1 cache fault tolerance techniques in handling the actual error pattern caused by IPF cell failures in a computer system under test and workload conditions, and (iii) to investigate the implications of L1 data cache faults induced by the individual and combined impact of IPF sources on the reliability of a general-purpose microprocessor. The case study of this thesis is the impact of cell failures in the data array of the L1 data cache in the Intel StrongARM SA-1110 microprocessor. The failure models are generated from the individual and combined impact of Random Discrete Dopants in the source/drain regions (RDD), Line Edge Roughness (LER) and Body Thickness Variation (BTV), the main sources of IPF in the next generations of nanometre-scale Ultra-Thin-Body Silicon-on-Insulator (UTB-SOI) transistors, on six-transistor static random access memory (6T SRAM) cell stability.
The L1 cache fault tolerance techniques evaluated are hardware redundancy, parity check, Hamming single error correction double error detection (SECDED), and Hamming triple error detection (TED). It was found that the rate of read-faulty cells will increase rapidly in 6T SRAM caches with continued scaling of UTB-SOI devices beyond 10 nm gate length. Conventional L1 cache fault tolerance techniques, i.e. hardware redundancy, parity check, and SECDED, might be able to contain the implications of IPF cell failures in L1 data caches based on 7.5 nm and 5 nm UTB-SOI devices, particularly when the 6T SRAM is designed with a cell ratio of two. However, the effectiveness of these techniques was found to be sensitive to the existence of any faulty word in the cache. Hence, their immunity against any transient fault that might occur during system operation will degrade significantly. Experimental results showed that in an L1 data cache based on 5 nm UTB-SOI devices, hybrid hardware redundancy with TED would achieve a microprocessor chip yield of 68.2 percent in applications that tolerate a performance loss bound of 10 percent. This indicates that employing these techniques in industry will help maintain 6T SRAM cache scalability even with the increasing impact of IPF.
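To make the SECDED behaviour referred to above concrete, the following is a minimal sketch of an extended Hamming(8,4) code that corrects any single-bit error and detects any double-bit error. It is a textbook construction on a 4-bit word, not the thesis's implementation; the word size, the bit layout and the function names `secded_encode` and `secded_decode` are assumptions, and the TED variant is not shown.

```python
def secded_encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits).

    Layout (positions 1..7): p1 p2 d0 p4 d1 d2 d3, followed by an overall parity bit.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    code = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    overall = 0
    for bit in code:
        overall ^= bit
    return code + [overall]


def secded_decode(code):
    """Return (data, status); status is 'ok', 'corrected' or 'double_error'."""
    code = list(code)
    # Syndrome: XOR of the 1-based positions (1..7) that hold a 1.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos - 1]:
            syndrome ^= pos
    # Parity over all 8 bits; it is 0 for a valid codeword.
    overall = 0
    for bit in code:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single error and correct it.
        index = syndrome - 1 if syndrome else 7
        code[index] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "double_error"
    data = code[2] | (code[4] << 1) | (code[5] << 2) | (code[6] << 3)
    return data, status
```

A word with one failed cell decodes correctly, while a word with two failed cells is only flagged. This is one way to see why such a code has no margin left for a transient fault once a word already contains a permanent faulty cell.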