FAULT LOCATION IN A SEMICONDUCTOR RANDOM-ACCESS MEMORY UNIT
A semiconductor random-access memory unit (RAM unit) is a connection
of RAM chips, Data Cable, Chip Select Cable, and Address Cable so that each
storage element can be selected for writing or reading independent of previous
write or read. The faulty RAM unit is represented by a model consisting of
four types of faults: RAM chip faults, D-fault, CS-fault, and A-fault. The
testing of a RAM unit and locating faults to the RAM chips or wires in the
various cables is considered. A set of six tests has been designed to diagnose
the faults in the model. The tests are used in defining a relation, "test tj is
invalid," on the fault model. The diagnostic graph of a RAM unit is drawn by
using this relation, and a sequence in which the tests have to be performed is
obtained from this graph. By using this sequence, faulty components in a RAM
unit are located when at most one type of fault in the model is present.
The symmetric array organization of storage elements in RAM chips is
used in developing tests of minimal length. Test generation uses the
operations increment, decrement, compare, and rotate, and is quite easy to program.
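The four operations named in the abstract can be illustrated with a small simulation. The sketch below is not one of the paper's six tests, which the abstract does not spell out. It assumes a toy 16-word RAM with a hypothetical stuck-at-0 wire in the address cable or data cable, and every name in it (SimulatedRam, address_test, data_test) is invented for illustration.

```python
WORD_BITS = 8   # assumed word width of the toy RAM
ADDR_BITS = 4   # assumed address width (16 words)


def rotate_left(value, bits=WORD_BITS):
    """Rotate a word left by one bit position."""
    mask = (1 << bits) - 1
    return ((value << 1) & mask) | (value >> (bits - 1))


class SimulatedRam:
    """Toy RAM unit; optional stuck-at-0 wires model an A-fault or D-fault."""

    def __init__(self, stuck_addr_bit=None, stuck_data_bit=None):
        self.cells = [0] * (1 << ADDR_BITS)
        self.stuck_addr_bit = stuck_addr_bit
        self.stuck_data_bit = stuck_data_bit

    def _decode(self, addr):
        # A stuck address wire makes two addresses select the same cell.
        if self.stuck_addr_bit is not None:
            addr &= ~(1 << self.stuck_addr_bit)
        return addr

    def write(self, addr, value):
        self.cells[self._decode(addr)] = value

    def read(self, addr):
        value = self.cells[self._decode(addr)]
        # A stuck data wire forces one bit of every word read to 0.
        if self.stuck_data_bit is not None:
            value &= ~(1 << self.stuck_data_bit)
        return value


def address_test(ram):
    """Write each address into its own cell, then read back and compare.

    Returns (address, observed_word) pairs for every mismatch.
    """
    size = 1 << ADDR_BITS
    addr = 0
    while addr < size:              # increment through the address space
        ram.write(addr, addr)
        addr += 1
    failures = []
    addr = size - 1
    while addr >= 0:                # decrement back down while comparing
        observed = ram.read(addr)
        if observed != addr:
            failures.append((addr, observed))
        addr -= 1
    return failures


def data_test(ram):
    """Walk a single 1 across the word with rotate; return mask of bad bits."""
    faulty_mask = 0
    pattern = 1
    for _ in range(WORD_BITS):
        ram.write(0, pattern)
        faulty_mask |= pattern ^ ram.read(0)
        pattern = rotate_left(pattern)
    return faulty_mask
```

In this toy model, XOR-ing each failing address with the word read back isolates the suspect address wire, and the mask returned by data_test marks the suspect data wire. Locating chip and chip-select faults, and ordering the tests by a diagnostic graph, are outside what this sketch attempts.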
RAMpage: Graceful Degradation Management for Memory Errors in Commodity Linux Servers
Abstract: Memory errors are a major source of reliability problems in current computers. Undetected errors may result in program termination, or, even worse, silent data corruption. Recent studies have shown that the frequency of permanent memory errors is an order of magnitude higher than previously assumed and regularly affects everyday operation. Often, neither additional circuitry to support hardware-based error detection nor downtime for performing hardware tests can be afforded. In the case of permanent memory errors, a system faces two challenges: detecting errors as early as possible and handling them while avoiding system downtime. To increase system reliability, we have developed RAMpage, an online memory testing infrastructure for commodity x86-64-based Linux servers, which is capable of efficiently detecting memory errors and which provides graceful degradation by withdrawing affected memory pages from further use. We describe the design and implementation of RAMpage and present results of an extensive qualitative as well as quantitative evaluation.
Keywords: Fault tolerance, DRAM chips, Operating systems
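The idea of testing memory online and withdrawing failing pages can be sketched as a user-space simulation. This is not RAMpage's implementation, which the abstract does not detail. The page size, the test patterns, and all names here (SimulatedMemory, page_is_healthy, scan_and_retire) are assumptions made for illustration, and permanent faults are modelled as bytes stuck at a fixed value.

```python
PAGE_SIZE = 16                          # assumed toy page size in bytes
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)     # assumed test patterns


class SimulatedMemory:
    """Byte-addressable toy memory with optional stuck-at cells."""

    def __init__(self, num_pages, stuck_cells=None):
        self.data = bytearray(num_pages * PAGE_SIZE)
        # Maps a byte offset to the value that cell is permanently stuck at.
        self.stuck_cells = dict(stuck_cells or {})

    def write(self, offset, value):
        # A stuck cell ignores the written value.
        self.data[offset] = self.stuck_cells.get(offset, value)

    def read(self, offset):
        return self.data[offset]


def page_is_healthy(memory, page):
    """Write each pattern to the whole page and verify it reads back."""
    start = page * PAGE_SIZE
    for pattern in PATTERNS:
        for offset in range(start, start + PAGE_SIZE):
            memory.write(offset, pattern)
        for offset in range(start, start + PAGE_SIZE):
            if memory.read(offset) != pattern:
                return False
    return True


def scan_and_retire(memory, free_pages):
    """Test every free page; return (still_usable, retired) page sets."""
    retired = {page for page in free_pages if not page_is_healthy(memory, page)}
    return free_pages - retired, retired
```

For example, with one byte in page 2 stuck at 0x00, scan_and_retire withdraws only that page and leaves the remaining pages available, which is the graceful-degradation behaviour the abstract describes at a high level.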
An Optimal Algorithm for Detecting Pattern Sensitive Faults in Semiconductor Random Access Memories
Random-access memory (RAM) testing to detect unrestricted pattern-sensitive faults (PSFs) is impractical due to the size of the memory checking sequence required. A formal model for restricted PSFs in RAMs, called adjacent-pattern interference faults (APIFs), is presented. A test algorithm that detects APIFs in RAMs using a minimum number of memory operations is then developed.