Reliability Measurement of Mass Storage System for Onboard Instrumentation by Choi, Minsu et al.
Missouri University of Science and Technology 
Scholars' Mine 
Electrical and Computer Engineering Faculty 
Research & Creative Works Electrical and Computer Engineering 
01 Dec 2005 
Reliability Measurement of Mass Storage System for Onboard 
Instrumentation 
Minsu Choi 




Recommended Citation 
M. Choi et al., "Reliability Measurement of Mass Storage System for Onboard Instrumentation," IEEE 
Transactions on Instrumentation and Measurement, vol. 54, no. 6, pp. 2297-2304, Institute of Electrical 
and Electronics Engineers (IEEE), Dec 2005. 
The definitive version is available at https://doi.org/10.1109/TIM.2005.858514 
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 54, NO. 6, DECEMBER 2005 2297
Reliability Measurement of Mass Storage System for
Onboard Instrumentation
Minsu Choi, Member, IEEE, Nohpill Park, Member, IEEE, Vincenzo Piuri, Fellow, IEEE, and
Fabrizio Lombardi, Member, IEEE
Abstract—Advances in spaceborne vehicular technology have
made possible the long-life duration of the mission in harsh cosmic
environments. Reliability and data integrity are the commonly
emphasized requirements of spaceborne solid-state mass storage
systems, because faults due to the harsh cosmic environments,
such as extreme radiation, can be experienced throughout the
mission. Acceptable dependability for these instruments has
been achieved by using redundancy and repair. Reconfiguration
(repair) of memory arrays using spare memory lines is the most
common technique for reliability enhancement of memories with
faults. Faulty cells in memory arrays are known to show spatial
locality. This physical phenomenon is referred to as fault clus-
tering. This paper initially investigates a quadrat-based fault model
for memory arrays under clustered faults to establish a reliable
foundation of measurement. Then, lifelong dependability of a
fault-tolerant spaceborne memory system with hierarchical active
redundancy, which consists of spare columns in each memory
module and redundant memory modules, is measured in terms of
the reliability (i.e., the conditional probability that the system per-
forms correctly throughout the mission) and mean-time-to-failure
(i.e., the expected time that a system will operate before it fails).
Finally, a minimal column redundancy search technique for the
fault-tolerant memory system is proposed and verified through a
series of parametric simulations. Thereby, the design and fabrication
of cost-effective and highly reliable fault-tolerant onboard mass
storage systems can be realized for dependable instrumentation.
Index Terms—Clustered faults, hierarchical active redundancy,
mean-time-to-failure (MTTF), memory reconfiguration (repair),
onboard mass storage system, quadrat-based fault model, redun-
dancy minimization, reliability.
I. INTRODUCTION
ADVANCES in spaceborne vehicular technology have made possible the long-life duration of missions in
harsh cosmic environments [10]–[12]. Reliability and data in-
tegrity are commonly emphasized requirements of spaceborne
solid-state mass storage systems, because faults due to the
harsh cosmic environments, such as extreme radiation, can
be experienced throughout the mission [10]–[12]. Acceptable
dependability for these solid-state mass storage instruments
has been commonly achieved by using redundancy and repair.
Manuscript received July 16, 2003; revised May 31, 2005.
M. Choi is with the Department of Electrical and Computer Engineering,
University of Missouri-Rolla, Rolla, MO 65409 USA (e-mail: choim@umr.edu).
N. Park is with the Department of Computer Science, Oklahoma State Uni-
versity, Stillwater, OK 74078-1053 USA (e-mail: npark@a.cs.okstate.edu).
V. Piuri is with the Department of Information Technologies, University of
Milan, 26013 Crema, Italy (e-mail: piuri@dti.unimi.it).
F. Lombardi is with the Department of Electrical and Computer En-
gineering, Northeastern University, Boston, MA 02115 USA (e-mail:
lombardi@ece.neu.edu).
Digital Object Identifier 10.1109/TIM.2005.858514
Reconfiguration (repair) of memory arrays using spare memory
lines is the most common technique for reliability enhancement
of memories with faults [1], [2], [4]–[7].
Mainly, three different fault mechanisms can occur in cosmic
environment: single event upsets (SEUs), total ionizing dose
(TID), and displacement damage (DD) [8], [9]. SEUs cause
transient faults (i.e., temporary faults) in random locations.
Normally, error detection and correction codes resolve SEU
issues. However, TID, mostly due to electrons and protons, can
result in device degradation and failure. DD is caused by
cumulative long-term nonionizing damage from protons,
electrons, and neutrons and can result in lattice defects: the
collision between an incoming particle and a lattice atom
displaces the atom from its original lattice position.
Interestingly, defects due to DD tend to form clusters [9]. Also,
continuous and cumulative TID accelerates this clustering of
defects. Thus, random fault models are usually exploited in
modeling SEUs, while cluster fault models are more suitable in
modeling permanent faults due to TID and DD.
To accurately model the faulty memory arrays, a proper
fault model must be introduced. It is well known that defects
in VLSI circuits tend to occur in clusters due to defects that
span multiple circuit elements [4], [5], [13], [15]. This physical
phenomenon is referred to as defect clustering. Attempts to
deal with defect clustering have focused mainly on the models
involving compounded Poisson distributions [4], [5], [15]. In
these models, the wafer is divided into multiple regions and
the distribution of the defects within each region is assumed to
follow the Poisson distribution. Models that use compounded
distributions are quadrat-based because they assume different
distributions in different regions (quadrats) of the wafer. For
quadrat-based models, defects occur s-dependently in the same
quadrat, while occurrences of defects in different quadrats are
s-independent (i.e., statistically independent) [4]. An alternative
approach to modeling defects is the center-satellite approach
[13] wherein there are separate distributions describing the
locations of cluster centers and the locations of defects within
a cluster. While some aspects of the center-satellite approach
make it quite well-suited for modeling defect clustering, the re-
sulting models can have more parameters, making the problem
of parameter estimation more complex than in quadrat-based
models [4]. The quadrat-based fault clustering model makes it
possible to accurately measure the reliability of memory arrays
with clustered faults without excessive complexity.
Solid-state mass memories for spaceborne applications may
experience excessive interference and damage due to harsh
cosmic environments [10]–[12]. In [10], error correcting codes
are used to cope with particle-induced bit errors—an extended
Reed-Solomon code circumvents soft errors while maintaining
a low-hardware implementation of coding and checking. The
memory system proposed in [11] is also designed for space-
borne applications and uses two levels of active redundancy,
where faulty columns in a memory module are repaired by
spare columns and malfunctioning modules are replaced by
module redundancy. Circuits for radiation-hardened memories
are also introduced in [12], where an orthogonal shuffle-type
of write-read arrays, error correction through weighted bidi-
rectional codes, and associative iterative repair circuits are
exploited to harden memories against interference and damage
due to excessive radiation.
The objective of this paper is to thoroughly measure the de-
pendability of a fault-tolerant onboard memory system under
fault clustering, and achieve more accurate prediction of relia-
bility (i.e., the conditional probability that the system performs
correctly throughout a time interval) and mean-time-to-failure
(MTTF, i.e., the expected time that a system will operate before
it fails). Thereby, more dependable mission-specific onboard
mass storage systems can be realized with respect to reliability
and MTTF while maintaining minimal redundancy.
The organization of this paper is as follows. In Section II,
review and preliminaries related to this research work will be
given. In Section III, a fault-tolerant memory system with two
levels of active redundancy proposed in [11] is reviewed. Re-
liability measurements for nonfault-tolerant and fault-tolerant
memory modules with a clustered fault model are discussed in
Sections IV and V, respectively. Reliability measurement of a
fault-tolerant onboard memory system consisting of such
fault-tolerant memory modules is investigated in Section VI.
Redundancy optimization under fault clustering is presented in
Section VII. Parametric simulation and its results are given in
Section VIII. Discussion and conclusions are in the final section.
II. REVIEW AND PRELIMINARIES OF CLUSTERED FAULT MODEL
In this section, the clustered memory fault model proposed by
Blough et al. [4], [5] will be briefly reviewed. The following
notations are used throughout this work.
• : number of rows and columns in an array, excluding
spares.
• : a memory array with rows and columns.
• : number of spare columns.
• : an element of .
• : number of rows and columns in a quadrat.
• : .
• : conditional probability of the given
condition.
• : (see below).
• : fault arrival rate (i.e., the conditional probability of
becoming a faulty cell in a unit time period) of a memory
cell within FP (i.e., fault-prone) quadrat.
• : fault arrival rate of a memory cell within FR (i.e.,
fault-resistant) quadrat (see below).
• : parameter of a Poisson random variable.
• : number of fault-free memory modules required for
the system to be functional.
• : number of total memory modules including spare
modules.
• : (i.e., number of redundant modules).
• : time.
• : reliability, the conditional probability that the system
performs correctly throughout .
The following assumptions are made in this work.
1) Quadrats can be one of two types: fault-prone quadrats
(denoted by FP) and fault-resistant quadrats (denoted by
FR). FPs are prone to have faults while FRs resist faults.
2) Within any quadrat, faults occur s-independently.
3) A quadrat is FP with probability , s-independently of
other quadrats.
4) Occurrence of faults in FP quadrats is determined by .
5) Occurrence of faults in FR quadrats is determined by .
6) and .
7) and .
8) is an integer.
For memory arrays, some of the most challenging problems
are the achievement of acceptable reliability and the minimiza-
tion of redundancy area (overhead) [1], [2], [4], [5], [14]. If
more redundancy area is provided, a highly reliable memory re-
configuration is possible, but more cost due to the additional
redundancy overhead is unavoidable. Thus, an appropriate bal-
ance of acceptable reliability and redundancy area is desirable
for high-reliability, low-cost manufacturing of memory arrays
for space applications.
In this paper, a memory array, , is partitioned
into quadrats containing cells. Assumptions given
above determine the occurrence patterns of faults within such
an array. Assumption (1) defines what it means for a quadrat
to be FP or FR: The array is divided into quadrats with a
high density of faulty cells and quadrats with a low density
of faulty cells. The a priori probability of a cell being faulty
is . However, if some of the neighbors of a
cell are known to be faulty, the probability of that cell’s being
faulty increases toward since it is more likely that the cell
lies in a FP quadrat. Fig. 1 illustrates the effect of the Clus-
tered-Fault model. Fig. 1(a) uses the quadrat-based fault model
with , , , and
to generate the faulty memory array. Fig. 1(b) uses a random
fault model with which was
chosen to make the expected number of faults the same in both
illustrations.
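The quadrat-based model described above can be illustrated with a minimal Python sketch. It assumes a static snapshot in which each cell is faulty with a fixed probability that depends only on the type of its quadrat, rather than the time-dependent arrival rates used later in the paper. The names `generate_clustered_faults`, `a_priori_fault_probability`, `p_fp`, `p_cell_fp`, and `p_cell_fr` are illustrative and are not taken from the original notation.

```python
import random


def a_priori_fault_probability(p_fp, p_cell_fp, p_cell_fr):
    """Probability that a cell is faulty when its quadrat type is unknown.

    Law of total probability over the two quadrat types (FP and FR).
    """
    return p_fp * p_cell_fp + (1.0 - p_fp) * p_cell_fr


def generate_clustered_faults(n, q, p_fp, p_cell_fp, p_cell_fr, rng):
    """Return an n x n grid of booleans (True marks a faulty cell).

    n          -- rows and columns of the array (assumed to be a multiple of q)
    q          -- rows and columns of one quadrat
    p_fp       -- probability that a quadrat is fault-prone (FP)
    p_cell_fp  -- probability that a cell inside an FP quadrat is faulty
    p_cell_fr  -- probability that a cell inside an FR quadrat is faulty
    rng        -- a random.Random instance, passed in for reproducibility
    """
    if n % q != 0:
        raise ValueError("n must be a multiple of q")
    k = n // q
    # Each quadrat is FP independently of all other quadrats.
    fault_prone = [[rng.random() < p_fp for _ in range(k)] for _ in range(k)]
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            # Within a quadrat, cells become faulty independently.
            p_cell = p_cell_fp if fault_prone[i // q][j // q] else p_cell_fr
            row.append(rng.random() < p_cell)
        grid.append(row)
    return grid


if __name__ == "__main__":
    grid = generate_clustered_faults(16, 4, 0.25, 0.4, 0.0, random.Random(1))
    for row in grid:
        print("".join("X" if cell else "." for cell in row))
```

Setting `p_cell_fp` equal to `p_cell_fr` removes the clustering and reduces the sketch to a random-fault model with the same per-cell probability everywhere.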
A faulty column (row) containing more than one faulty cell
is referred to as a connective faulty column (CFC) or connective
faulty row (CFR), respectively [2].
The memory array in Fig. 1(a) has eight CFCs and the one in
Fig. 1(b) has 14 CFCs. The importance of fault clustering is
shown in Fig. 2, wherein both of the faulty memory arrays in
Fig. 1 are repaired by spare columns. The memory array given
in Fig. 2(a) requires eight spare lines, while the memory array
given in Fig. 2(b) needs 14 spare lines.
The average number of faulty cells covered by one spare line
is referred to as the covering ratio. For example, the covering
ratio of the reconfiguration given in Fig. 2(a) is 2.25, while the
covering ratio of the reconfiguration given in Fig. 2(b) is 1.2857.
Intuitively, fewer spare lines are required to repair
a given memory array if faulty cells show spatial locality (i.e.,
fault clustering).
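The covering ratios quoted above can be checked with a short calculation. The sketch below assumes 18 faulty cells in both arrays. That count is not stated in the text; it is inferred from 2.25 × 8 = 18 and 1.2857 × 14 ≈ 18, which is consistent with both illustrations having the same expected number of faults. The function names are illustrative.

```python
def covering_ratio(num_faulty_cells, num_spare_lines):
    """Average number of faulty cells covered by one spare line."""
    if num_spare_lines <= 0:
        raise ValueError("at least one spare line is required")
    return num_faulty_cells / num_spare_lines


def faulty_columns(grid):
    """Indices of the columns that contain at least one faulty cell."""
    return [j for j in range(len(grid[0])) if any(row[j] for row in grid)]


if __name__ == "__main__":
    # 18 faulty cells is an inferred value, not one given in the paper.
    print(covering_ratio(18, 8))             # clustered case, eight spare lines
    print(round(covering_ratio(18, 14), 4))  # random case, 14 spare lines

    # Column-only repair of a small hand-made array: one spare per faulty column.
    grid = [
        [True, False, False],
        [True, False, True],
        [False, False, False],
    ]
    spares_needed = len(faulty_columns(grid))
    faults = sum(cell for row in grid for cell in row)
    print(spares_needed, covering_ratio(faults, spares_needed))
```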
CHOI et al.: RELIABILITY MEASUREMENT OF MASS STORAGE SYSTEM FOR ONBOARD INSTRUMENTATION 2299
Fig. 1. Fault arrays generated with (a) clustered-fault model and (b) random-fault model [4], [5].
Fig. 2. Repaired fault arrays generated with (a) clustered-fault model and (b) random-fault model.
III. FAULT-TOLERANT ONBOARD MEMORY ARCHITECTURE
In this section, a fault-tolerant memory system with two levels
of active redundancy proposed in [11] is reviewed. A total of
spare memory modules are provided in the system
and each spare module can substitute for any of the pri-
mary memory modules. Consequently, the memory system can
tolerate up to complete memory module failures before the
memory system becomes inoperable. Each module has
spare columns to repair faulty memory columns, if any. If a
memory cell fails, then the column containing that cell is elim-
inated from the system and replaced with a spare column. If
the memory module runs out of spare columns, then the en-
tire module is replaced with a spare module. This type of re-
dundancy is often referred to as two-level redundancy: The first
level being the spare columns and the second level being the
spare memory modules. Both forms of reconfiguration are ac-
tive techniques, and they require that the fault be detected, lo-
cated, and successfully removed from the system. Fig. 3 shows
the architecture of the fault-tolerant memory system for space-
borne applications. A detailed description of the architecture can
be found in [11]. In the following sections, the reliability of the
given fault-tolerant onboard memory system under fault clus-
tering will be modeled and measured.
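The repair policy of the two-level scheme can be sketched as a small state model. This is a simplified illustration, not the architecture of [11]: it assumes perfect fault detection and location, and it assumes that a spare module is fault-free with all of its spare columns unused at the moment it is switched in. The class names `Module` and `MemorySystem` are hypothetical.

```python
class Module:
    """One memory module with a pool of spare columns (first level)."""

    def __init__(self, spare_columns):
        self.spares_left = spare_columns
        self.failed = False

    def column_fault(self):
        """Handle one faulty column; return True if it could be repaired."""
        if self.failed:
            return False
        if self.spares_left > 0:
            self.spares_left -= 1   # replace the faulty column with a spare
            return True
        self.failed = True          # no spare column left
        return False


class MemorySystem:
    """Primary modules backed by spare modules (second level)."""

    def __init__(self, primary_modules, spare_modules, spare_columns):
        self.spare_columns = spare_columns
        self.active = [Module(spare_columns) for _ in range(primary_modules)]
        self.spare_modules_left = spare_modules
        self.operational = True

    def column_fault(self, module_index):
        """Handle a column fault in one active module; return True if the system survives."""
        if not self.operational:
            return False
        if self.active[module_index].column_fault():
            return True
        # First level exhausted: replace the whole module with a spare module.
        if self.spare_modules_left > 0:
            self.spare_modules_left -= 1
            self.active[module_index] = Module(self.spare_columns)
            return True
        self.operational = False
        return False


if __name__ == "__main__":
    system = MemorySystem(primary_modules=2, spare_modules=1, spare_columns=1)
    for _ in range(4):
        print(system.column_fault(0), system.spare_modules_left)
```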
IV. RELIABILITY MEASUREMENT OF NONFAULT-TOLERANT
MEMORY MODULE WITH CLUSTERED FAULTS MODEL
The given memory module has quadrats. For
example, memory module shown in Fig. 1 can be parameterized
as , and . A quadrat is FP with probability
, s-independently of other quadrats.
and , formally. Fault arrival rate in
a FP quadrat is and fault arrival rate in a
FR quadrat is . Then, reliability of a FP is
determined by the exponential failure law
and reliability of a FR is . Each column
of quadrats in the given memory array is referred to as
quadrat-column where each quadrat-column has quadrats.
Since and ,
the expected number of FP quadrats in a quadrat-column is
and the expected number of FR quadrats in a quadrat-column
is . Therefore, the reliability of a quadrat-column is
(1)
Finally, overall reliability of a nonfault-tolerant memory
module becomes
(2)
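The structure of this derivation can be sketched numerically. The code below assumes that a quadrat is fault-free only if all of its cells are fault-free, that cells fail independently under the exponential failure law, and that a quadrat-column contains its expected numbers of FP and FR quadrats. How the per-quadrat rates are scaled in the paper's own expressions (1) and (2) cannot be recovered from the text, so the scaling by the number of cells used here is an assumption, and all names are illustrative.

```python
import math


def quadrat_reliability(cell_rate, q, t):
    """Reliability of one q x q quadrat treated as a series system of cells."""
    return math.exp(-cell_rate * q * q * t)


def module_reliability_no_spares(n, q, p_fp, rate_fp, rate_fr, t):
    """Reliability of an n x n module without any spare columns.

    p_fp     -- probability that a quadrat is fault-prone
    rate_fp  -- per-cell fault arrival rate inside an FP quadrat
    rate_fr  -- per-cell fault arrival rate inside an FR quadrat
    """
    k = n // q  # quadrats per quadrat-column, and quadrat-columns per module
    r_fp = quadrat_reliability(rate_fp, q, t)
    r_fr = quadrat_reliability(rate_fr, q, t)
    # Expected k*p_fp FP quadrats and k*(1 - p_fp) FR quadrats per quadrat-column.
    r_quadrat_column = r_fp ** (k * p_fp) * r_fr ** (k * (1.0 - p_fp))
    # The module works only if all k quadrat-columns are fault-free.
    return r_quadrat_column ** k


if __name__ == "__main__":
    for week in (0, 1, 2, 4):
        print(week, module_reliability_no_spares(128, 4, 0.1, 1e-4, 1e-7, week))
```

Under these assumptions the result collapses to a single exponential with rate n²(p·rate_fp + (1 − p)·rate_fr), which is why a module without redundancy loses reliability quickly as the array grows.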
V. RELIABILITY MEASUREMENT OF FAULT-TOLERANT
MEMORY MODULE WITH CLUSTERED FAULT MODEL
The given memory array consists of approximately
FP quadrats and FR quadrats. For each FP quadrat,
is
(3)
Furthermore, the expected number of faulty columns in a FP
quadrat becomes
(4)
Since , , and , the FR quadrats
are essentially fault-free and the FP quadrats primarily dictate
the locations of the faulty cells. Hence, the expected number of
faulty columns in a quadrat-column is primarily determined by
the faulty columns in FP quadrats. Thus, the expected number
of faulty columns in a quadrat-column is
(5)
and the overall expected number of faulty columns in a memory
module becomes
(6)
Hence, the failure rate of a single column can be estimated as
(7)
and the reliability of a single column can be expressed as
(8)
Each memory module consists of memory columns and
spare memory columns and a quorum of out of the total of
columns are required to function for the memory module
to function. Thus, the reliability of the fault-tolerant memory
module with spare columns can be written as
(9)
Fig. 3. Architecture of fault-tolerant onboard memory system.
(10)
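The quorum argument above is the standard k-out-of-n reliability model, and a minimal sketch of it follows. It assumes that all columns, including the spares, are statistically identical and fail independently, and it takes the single-column reliability as a given input rather than deriving it from the clustered fault model. The function names are illustrative.

```python
import math


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n independent units work, each with reliability r."""
    return sum(
        math.comb(n, i) * r ** i * (1.0 - r) ** (n - i)
        for i in range(k, n + 1)
    )


def fault_tolerant_module_reliability(columns, spares, column_reliability):
    """Module works if at least `columns` of its `columns + spares` columns work."""
    return k_out_of_n_reliability(columns, columns + spares, column_reliability)


if __name__ == "__main__":
    # Effect of adding spare columns at a fixed (illustrative) column reliability.
    for spares in (0, 4, 8, 16, 32):
        print(spares, fault_tolerant_module_reliability(128, spares, 0.95))
```

With no spares the expression reduces to the series-system product of the column reliabilities, which gives a quick consistency check against the non-fault-tolerant case.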
VI. RELIABILITY MEASUREMENT OF THE FAULT-TOLERANT
ONBOARD MEMORY SYSTEM
The fault-tolerant onboard memory system consists of
fault-tolerant memory modules and of these memory mod-
ules must be functional, where column failures of each module
are repaired by spare columns and failed modules are replaced
by spare modules. The system uses two levels of hierarchical
active redundancy. Therefore, reliability of the given fault-tol-
erant memory system can be expressed as
(11)
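The hierarchical structure can be sketched by nesting one k-out-of-n model inside another: columns within a module at the first level, and modules within the system at the second level. The sketch assumes identical, independent columns and identical, independent modules, and the function and parameter names are illustrative. The value of 16 required modules in the usage example is an inference from a 256 K system built from 128 × 128 modules, not a figure stated in the text.

```python
import math


def at_least_k_of_n(k, n, r):
    """Probability that at least k of n independent units work, each with reliability r."""
    return sum(
        math.comb(n, i) * r ** i * (1.0 - r) ** (n - i)
        for i in range(k, n + 1)
    )


def system_reliability(required_modules, total_modules,
                       columns, spare_columns, column_reliability):
    """Two-level reliability: columns within a module, then modules within the system."""
    # First level: a module survives if enough of its columns survive.
    module_r = at_least_k_of_n(columns, columns + spare_columns, column_reliability)
    # Second level: the system survives if enough modules survive.
    return at_least_k_of_n(required_modules, total_modules, module_r)


if __name__ == "__main__":
    # 16 required modules is an inferred value; 0.95 is an illustrative column reliability.
    for redundant in (0, 2, 4, 6):
        print(redundant, system_reliability(16, 16 + redundant, 128, 32, 0.95))
```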
In addition to the reliability, the mean time to failure (MTTF)
is a useful measure of the quality of a system, since
the MTTF is the expected time that a system will operate before
the first failure occurs. Therefore, it can be used to measure the
system operation life without failure. The MTTF is defined in
terms of the reliability function $R(t)$ as
$$\mathrm{MTTF} = \int_{0}^{\infty} R(t)\,dt \tag{12}$$
which is valid for any reliability function that satisfies
$R(\infty) = 0$.
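When the reliability function has no convenient closed-form integral, as is typical for k-out-of-n expressions, the MTTF can be approximated by integrating the reliability function numerically over a finite horizon. The sketch below uses the trapezoidal rule. The horizon, the step count, and the failure rate in the usage example are illustrative choices, and the horizon must be long enough that the reliability is negligible beyond it.

```python
import math


def mttf(reliability, horizon, steps=100_000):
    """Trapezoidal approximation of the integral of reliability(t) over [0, horizon]."""
    dt = horizon / steps
    total = 0.5 * (reliability(0.0) + reliability(horizon))
    for i in range(1, steps):
        total += reliability(i * dt)
    return total * dt


if __name__ == "__main__":
    rate = 0.04  # illustrative failure rate per unit time

    def exponential_reliability(t):
        return math.exp(-rate * t)

    # For the exponential failure law the exact MTTF is 1 / rate.
    print(mttf(exponential_reliability, horizon=1000.0), 1.0 / rate)
```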
VII. REDUNDANCY OPTIMIZATION OF ONBOARD MEMORY
SYSTEM UNDER FAULT CLUSTERING
TABLE I
SUMMARY OF SIMULATION PARAMETERS
Fig. 4. Reliability of individual memory module.
As shown in Fig. 3, the onboard memory system overcomes
column-wise permanent faults and module-wise permanent
faults by exploiting a two-level hierarchical active redundancy
technique. Embedding a balanced amount of redundancy
makes it possible to achieve acceptable reliability of the memory
subsystem while minimizing the cost due to the redundancy.
Finding the minimum amount of redundancy which guarantees
desired reliability throughout the required system life time is
referred to as “redundancy optimization.” Let be the
system life time and be the minimum reliability which
must be guaranteed at the end of the ; therefore, the fol-
lowing inequality must hold:
(13)
Thus, the following equation can be solved with respect to
to find the minimum number of spares required to achieve the
given constraints (i.e., and ):
(14)
Since must be an integer value, spares must
be embedded.
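The search for the minimum number of spare columns can also be carried out directly over the integers, which avoids solving for a real-valued root and rounding up. The sketch below is such a linear search and is not the closed-form procedure of (14). It assumes a k-out-of-n module model with independent, identical columns that follow the exponential failure law, and it takes the column failure rate as a free input rather than deriving it from the clustered fault model. All names and numeric values are illustrative.

```python
import math


def module_reliability(columns, spares, column_rate, t):
    """Reliability at time t of a module needing `columns` of `columns + spares` columns."""
    r = math.exp(-column_rate * t)  # single-column reliability
    total = columns + spares
    return sum(
        math.comb(total, i) * r ** i * (1.0 - r) ** (total - i)
        for i in range(columns, total + 1)
    )


def minimal_spares(columns, column_rate, mission_time, r_min, max_spares=256):
    """Smallest spare count whose reliability at mission_time is at least r_min.

    Returns None if no count up to max_spares meets the requirement.
    """
    for spares in range(max_spares + 1):
        if module_reliability(columns, spares, column_rate, mission_time) >= r_min:
            return spares
    return None


if __name__ == "__main__":
    for target in (0.90, 0.95):
        print(target, minimal_spares(128, 1e-4, 520.0, target))
```

A linear search is sufficient here because the number of candidates is small; under this model adding a spare never lowers the reliability, so a binary search would also be valid for larger ranges.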
VIII. PARAMETRIC SIMULATION
The effect of the fault clustering and redundancy on the re-
liability of the fault-tolerant memory system will be studied
through numerical experiments in this section. Parameters used
in this simulation are summarized in Table I. The unit time in-
terval is a week.
In Fig. 4, the reliability enhancement ability of spare column
redundancy in a memory module is visualized. of the non-
fault-tolerant memory module calculated by the (2)
along with of the fault-tolerant memory module with different
numbers of spare columns from 4 to 32 calculated by the (9)
are shown. Note that and
, which means that reli-
ability of the nonfault-tolerant memory module becomes less
than 0.95 in two weeks. A typical requirement of a long-life
application is a probability of 0.95 or greater of being
operational at the end of a ten-year period. Thus, the
nonfault-tolerant memory module must be hardened to remain
operational over a longer mission time. In the case of 32 (i.e., 32 spare
columns are provided),
and where it successfully
endures more than 10 years of mission time while
maintaining at 25% (i.e., 32/128) redundancy
overhead.
Fig. 5. Reliability of memory system without module redundancy.
Fig. 6. Reliability of memory system with two redundant modules.
In Figs. 5–8, the reliability enhancement ability of module
redundancy in a memory system is investigated. Each figure
has five plots varying the number of spare columns from 0 to
32. For example, and means that the memory
system consists of total modules, including
spare modules, and each module has 32 spare columns. For
the memory system with no redundancy (i.e., and
), , which means it would
not maintain for even a week. If 32 spare columns
are applied, the memory system will maintain
for about 9 years since
and . The reliability of
the memory system can be further enhanced by exploiting
module redundancy. The memory system with and
satisfies the requirement of for more
than 12 years because and
. To meet for ten
years, the minimal overhead of these systems is with
and .
Fig. 7. Reliability of memory system with four redundant modules.
Fig. 8. Reliability of memory system with six redundant modules.
Another useful measurement of the fault-tolerant onboard
memory system quality is the mean-time-to-failure (MTTF), which
is the expected time that a system will operate before a failure
occurs. Fig. 9 shows MTTFs of three different memory module
configurations of 32 × 32, 64 × 64, and 128 × 128 with
different numbers of spare columns from 0 to 32. For example,
for a 32 × 32 memory module without
spare columns, which implies that the expected time that the
memory module will operate before a failure occurs is only
about 25 weeks while the same memory module with 32 spare
columns exceeds MTTF weeks.
MTTFs of the 256 K nonfault-tolerant (i.e., and )
and fault-tolerant memory systems with various combinations
of and (i.e., from 1 to 32 for and from 2 to 8 for ) are
given in Fig. 10. MTTF of the nonfault-tolerant memory system,
2.091197 weeks, can be extended to 10643.743596 weeks by
use of the two-level active redundancy of and
. Thus, the results shown in Figs. 9 and 10 confirm the
fault-tolerant ability of the two-level active redundancy technique.
Fig. 9. MTTFs of fault-tolerant memory modules.
Fig. 10. MTTFs of fault-tolerant 256 K memory system.
Finally, the number of spare columns can be optimized by
the proposed redundancy optimization technique. Optimized
redundancy guarantees desired reliability during the required
life time of the system at minimal redundancy overhead. In
Fig. 11, and 0, 2, 4, 8 and 100,
200, 300, 400, 500, 600, 700 are applied to the (14) to search
s. For example, to maintain reliability more than 0.90
for 700 weeks, must be 36. Fig. 12 also shows results
for .
As shown in this section, intelligent exploitation of the pro-
posed measurement estimation technique makes possible the
design and manufacture of balanced onboard memory systems
satisfying reliability and mission duration requirements while
maintaining minimal cost due to redundancy overhead.
Fig. 11. Redundancy optimization results for R = 0.90.
Fig. 12. Redundancy optimization results for R = 0.95.
IX. DISCUSSION AND CONCLUSION
As advances in spaceborne vehicular technology make
possible the long-life duration of missions in harsh cosmic
environments, reliability and data integrity become commonly
emphasized requirements of spaceborne solid-state mass
storage systems, because faults due to the harsh cosmic en-
vironments—such as extreme radiation—can be experienced
throughout the mission. In addition, it is well known that
faults show spatial locality on VLSI circuits. Thus, a reliability
measurement and estimation technique for the fault-tolerant
onboard memory system under fault clustering has been proposed
and validated through parametric simulation in
this paper. Thereby, intelligent exploitation of the proposed
measurement and estimation technique makes possible the
design and manufacture of balanced onboard memory systems
satisfying the reliability and mission duration requirements
while maintaining minimal redundancy. For example, according
to the simulation results given in this paper, the sample 256 K
fault-tolerant onboard memory system with 32 spare columns
in each memory module and 6 redundant modules has a
probability higher than 0.95 of being operational after a 10-year
period, and its mean-time-to-failure (MTTF) exceeds 232 years.
REFERENCES
[1] C. P. Low and H. W. Leong, “A new class of efficient algorithms for
reconfiguration of memory arrays,” IEEE Trans. Comput., vol. 45, no.
5, pp. 614–618, May 1996.
[2] N. Park and F. Lombardi, “Repair of memory arrays by cutting,” in
Proc. Memory Technology, Design and Testing, Proceedings Interna-
tional Workshop, Aug. 1998, pp. 124–130.
[3] K. Arndt and C. Narayan et al., “Reliability of laser activated metal
fuses in DRAMs,” in Proc. 24th IEEE/CPMT Electronics Manufac-
turing Technology Symp., Oct. 1999, pp. 389–394.
[4] D. M. Blough, “Performance evaluation of a reconfiguration-algorithm
for memory arrays containing clustered faults,” IEEE Trans. Rel., vol.
45, no. 2, pp. 274–284, Jun. 1996.
[5] D. M. Blough and A. Pelc, “A clustered failure model for the memory
array reconfiguration problem,” IEEE Trans. Comput., vol. 42, no. 5, pp.
518–528, May 1993.
[6] Y. Jeon, Y. Jun, and S. Kim, “Column redundancy scheme for multiple
I/O DRAM using mapping table,” Electron. Lett., vol. 36, no. 11, May
2000.
[7] R. J. McPartland and D. J. Loeper et al., “SRAM embedded memory
with low cost, FLASH EEPROM-switch-controlled redundancy,” in
Proc. IEEE Custom Integrated Circuits Conf., vol. 36, May 2000, pp.
287–289.
[8] E. J. Daly, A. Hilgers, G. Drolshagen, and H. D. R. Evans, “Environment
analysis: experience and trends,” presented at the ESA Symp. Environ-
ment Modeling for Space-Based Applications, Sep. 1996.
[9] NASA Jet Propulsion Lab., Pasadena, CA. Space Radiation Ef-
fects on Microelectronics (Online Document). [Online]. Available:
http://parts.jpl.nasa.gov/docs/Radcrs_Final.pdf
[10] T. Fichna, M. Gartner, F. Gliem, and F. Rombeck, “Fault-tolerance of
spaceborne semiconductor mass memories,” in Proc. 28th Annu. Int.
Symp. Fault-Tolerant Computing, 1998, pp. 408–413.
[11] K. A. Clark and B. W. Johnson, “A fault-tolerant solid-state memory for
spaceborne applications,” in Proc. Government Microelectronics Appli-
cations Conf., November 1992, pp. 441–444.
[12] T. P. Haraszti, R. P. Mento, N. E. Moyer, and W. M. Grant, “Novel
circuits for radiation hardened memories,” IEEE Trans. Nucl. Sci., vol.
39, no. 5, pp. 1341–1351, Oct. 1992.
[13] F. J. Meyer and D. K. Pradhan, “Modeling defect spatial distribution,”
IEEE Trans. Comput., vol. 38, no. 4, pp. 538–546, Apr. 1989.
[14] S. Y. Kuo and W. K. Fuchs, “Efficient spare allocation in reconfigurable
arrays,” IEEE Design and Test, pp. 24–31, Feb. 1987.
[15] C. H. Stapper, “On yield, fault distributions, and clustering of particles,”
IBM J. Res. Develop., vol. 30, no. 3, pp. 326–338, May 1986.
Minsu Choi (M’02) received the B.S., M.S., and
Ph.D. degrees in computer science from Oklahoma
State University, Stillwater, in 1995, 1998, and 2002
respectively.
He is currently with the Department of Electrical
and Computer Engineering, University of Missouri,
Rolla, as an Assistant Professor. His research mainly
focuses on computer architecture and VLSI, em-
bedded systems, fault tolerance, testing, quality
assurance, reliability modeling and analysis, config-
urable computing, parallel and distributed systems,
dependable instrumentation and measurement, and autonomic computing.
Dr. Choi is a member of Sigma Xi and Golden Key National Honor Society.
He was a recipient of the Don and Sheley Fisher Scholarship, in 2000, the Ko-
rean Consulate Honor Scholarship in 2001, and the Graduate Research Excel-
lence Award in 2002.
Nohpill Park (M’99) received the B.S. degree and
the M.S. degree in computer science from Seoul Na-
tional University, Seoul, Korea, in 1987 and 1989, re-
spectively, and the Ph.D. degree from the Department
of Computer Science, Texas A&M University, Col-
lege Station, in 1997.
He is currently an Assistant Professor in the Com-
puter Science Department at Oklahoma State Univer-
sity (OSU), Stillwater. His research interests include
digital systems design for reliability and reliable dig-
ital instrumentation and measurement with emphasis
on modeling, analysis, and optimization techniques.
Dr. Park received the OSU Outstanding Young Investigator Award from
Sigma Xi in 2003. He has been involved, as a technical program committee
member, in many international symposia, conferences, and workshops spon-
sored by professional organizations, and he has also actively served as a referee
for many archival journals.
Vincenzo Piuri (F’01) received the Ph.D. degree
in computer engineering from the Politecnico di
Milano, Milano, Italy, in 1989.
From 1992 to September 2000, he was an
Associate Professor of operating systems at the
Politecnico di Milano. Since October 2000, he has
been a Full Professor of Computer Engineering
at the University of Milano. He was a Visiting
Professor with the University of Texas, Austin, from
the summers of 1993 to 1999. His research interests
include distributed and parallel computing systems,
computer arithmetic, application-specific processing architectures, digital
signal processing architectures, fault tolerance, neural network architectures,
theory and industrial applications of neural techniques for identification,
prediction, control, and signal and image processing. His original results
have been published in more than 150 papers in book chapters, international
journals, and proceedings of international conferences.
Dr. Piuri is a member of the ACM, INNS, and AEI. He is an Associate Editor
of the IEEE TRANSACTIONS ON NEURAL NETWORKS and the Journal of Systems
Architecture. He is Vice President of Publications for the IEEE Instrumentation
and Measurement Society, Vice President of Members Activities of the IEEE
Neural Networks Society, and Member of the Administrative Committee of both
of the IEEE Instrumentation and Measurement Society and the IEEE Neural
Network Society.
Fabrizio Lombardi (M’82) received the B.Sc. de-
gree (Hons.) in electronic engineering from the Uni-
versity of Essex, Essex, U.K., in 1977, the M.S. de-
gree in microwaves and modern optics from the Mi-
crowave Research Unit, University College London,
London, U.K., in 1978, as well as the Diploma in mi-
crowave engineering in 1978, and the Ph.D. degree
from the University of London in 1982.
He is currently the Chairperson of the Department
of Electrical and Computer Engineering and holder
of the International Test Conference (ITC) Endowed
Professorship at Northeastern University, Boston, MA. He was a faculty
member at Texas Technical University, Lubbock, the University of Colorado,
Boulder, and Texas A&M University, College Station. He received the Visiting
Fellowship at the British Columbia Advanced System Institute, University of
Victoria, Victoria, BC, Canada, in 1988, the TEES Research Fellowship from
1991 to 1992 and again from 1997 to 1998, and the Halliburton Professorship
in 1995. He has been involved in organizing many international symposia,
conferences, and workshops sponsored by organizations such as NATO and the
IEEE, as well as Guest Editor in archival journals and magazines. His research
interests are fault tolerant computing, testing and design of digital systems,
configurable computing, defect tolerance, and CAD VLSI. He has extensively
published in these areas and has edited six books.
Dr. Lombardi received the International Research Award from the Ministry
of Science and Education of Japan for the period from 1993 to 1999, the
1985/1986 Research Initiation Award from the IEEE/Engineering Foundation,
a Silver Quill Award from Motorola in 1996, and was a Distinguished Visitor of the
IEEE Computer Society for the period from 1990 to 1993. He was an Associate
Editor of the IEEE TRANSACTIONS ON COMPUTERS from 1996 to 2000 and
currently, he is the Associate Editor-in-Chief of the IEEE TRANSACTIONS ON
COMPUTERS.
