I. Introduction
Future and current computer designs require memory and storage subsystems that are dense, large, long-living, energy-efficient, and low-cost. To achieve this goal, new non-volatile memory technologies have been proposed and applied. Two of the most promising technologies are Flash and Phase-Change Memory (PCM) [1]. These technologies have several desirable properties, including non-volatility, low power consumption, and good scalability. However, they share a significant problem: limited write endurance. Flash and PCM devices wear out relatively quickly, which necessitates techniques to prolong device lifetime.
Flash is a mature block-oriented technology that has achieved sufficiently low cost and high density to be competitive as solid-state storage [2]. Flash devices suffer from reliability problems: a cell can be permanently damaged after 10,000 to 100,000 erasures, temporary errors can occur because of a large number of reads to a cell, and writes can affect the stored value of adjacent cells. The limit on the number of erase operations is severe and requires wear-leveling techniques that spread erasures and writes among the cells to achieve a good device lifetime. Flash memory manufacturers do not publish the specific cell lifetime distributions. Instead, they may provide only the number of erasures that cells are expected to sustain (10,000 to 100,000). The experimental results from [2] show a correlation between failures and number of erasures. A large body of work has been done to improve lifetime and performance in Flash memories [3], [4].
PCM has been proposed for storage and main memory. Indeed, recent arguments that PCM can replace DRAM for main memory are quite compelling [5], [6], [7], [8], [9]. PCM is projected to have better scalability than DRAM and lower energy consumption. Unlike Flash, it is a bit-addressable technology and has read performance equal to DRAM [10]. However, similar to Flash, PCM suffers from device wear-out. PCM devices may fail after only $10^7$ to $10^9$ write cycles, which is insufficient for main memory. Because this problem is critical to the use of PCM in main memory, many endurance and failure-handling approaches have been suggested [5], [6], [7], [8], [11].
In developing and evaluating lifetime management techniques for both Flash and PCM, different models of endurance have been assumed. In this paper, we develop analysis techniques to study how endurance models affect device lifetime. We study memory lifetime with four common distributions of cell endurance: constant, linear, normal and bimodal. The constant distribution assumes all cells have the same endurance, the linear distribution assumes that cells' endurance varies from a low to a high value, the normal distribution uses a Gaussian distribution of the cell endurance, and the bimodal distribution assumes that cells are either weak (low lifetime) or strong (high lifetime). When using these distributions, we assume that a wear-leveling mechanism exists that wears all pages at the same rate.
A wear-out prone memory subsystem is usually dimensioned with more capacity than required, either by employing additional devices or by using larger devices. This excess capacity is used to increase memory lifetime. The actual strategies and mechanisms that manage failure in wear-prone memory can be modeled as one of two general types of algorithms. First, in physical capacity degradation (PCD) algorithms, all the physical memory is initially used and, as cells are damaged, the memory size is reduced. Second, physical sparing (PS) replaces damaged cells with operational spare cells from the excess capacity. Note that although wear-leveling is used in both techniques, PCD uses wear-leveling on all non-damaged pages, while PS uses it only on the pages that are currently active and excludes the pages that are still spares.
Naively, PCD seems like it would always increase device lifetime because it spreads writes over the entire physical memory (with wear leveling), which makes the memory live longer. However, we found that, for specific amounts of excess capacity, PS can actually yield a much longer lifetime than PCD. We develop the analysis to understand and evaluate the situations in which one endurance management algorithm is preferred over the other. The selection of the appropriate endurance algorithm depends only on two ratios: excess capacity to the total number of cells, and the number of weak cells to the total number of cells. Moreover, some wear-leveling algorithms, such as start-gap [6], are not suitable for PCD since they require a constant physical memory size. Our analysis can determine the amount of lifetime lost for PCD and PS under the different endurance models.
The contributions of this paper are:
1) to develop and show how different models of cell endurance distribution affect device lifetime for physical capacity degradation and physical sparing,
2) to identify which endurance technique achieves a higher lifetime according to the specific model and device characteristics, and
3) to propose engineering constraints to achieve the highest lifetime when device parameters can be changed.
II. Analysis of endurance algorithms with process variation
We assume the memory subsystem has $M$ physical pages, each of fixed size. Akin to many storage systems, the set of pages is partitioned in two areas: a visible addressable space of $L$ pages and a reserved excess capacity area of $N$ pages, as seen in Figure 1. In other words, $L = M - N$, and we assume the memory subsystem fails when there are fewer than $L$ undamaged physical pages. Note that the addressable user space has the same physical size $L$ independent of the endurance algorithm (PCD or PS) and therefore the performance of user processes is the same for both endurance algorithms.
The $N$ excess capacity pages are used by each endurance algorithm in a different way. PCD uses the excess capacity to distribute the writes among all $M$ pages, while PS uses the excess capacity to replace damaged pages in the addressable space $L$. Because it is uncommon to reserve more than 50% of the memory as excess capacity, we assume that $N < M/2$, that is, $M > 2N$.
Fig. 1. Model of the endurance algorithms
We also assume that there is an ideal wear-leveling scheme that distributes the writes among all pages used (the wear-leveling algorithm is ideal in the sense that all pages have had the same number of writes at any instant of time). To distribute writes among the pages, a mapping is necessary from addressable locations to physical pages, as shown in Figure 1 . This mapping depends on the implementation of the endurance algorithm [5] , [7] , [8] .
The lifetime of a set of M pages will be measured as the number of writes that the memory supports until its capacity is reduced below L. Table I summarizes the key parameters and terminology used in the analysis of the different endurance techniques and distributions.
In each subsection below, we compute the lifetime of a memory for four different models of process variation, namely constant, bimodal, linear and normal, for each of the two endurance algorithms (PCD and PS). We present the models by order of complexity.
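The lifetime definition above can be evaluated numerically for any list of per-page endurances. The following is a minimal sketch, assuming the ideal wear-leveling described in this section; the function names `pcd_lifetime` and `ps_lifetime` are illustrative and do not come from the paper, and the PS routine assumes that a spare starts wearing only when it is put into service.

```python
import heapq

def pcd_lifetime(endurance, n_spare):
    """Total writes absorbed under PCD before fewer than M - N pages remain."""
    e = sorted(endurance)
    m = len(e)
    total, prev = 0, 0
    # Phase k: the k weakest pages are already retired, so m - k pages share
    # the writes until the next-weakest page reaches its endurance.
    for k in range(n_spare + 1):
        total += (e[k] - prev) * (m - k)
        prev = e[k]
    return total

def ps_lifetime(active, spares):
    """Total writes absorbed under PS before an active page dies with no spare left."""
    deaths = list(active)          # per-page write count at which each slot fails
    heapq.heapify(deaths)
    for spare_endurance in spares:
        t = heapq.heappop(deaths)  # earliest failure is repaired with a fresh spare
        heapq.heappush(deaths, t + spare_endurance)
    return deaths[0] * len(active)  # the next failure cannot be repaired

# Constant endurance: M = 10 pages, N = 2 spares, 5 writes per page.
constant = [5] * 10
pcd_constant = pcd_lifetime(constant, 2)
ps_constant = ps_lifetime(constant[:8], constant[8:])

# Bimodal endurance: K = 2 weak pages (10 writes), 8 strong pages (1000 writes).
bimodal = [10, 10] + [1000] * 8
pcd_bimodal = pcd_lifetime(bimodal, 2)
ps_bimodal = ps_lifetime(bimodal[:8], bimodal[8:])
```

Both routines take only a few lines because, under ideal wear-leveling, every page in use has received the same number of writes, so a single per-page write counter is enough to order the failures.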
A. The Constant Model
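In the constant model, we assume that all pages have the same endurance, $W_D$. In the PCD case, all $M$ pages receive the same number of writes due to the underlying wear-leveling algorithm. This implies that the lifetime for a memory with $M$ pages is:
$$L_{PCD}(M,N) = W_D \cdot M$$
In the PS case, only the $M-N$ addressable pages are written. They all reach $W_D$ at the same time, and the $N < M-N$ spares cannot replace all of them, so the lifetime is:
$$L_{PS}(M,N) = W_D \cdot (M-N)$$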
B. The Bimodal Endurance Model
The bimodal model assumes that pages are divided into two sets: a low endurance set with $K$ pages that has an endurance of $W_{DL}$ per page and a high endurance set with $M-K$ pages that has an endurance of $W_{DH}$ per page. It is assumed that $W_{DL} \ll W_{DH}$. In the PCD case, the $K$ weak pages are damaged and retired after $W_{DL} \cdot M$ writes (since the memory starts with $M$ pages). There are two cases to consider:
• $K \le N$: For $N$ pages to be damaged, some strong pages have to be damaged since only $K$ weak pages exist. This allows an additional $(W_{DH} - W_{DL}) \cdot (M-K)$ writes to be applied to the memory. Thus,
$$L_{PCD}(M,N) = W_{DL} \cdot M + (W_{DH} - W_{DL}) \cdot (M-K)$$
• $K > N$: After $W_{DL} \cdot M$ writes, the $K$ weak pages will be damaged and the number of available pages will be less than $M-N$, thus leading to system failure. Hence,
$$L_{PCD}(M,N) = W_{DL} \cdot M \tag{2}$$
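The two PCD cases can be written as a small function. This is a minimal sketch that only restates the case analysis above; the name `pcd_bimodal_lifetime` and its argument names are illustrative.

```python
def pcd_bimodal_lifetime(m, n, k, w_dl, w_dh):
    """Lifetime in writes of PCD with m pages, n excess pages, and k weak pages."""
    if k <= n:
        # All weak pages retire after w_dl * m writes; the m - k strong pages
        # then absorb the remaining (w_dh - w_dl) writes each.
        return w_dl * m + (w_dh - w_dl) * (m - k)
    # More weak pages than excess capacity: failure when the weak pages die.
    return w_dl * m

few_weak = pcd_bimodal_lifetime(m=10, n=2, k=2, w_dl=10, w_dh=1000)
many_weak = pcd_bimodal_lifetime(m=10, n=2, k=3, w_dl=10, w_dh=1000)
```

With these hypothetical numbers, one additional weak page moves the device from the first case to the second and removes the contribution of the strong pages entirely.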
In the PS case, there are three cases to be analyzed:
• $K \le N$: When all the $K$ weak pages are in the addressable $M-N$ pages, they will be replaced when they reach their endurance limit ($W_{DL}$). The lifetime will be determined by the endurance of the strong pages, $W_{DH}$, and the size of the addressable space ($M-N$). Hence,
$$L_{PS}(M,N) = W_{DH} \cdot (M-N)$$
• $K > 2N$: When there are at least $N+1$ weak pages in the addressable space, these weak pages will be damaged after $W_{DL} \cdot (M-N)$ writes and the number of spare pages available, $N$, will not be enough to replace them. Thus,
$$L_{PS}(M,N) = W_{DL} \cdot (M-N)$$
Fig. 2. Spare and addressable page endurance distribution in a bimodal model.
• $N < K \le 2N$: Among the $K$ weak pages, let $i$ be the number of weak spare pages and $K-i$ be the number of weak used pages, as shown in Figure 2. The $K-i$ weak used pages located in the addressable space will be damaged first since they have the lowest endurance and are being constantly used. Two cases may occur:
- $K-i > N$: If $i < K-N$, the number of weak used pages is larger than the number of spare pages. This implies that when the weak used pages are damaged, the memory will fail since there will not be enough spares to replace all the damaged pages. The lifetime is then
$$L_{PS}(M,N) = W_{DL} \cdot (M-N)$$
- $K-i \le N$: If $i \ge K-N$, there are at most $N$ weak used pages. These weak used pages will be damaged after $W_{DL} \cdot (M-N)$ writes to the addressable space and will be replaced by spares, extending the lifetime of the addressable space by an additional $W_{DL} \cdot (M-N)$ writes. Given that $K > N$, then $i > 0$, and some of the newly commissioned spares will be weak and will be damaged after an additional $W_{DL} \cdot (M-N)$ writes. If $K-i = N$, then no more spares will be available and the memory will fail at this point (here we assume that $2 \cdot W_{DL} \ll W_{DH}$, that is, the weak spare pages will be damaged before the strong pages). However, if $K-i < N$, some spares will still be available after the first replacement round and the lifetime will be extended by an additional $W_{DL} \cdot (M-N)$ writes with any new replacement round for which spares will still be available. The lifetime is then
$$L_{PS}(M,N) \ge 2\,W_{DL} \cdot (M-N),$$
with the equality achieved when only the first replacement round is possible.
Summarizing, the lifetime in the region delimited by $N < K \le 2N$ is:
$$L_{PS}(M,N) \begin{cases} = W_{DL} \cdot (M-N) & \text{if } i < K-N \\ \ge 2\,W_{DL} \cdot (M-N) & \text{if } i \ge K-N \end{cases} \tag{5}$$
It is important to note that the lifetimes for PCD and PS depend on $K/N$, the ratio of weak pages to spare pages.
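The PS case analysis can be collected into one function of $M$, $N$, $K$ and the number $i$ of weak pages that start among the spares. This is a minimal sketch with illustrative names; in the region $N < K \le 2N$ with $i \ge K-N$ it returns only the lower bound reached after a single replacement round.

```python
def ps_bimodal_lifetime(m, n, k, i, w_dl, w_dh):
    """Lifetime in writes of PS; a lower bound when more than one round is possible."""
    addressable = m - n
    if k <= n:
        # Every weak page can be replaced; the strong pages set the lifetime.
        return w_dh * addressable
    if k > 2 * n or i < k - n:
        # More weak pages in use than spares: failure when they wear out.
        return w_dl * addressable
    # At most n weak pages in use: at least one full replacement round.
    return 2 * w_dl * addressable

strong_limited = ps_bimodal_lifetime(m=10, n=2, k=2, i=0, w_dl=10, w_dh=1000)
too_many_weak = ps_bimodal_lifetime(m=10, n=2, k=5, i=0, w_dl=10, w_dh=1000)
unlucky_split = ps_bimodal_lifetime(m=10, n=3, k=4, i=0, w_dl=10, w_dh=1000)
lucky_split = ps_bimodal_lifetime(m=10, n=3, k=4, i=1, w_dl=10, w_dh=1000)
```

The last two calls differ only in where one weak page happens to sit, which is why the comparison with PCD in this region becomes a question of probability.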
Comparing the lifetime of PCD and PS:
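PCD has a higher lifetime than PS when (a) the number of weak cells is larger than twice the number of spares (i.e., when $K > 2N$), where PCD sustains $W_{DL} \cdot M$ writes against $W_{DL} \cdot (M-N)$ for PS, or (b) the number of weak cells does not exceed the number of spares (i.e., when $K \le N$), where PCD spreads the writes over $M-K \ge M-N$ strong pages. In the case of $N < K \le 2N$, the result depends on Equation (5) since the lifetime of PCD, as clear from Equation (2), is constant in this region.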
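The lifetime of the PS algorithm depends on the number of weak pages in the spare area. The probability of having $x$ weak pages among the spares, $Pr(i = x)$, follows a hypergeometric distribution. PS will have a higher lifetime than PCD when $i \ge K-N$, which occurs with probability:
$$Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big) = \sum_{x=K-N}^{N} \frac{\binom{K}{x}\binom{M-K}{N-x}}{\binom{M}{N}} \tag{6}$$
The average of this distribution is $E[i] = \frac{NK}{M}$. Given that $L_{PS}(M,N) > L_{PCD}(M,N)$ only if $i \ge K-N$, the probability can be estimated by:
$$Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big) \ge 0.5 \quad \text{if} \quad E[i] \ge K-N \tag{7}$$
By changing the condition in Equation (7) to $\frac{NK}{M} \ge K-N$, Equation (6) can be rewritten as a function of the ratios of $K$, $M$, and $N$, showing when $L_{PS} > L_{PCD}$ more than 50% of the time:
$$Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big) \ge 0.5 \quad \text{if} \quad \frac{N}{M} \ge 1 - \frac{1}{K/N} \tag{8}$$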
Note that Equation (8) defines the curve shown in Figure 3. In the region above the curve, it is more probable that the lifetime of PS will be higher than the lifetime of PCD (points 1, 2, 3, 5 and 8 in Figure 3). The region below the curve will have the opposite behavior, with higher lifetime when using PCD (points 4, 6 and 7 in Figure 3).
Fig. 3. Regions where PS increases lifetime
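The probability that enough weak pages fall in the spare area can be computed exactly from the hypergeometric distribution and compared with the rule based on the average. The sketch below assumes that the $N$ spares are a uniformly random subset of the $M$ pages; the function names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb

def prob_ps_beats_pcd(m, n, k):
    """Pr(i >= k - n): at least k - n of the k weak pages land among n spares."""
    lo = max(k - n, 0)
    hi = min(k, n)
    favourable = sum(comb(k, x) * comb(m - k, n - x) for x in range(lo, hi + 1))
    return Fraction(favourable, comb(m, n))

def mean_rule(m, n, k):
    """True when the expected number of weak spares, n*k/m, is at least k - n."""
    return n * k >= (k - n) * m

# Hypothetical small device: M = 10, N = 3, K = 4 (so N < K <= 2N).
p_small = prob_ps_beats_pcd(10, 3, 4)
rule_small = mean_rule(10, 3, 4)

# Hypothetical device with few spares: M = 100, N = 10, K = 19.
p_sparse = prob_ps_beats_pcd(100, 10, 19)
rule_sparse = mean_rule(100, 10, 19)
```

In both hypothetical configurations the rule based on the average agrees with the side of 0.5 on which the exact probability falls.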
Note that for the hypergeometric distribution the standard deviation, $\sigma$, is at most the square root of the average, that is, $\sigma \le \sqrt{NK/M}$. The hypergeometric distribution can be approximated by a normal distribution, for which the probability that $i < E[i] - 2\sigma$ is less than 2.5%, as is the probability that $i > E[i] + 2\sigma$. This allows Equation (8) to be rewritten as:
$$Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big) \ge 0.975 \quad \text{if} \quad E[i] - 2\sigma \ge K-N.$$
Given that $\sigma \le \sqrt{NK/M}$, we can conclude that
$$Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big) \ge 0.975 \quad \text{if} \quad \frac{NK}{M} - 2\sqrt{\frac{NK}{M}} \ge K-N. \tag{9}$$
Equation (9) indicates that there is a very narrow band (of width about $4\sigma$, which grows only as $\sqrt{N}$ for fixed ratios $K/N$ and $N/M$) in which $Pr\big(L_{PS}(M,N) > L_{PCD}(M,N)\big)$ is between 0.025 and 0.975. Above this band the lifetime of PS is longer than the lifetime of PCD (with very high probability) and below this band, the lifetime of PS is shorter than the lifetime of PCD (with a very high probability).
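The two-standard-deviation argument can be turned into a three-way classification. This is a minimal sketch, assuming the normal approximation and the bound $\sigma \le \sqrt{NK/M}$; the function name and the labels it returns are illustrative. The sample values use $M$ = 8 million pages, the size simulated in Section IV, with hypothetical choices of $N$ and $K$.

```python
from math import sqrt

def classify_region(m, n, k):
    """Classify a device with N < K <= 2N as 'PS', 'PCD', or 'band'."""
    mean = n * k / m            # expected number of weak pages among the spares
    sigma_bound = sqrt(mean)    # upper bound on the standard deviation
    threshold = k - n           # PS outlives PCD when i >= threshold
    if mean - 2 * sigma_bound >= threshold:
        return "PS"             # PS longer with probability of at least 0.975
    if mean + 2 * sigma_bound < threshold:
        return "PCD"            # PS longer with probability below 0.025
    return "band"               # narrow zone where either outcome is plausible

M = 8_000_000
above = classify_region(M, 800_000, 880_000)    # N/M = 0.1, K/N = 1.1
below = classify_region(M, 800_000, 1_200_000)  # N/M = 0.1, K/N = 1.5
edge = classify_region(M, 800_000, 888_889)     # very close to the curve
```

For a memory of this size the band spans only a few hundred pages on either side of the curve, so almost every practical configuration falls clearly on one side.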
To examine the validity of our analysis, we plot in Figure 4 the exact probability given by Equation (6) [...]
C. The Linear Endurance Model
[...] The impact of larger values of N on lifetime is minimal because it increases J but also decreases the number of pages in the addressable space. Large values of R would in theory benefit PS, but the probability that a page needs to be replaced more than once before the memory subsystem dies increases, reducing the number of addressable pages that can be replaced. The end result is a lower lifetime than predicted by the maximum j. Using Monte Carlo simulations, we determined that the lifetimes of PCD and PS for the linear endurance model are very similar, with a maximum difference of 3%.
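A Monte Carlo comparison for a linear endurance profile can be set up along the following lines. This is only a sketch under stated assumptions, not the simulation used for the result above: the parameter `r` is assumed to be the ratio of the highest to the lowest endurance, pages are placed at random, wear-leveling is ideal, and a spare starts wearing only when it enters service. All names are illustrative.

```python
import heapq
import random

def linear_endurance(m, w_min, r):
    """Endurance rising linearly from w_min to r * w_min across m pages."""
    step = (r - 1) * w_min / (m - 1)
    return [w_min + step * j for j in range(m)]

def pcd_lifetime(endurance, n):
    """Writes absorbed until the (n + 1)-th weakest page is retired."""
    e = sorted(endurance)
    total, prev = 0.0, 0.0
    for k in range(n + 1):
        total += (e[k] - prev) * (len(e) - k)
        prev = e[k]
    return total

def ps_lifetime(active, spares):
    """Writes absorbed until an active page fails with no spare left."""
    deaths = list(active)
    heapq.heapify(deaths)
    for spare_endurance in spares:
        t = heapq.heappop(deaths)
        heapq.heappush(deaths, t + spare_endurance)
    return deaths[0] * len(active)

def compare(m, n, w_min, r, trials, seed=0):
    """Return (PCD lifetime, mean PS lifetime over random page placements)."""
    rng = random.Random(seed)
    pages = linear_endurance(m, w_min, r)
    pcd = pcd_lifetime(pages, n)  # does not depend on where pages are placed
    ps_total = 0.0
    for _ in range(trials):
        rng.shuffle(pages)
        ps_total += ps_lifetime(pages[:m - n], pages[m - n:])
    return pcd, ps_total / trials

flat_pcd, flat_ps = compare(m=100, n=10, w_min=100.0, r=1, trials=5)
lin_pcd, lin_ps = compare(m=1000, n=100, w_min=100.0, r=10, trials=20)
```

With `r = 1` the profile degenerates to the constant model, which gives a quick sanity check of the two routines before larger values of `r` are explored.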
D. The Normal Endurance Model
The normal model can be approximated by a constant model if the standard deviation is small compared to the average. The linear model is a good approximation if the normal model has a large standard deviation and the number of spares is small (we are only interested in the pages with a low lifetime). In cases where the approximations are not applicable, numerical simulations show that PCD and PS present a very similar lifetime under the normal endurance model, with PCD always winning by less than 5%.
III. Uses of the lifetime models
In Section II, we showed that system lifetime under a specific endurance model can vary depending on the algorithm, the percentage of spares and the percentage of weak pages. The analysis and results above can be used in a design tool to obtain the longest lifetime of the memory.
The decision to use PS or PCD depends primarily on the lifetime distribution of the pages. For the constant model, PCD will result in the highest lifetime. In the linear and normal models the difference is small, allowing the decision to be made based on other design constraints.
The bimodal model is less straightforward and two cases should be examined. The first case is when a manufacturer produces a device with size M and wants to sell it with a size L. [...] In a second case, a manufacturer produces a device of capacity M with K bad cells and, by choosing N and the endurance algorithm, can market it as a device of size $L = M - N$. Using the objective of highest lifetime with the largest addressable space, the selection of the endurance algorithm and of N are coupled. Figure 5 shows how the lifetime changes when N is varied in a system with a fixed M and K. The maximum lifetime is achieved when N is at least K, and we can choose $N = K$ to minimize resources, because the lifetime does not change for any value of N that satisfies this relation. It is possible that, for marketing reasons, restrictions will prohibit the use of $N \ge K$ (e.g., when an already advertised device size has to be sold but the devices were produced with too many weak pages). At times, the manufacturer knows M, N, and L but K is variable. This can happen, for example, due to wafer-to-wafer process variation or fabrication process improvements. The selection of the algorithm to operate the memory will be based on the region the memory subsystem falls in, as described above.
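The region-based selection for the bimodal model can be sketched as a small helper. This is a minimal illustration, assuming randomly placed weak pages and the comparison of Section II-B, in which PCD lasts longer when $K \le N$ or $K > 2N$ and the average number of weak spares, $NK/M$, decides the remaining region; the function name is illustrative.

```python
def choose_algorithm(m, n, k):
    """Pick 'PCD' or 'PS' for m pages, n excess pages, and k weak pages."""
    if not 0 < 2 * n < m:
        raise ValueError("the analysis assumes 0 < N < M/2")
    if k <= n or k > 2 * n:
        return "PCD"
    # N < K <= 2N: PS is more likely to last longer when N*K/M >= K - N.
    return "PS" if n * k >= (k - n) * m else "PCD"

few_weak = choose_algorithm(1000, 100, 50)
many_weak = choose_algorithm(1000, 100, 250)
large_excess = choose_algorithm(1000, 400, 500)
small_excess = choose_algorithm(1000, 100, 190)
```

Integer arithmetic is used for the comparison so that devices lying exactly on the curve are classified without rounding error.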
All the previous results assume that the weak pages are distributed randomly. If it is possible to identify the weak pages, then those pages should be used as spares and a PS algorithm should be used. This result is valid for the region $1 < K/N \le 2$. The use of weak pages as spares guarantees that the constraint of Equation (5) is valid, increasing the lifetime of the PS algorithm.
In a constant model, PCD lifetime is independent of the number of spares but PS lifetime actually decreases with a larger number of spares. In this model, reserving space as excess capacity is unnecessary and all memory should be exported to the system. A linear model with a low value of R behaves in a similar fashion to the constant model and the same recommendations apply. A linear model with larger R will have a similar lifetime with either PCD or PS so either can be used. The amount of excess capacity reserved determines the expected memory lifetime, since a larger excess capacity will also increase lifetime while reducing the addressable space.
IV. Experimental Results
We used simulations to validate the analytical model and the accuracy of the approximations. A subset of benchmarks from PARSEC, SPECCPU2006 and SPECjbb2005 was executed in Simics, and main memory traces were recorded and used as input to the PCM memory simulator in [11]. The simulated system had M = 8 million physical pages and followed the bimodal model, with K and N set per experiment so that specific values of $N/M$ and $K/N$ are obtained. The endurance of a weak page was 100 times smaller than the endurance of a strong page. We also experimented with the linear and normal models, with the highest endurance varying from 10 to 100 times the smallest endurance. The wear-leveling algorithm was based on the one in [11]. Figure 6 shows the measured lifetime, normalized to the lifetime for PCD for point 1, for each of the algorithms. The smaller PCD lifetime for points 4, 5, 6, 7 and 8 is caused by the specific wear-leveling algorithm used [11] being non-ideal. A larger $K/N$ for the same $N/M$ creates more weak pages and increases the probability that a page with a higher number of writes resides in a weak page, hence reducing the memory lifetime. The analysis of Section II-B predicts that points 1, 2, 3, 5 and 8 have a higher lifetime with PS and points 4, 6 and 7 have a higher lifetime with PCD. The results from Figure 6 agree with the model. The gain in lifetime for points 1, 2, 5 and 8 agrees with the gain predicted in Section II-B for PS with one replacement round. The higher result of point 3 is caused by the system being able to replace damaged pages twice, that is, in two replacement rounds. These results also validate the approximation of Equation (9).
For the normal and linear models, we found that the lifetime was 3 to 5% higher for the PCD algorithm than for PS.
V. Related Work
PCM main memory has been proposed in [5], [6], [8]. In [6], the start-gap wear-leveling mechanism is proposed. The conclusions of Section II-B would apply to this mechanism since it follows the assumptions used in Section II. This wear-leveling mechanism has a very low cost but does not allow, in its present form, for the removal of damaged pages from the addressable space. The ability to retire pages or use spares is essential to achieving a long lifetime under a bimodal model. In [8], a 3D main memory implemented with PCM is presented, which uses row-level rotation and segment swapping (SS) for wear-leveling. The SS used in [8] uses one counter per page and thus may be too costly, especially for small pages and large memories. In [5], another 3D main memory that uses PCM is introduced. It uses byte shifting at row level (BS) and SS as wear-leveling mechanisms. These techniques would follow the analysis of Section II. In [11], a low-cost table-based wear-leveling algorithm is introduced. It is simple to modify the proposed architecture to support retiring pages and a PS algorithm.
VI. Conclusion
It has been especially challenging for the computer architecture community to compute the lifetime of wear-prone memory devices, such as PCM and Flash, given that there are several possible models for cell endurance and endurance management. In this paper, we developed the analysis and models that can be used to compute the memory lifetime for different endurance distributions. We consider two general endurance strategies based on physical sparing (PS) and physical capacity degradation (PCD). We show that under constant endurance, PCD achieves a higher lifetime. However, if there are weak and strong cells, we show the relationship of the endurance algorithm (PS or PCD) to the amount of excess capacity, and how this relationship affects the lifetime of the memory. Our models and analysis can be used to implement a tool to determine the best endurance management algorithm and the minimum amount of excess capacity needed for a long memory lifetime. This choice is an important one since it can extend the memory lifetime by up to two times under specific conditions.
