This paper summarizes our work on experimentally characterizing, mitigating, and recovering data retention errors in multi-level cell (MLC) NAND flash memory, which was published in HPCA 2015 [10], and examines the work's significance and future potential. Retention errors, caused by charge leakage over time, are the dominant source of flash memory errors. Understanding, characterizing, and reducing retention errors can significantly improve NAND flash memory reliability and endurance. In this work, we first characterize, with real 2Y-nm MLC NAND flash chips, how the threshold voltage distribution of flash memory changes with different retention ages, i.e., the length of time since a flash cell was programmed. We observe from our characterization results that 1) the optimal read reference voltage of a flash cell, using which the data can be read with the lowest raw bit error rate (RBER), systematically changes with its retention age, and 2) different regions of flash memory can have different retention ages, and hence different optimal read reference voltages.
Introduction
Over the past decade, the capacity of NAND flash memory has been increasing continuously, as a result of aggressive process scaling and the advent of multi-level cell (MLC) technology. This trend has enabled NAND flash memory to replace spinning disks for a wide range of applications, from high-performance clusters and large-scale data centers to consumer PCs, laptops, and mobile devices. Unfortunately, as flash density increases, flash memory cells become more vulnerable to various types of device- and circuit-level noise [3, 4, 5, 8, 86], e.g., retention noise [3, 4, 5, 8, 12, 13, 70, 80, 91], read disturb noise [3, 4, 5, 6, 15, 91], cell-to-cell program interference noise [3, 4, 5, 6, 8, 11, 14], and program/erase (P/E) cycling noise [3, 4, 5, 8, 9]. These are sources of errors that can significantly degrade NAND flash memory reliability.
A traditional solution to overcome flash errors, regardless of their source, is to use error-correcting codes (ECC) [3, 4, 5, 30, 66]. By storing a certain amount of redundant bits per unit data, ECC can detect and correct a limited number of raw bit errors. With the help of ECC, flash memory can hide these errors from the users until the number of errors per unit data exceeds the correction capability of the ECC. Flash memory designers have been relying on stronger ECC to compensate for lifetime reductions due to technology scaling. However, stronger ECC, which has higher capacity and implementation overhead, has diminishing returns on the amount of flash lifetime improvement [12, 13]. As such, we intend to look for more efficient ways of reducing flash errors.
Retention errors, caused by charge leakage over time after a flash cell is programmed, are the dominant source of flash memory errors [3, 4, 5, 8, 12, 13, 109]. The amount of charge stored in a flash memory cell determines the threshold voltage level of the cell, which in turn represents the logical data value stored in the cell. As illustrated in Figure 1, the threshold voltage (V_th) range of a 2-bit MLC NAND flash cell is divided into four regions by three read reference voltages, V_a, V_b, and V_c. The region in which the threshold voltage of a flash cell falls represents the cell's current state, which can be ER (or erased), P1, P2, or P3. Each state decodes into a 2-bit value that is stored in the flash cell (e.g., 11, 10, 00, or 01). Figure 1 is reproduced from [15].
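The mapping from threshold voltage to stored bits can be sketched in a few lines of Python. This is only an illustration: the numeric values of the three read reference voltages and the names `read_state` and `read_bits` are hypothetical (real voltages are vendor-proprietary), and the bit encoding follows the example encoding quoted above.

```python
from bisect import bisect_right

# Hypothetical normalized read reference voltages (V_a < V_b < V_c).
V_A, V_B, V_C = 100, 230, 360

STATES = ("ER", "P1", "P2", "P3")
# Example 2-bit encoding of each state, as in the text above.
BITS = {"ER": "11", "P1": "10", "P2": "00", "P3": "01"}


def read_state(v_th, refs=(V_A, V_B, V_C)):
    """Return the state whose voltage region contains v_th."""
    # bisect_right counts how many reference voltages are <= v_th,
    # which is exactly the index of the region the cell falls into.
    return STATES[bisect_right(refs, v_th)]


def read_bits(v_th, refs=(V_A, V_B, V_C)):
    """Decode a threshold voltage into the 2-bit value it represents."""
    return BITS[read_state(v_th, refs)]


# A cell programmed just above V_c reads as P3; after leaking a little
# charge it falls below V_c and is misread as P2 (a retention error).
programmed = read_bits(365)
after_leakage = read_bits(355)
```

In this toy setting, a shift of only ten normalized voltage units is enough to change the value read from the cell, which is why a fixed read reference voltage becomes a poor fit as data ages.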
As the manufacturing process technology for NAND flash memory scales to smaller feature sizes, the capacitance of a flash cell, and the number of electrons stored in the cell, decrease. State-of-the-art MLC flash memory cells can store only ∼100 electrons [10, 81]. Gaining or losing several electrons in a flash cell can significantly change the cell's voltage level and eventually alter the state of the cell. In addition, MLC technology reduces the size of the threshold voltage window [9], i.e., the span of threshold voltage values corresponding to each logical state, in order to store more states in a single cell. This also makes the state of a cell more likely to shift due to charge loss caused by retention noise. As such, for NAND flash memory, retention errors are one of the most important limiting factors of more aggressive process scaling and MLC technology.
One way to reduce retention errors is to periodically read, correct, and reprogram the flash memory before the number of errors accumulated over time exceeds the error correction capability of the ECC, i.e., the maximum number of raw bit errors tolerable by the ECC [12, 13, 69, 90]. However, this flash correct and refresh (FCR) technique has two major limitations: 1) FCR uses a fixed read reference voltage to read data under different retention ages, which is suboptimal, and 2) FCR requires the flash controller to be consistently powered on so that errors can be corrected, limiting its applicability to enterprise deployments that have always-on power supplies.
In our HPCA 2015 paper [10], we pursue a better understanding of retention error behavior to improve NAND flash reliability and lifetime, and find better (and complementary) ways to mitigate flash retention errors. We characterize 1) the distortion of threshold voltage distribution at different retention ages, i.e., the idle time after the data is programmed to the flash memory, for state-of-the-art 2Y-nm (20- to 24-nm) NAND flash memory chips at room temperature, and 2) the retention age distribution of flash pages using disk traces taken from real workloads. Our key findings are:
1. Due to threshold voltage distribution distortion, the optimal read reference voltages of flash cells, at which the minimum raw bit error rate (RBER) can be achieved, systematically shift to lower values as retention age increases.
2. Pages within the same flash block (the granularity at which flash memory can be erased) tend to have similar retention ages and hence similar optimal read reference voltages, whereas pages across different flash blocks have different optimal read reference voltages.
Based on our findings, we propose two mechanisms to mitigate data retention errors. First, we propose an online technique called Retention Optimized Reading (ROR). The key idea of ROR is to reduce the raw bit error rate by adaptively learning and applying the optimal read reference voltage for each flash block. Our evaluations show that ROR extends flash lifetime by 64% and reduces average error correction latency by 10.1%, with only 768 KB storage overhead for a 512 GB flash-based SSD. Second, we propose an offline error recovery technique called Retention Failure Recovery (RFR). The key idea of RFR is to identify fast- and slow-leaking cells and probabilistically determine the original value of an erroneous cell based on its leakage-speed property and its threshold voltage. Our evaluations show that RFR can effectively reduce the average raw bit error rate (RBER) by 50%, essentially doubling the error correction capability of flash memory, and allowing for the recovery of data otherwise uncorrectable by ECC.
We first summarize our experimental characterization results (Section 2), and then introduce the Retention Optimized Reading (Section 3) and Retention Failure Recovery (Section 4) techniques.
Flash Data Retention Characterization
We use an FPGA-based flash memory testing platform to characterize real state-of-the-art 2Y-nm NAND flash memory chips [7, 8]. As absolute threshold voltage values are proprietary information to NAND flash vendors, we present our results using normalized voltages, where the nominal maximum value of V_th is equal to 512 in our normalized scale, and where 0 represents GND. Section 3.1 of our HPCA 2015 paper [10] provides a detailed description of our experimental methodology.
Figure 2 shows the threshold voltage distribution of flash memory at different retention ages at 8,000 P/E cycles. We make two observations from the figure. First, for the higher-voltage states (P2 and P3), the threshold voltage distributions systematically shift to lower voltage values as the retention age grows. Second, the distribution of each state becomes wider with higher retention age, and the distributions of states at higher voltage (e.g., P3) shift faster than those of states at lower voltage (e.g., P1).
We find that these changes due to retention leakage have an impact on the optimal read reference voltage (OPT), which is the read reference voltage between two states that minimizes the raw bit error rate (RBER). Figure 3 shows the optimal read reference voltage over retention age. We make two observations from the figure. First, Figure 3a shows a slightly decreasing trend of P1-P2 OPT (the optimal read reference voltage used to distinguish between cells in the P1 state and cells in the P2 state) over retention age. Second, we observe that P2-P3 OPT decreases much more rapidly with retention age than P1-P2 OPT, as shown in Figure 3b.
Characterized threshold voltage distribution
As the distributions continue to shift with growing retention age, the OPT for one retention age will be different from the OPT for a different age, suggesting that a dynamically changing OPT is ideal. To quantify how the choice of read reference voltage affects RBER, we apply the optimal read reference voltages (OPTs) determined for {0, 1, 2, 6, 9, 17, 21, 28}-day retention ages to read 28-day-old data. Figure 4 shows the RBER obtained when reading the 28-day-old data with different OPTs, normalized to the RBER obtained when reading the data with the 28-day OPT. This figure shows that picking the correct value of OPT for each retention age results in a lower RBER. In turn, this allows us to extend the lifetime (i.e., the number of P/E cycles the device can tolerate) of the NAND flash memory if we always use the correct OPT based on the retention age of the data that is being read. In Section 3 of our HPCA 2015 paper [10], we perform several other experimental characterization studies of flash memory data retention behavior, and make the following eight new findings:
1. The threshold voltage distributions of the P2 and P3 states systematically shift to lower voltages with retention age.
2. The threshold voltage distribution of each state becomes wider with higher retention age.
3. The threshold voltage distribution of a higher-voltage state shifts faster than that of a lower-voltage state.
4. Both P1-P2 OPT and P2-P3 OPT become smaller over retention age.
5. P2-P3 OPT changes more significantly over retention age than P1-P2 OPT.
6. The optimal read reference voltage corresponding to one retention age is suboptimal (i.e., it results in a higher RBER) for reading data with a different retention age.
7. RBER becomes lower when the retention age for which the used read reference voltage is optimized becomes closer to the actual retention age of the data.
8. The lifetime of NAND flash memory can be extended if the optimal read reference voltage that corresponds to the retention age of the data is used.
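To make findings 4 and 6 concrete, the sketch below models two adjacent states as Gaussian threshold voltage distributions and sweeps the read reference voltage to find the value that minimizes RBER. The Gaussian shape, the means and standard deviations, and the function names are all assumptions chosen for illustration; they are not measurements from our chips.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def rber(v_ref, low, high):
    """Fraction of misread cells for two equally populated adjacent states.

    low and high are (mean, std) tuples of the lower- and higher-voltage state.
    """
    low_err = 1.0 - gaussian_cdf(v_ref, *low)  # low-state cells read above v_ref
    high_err = gaussian_cdf(v_ref, *high)      # high-state cells read below v_ref
    return 0.5 * (low_err + high_err)


def find_opt(low, high, v_min=0, v_max=512):
    """Sweep integer read reference voltages and return the RBER-minimizing one."""
    return min(range(v_min, v_max + 1), key=lambda v: rber(v, low, high))


# Hypothetical P2 and P3 distributions on the normalized 0..512 scale.
fresh = ((300.0, 10.0), (400.0, 10.0))  # right after programming
aged = ((295.0, 12.0), (380.0, 14.0))   # wider and shifted lower; P3 moves more

opt_fresh = find_opt(*fresh)
opt_aged = find_opt(*aged)

# Reading aged data with the voltage tuned for fresh data costs extra errors.
penalty = rber(opt_fresh, *aged) / rber(opt_aged, *aged)
```

In this toy model `opt_aged` is lower than `opt_fresh`, and `penalty` is greater than one, mirroring the qualitative trend of Figure 4: a read reference voltage optimized for the wrong retention age yields a higher RBER.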
Retention Optimized Reading (ROR)
To optimize flash memory performance without compromising flash lifetime, we first break down and analyze the components of the flash memory read latency. A read operation typically makes use of the read-retry operation [3, 4, 5, 9, 28], which performs multiple data read attempts using different read reference voltages until the read succeeds (i.e., ECC successfully corrects all of the raw bit errors). A detailed analysis of the flash memory read latency can be found in Section 4.1 of our HPCA 2015 paper [10]. We summarize the following four observations from this analysis:
• The read latency of NAND flash memory can be reduced by minimizing the number of reads performed during read-retry.
• The number of reads can be reduced by using a closer-to-optimal starting read reference voltage in the read-retry process.
• The optimal read reference voltages of pages in the same block are close, while those of pages in different blocks are not always close.
• The optimal read reference voltage of pages in a block is upper-bounded by the optimal read reference voltage of the page in the block that was programmed last.
Based on these observations, we propose Retention Optimized Reading (ROR), which consists of two components: 1) an online pre-optimization algorithm that learns the starting read reference voltage for each block, and 2) an improved read-retry technique that uses the starting read reference voltage to reduce the search space of OPT (i.e., the optimal read reference voltage) for the block. Section 4.2 of our HPCA 2015 paper [10] provides a detailed description of the components of ROR. We briefly summarize the components below.
The first component, the online pre-optimization algorithm, is triggered both daily and after power-on for each block. This algorithm consists of the following four steps:
• Step 1: The flash controller first reads the highest-numbered page in a flash block (e.g., page 255 in a block that contains 256 pages), with any default read reference voltage V_default, and attempts to correct the errors in the raw data read from the page. We chose the highest-numbered page in the block because it is programmed last, and, thus, has the lowest retention age and the highest OPT value within the block. Hence, we use the OPT for the highest-numbered page as a tight upper bound of OPT for the block. Next, we record the number of raw bit errors as the current lowest error count (N_ERR), and the applied read reference voltage as V_ref = V_default. If we cannot find the error count (i.e., the error is uncorrectable), we record the maximum number of errors correctable by ECC as N_ERR.
• Step 4: After Step 3, the most recently used value of V_ref is the optimal read reference voltage for the highest-numbered page. Thus, we record this voltage as the upper bound of the optimal read reference voltages for the block.
The second component is an improved read-retry technique that takes advantage of the recorded starting read reference voltage. During a normal read operation, the flash controller first attempts to read the data with the recorded starting read reference voltage. Then, since the recorded starting read reference voltage is the upper bound of the OPTs within the block, we iteratively decrease the read reference voltage until the read operation succeeds. Note that the starting read reference voltages are accessed frequently (on each read operation) by the flash controller, so we store them in the SSD's DRAM buffer to allow fast access.
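The two components can be sketched as follows. This is a simplified illustration rather than the exact algorithm of our HPCA 2015 paper: the intermediate search steps are not reproduced in this summary, so `pre_optimize` assumes a plain downward search that stops once the error count starts to rise. The callback interfaces (`count_errors`, `try_read`), the voltage step, and the ECC limit of 40 errors are hypothetical.

```python
def pre_optimize(count_errors, v_default, v_step=1, v_min=0, ecc_limit=40):
    """Learn a starting read reference voltage for one block.

    count_errors(v_ref) reads the highest-numbered page of the block and
    returns its raw bit error count, or None if ECC cannot correct the page.
    """
    def errors_at(v):
        n = count_errors(v)
        # Step 1 rule: an uncorrectable read is recorded as the ECC limit.
        return ecc_limit if n is None else n

    v_ref, n_err = v_default, errors_at(v_default)
    # Assumed search: keep lowering V_ref while the error count does not grow.
    while v_ref - v_step >= v_min:
        n_new = errors_at(v_ref - v_step)
        if n_new > n_err:
            break
        v_ref, n_err = v_ref - v_step, n_new
    return v_ref  # recorded as the upper bound of OPT for the block


def read_with_retry(try_read, v_start, v_step=1, v_min=0):
    """Read a page, starting at the recorded voltage and stepping downward.

    try_read(v_ref) returns the corrected data, or None if ECC fails.
    Returns (data, number_of_reads_used).
    """
    v_ref, reads = v_start, 0
    while v_ref >= v_min:
        reads += 1
        data = try_read(v_ref)
        if data is not None:
            return data, reads
        v_ref -= v_step
    return None, reads


# Toy block: the last-programmed page has OPT = 300, an older page has OPT = 296.
last_page = lambda v: abs(v - 300) if abs(v - 300) <= 40 else None
older_page = lambda v: "data" if abs(v - 296) <= 2 else None

v_start = pre_optimize(last_page, v_default=350)
data, reads = read_with_retry(older_page, v_start)
```

In this toy setup the older page is read successfully after three attempts when starting from the learned voltage, whereas starting the same downward search from the default voltage of 350 would need dozens of reads. Searching in only one direction is possible because the recorded voltage is an upper bound on the OPTs within the block.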
Our key evaluation results show that ROR achieves the same flash lifetime improvements as naive read-retry, which has a read latency that is 64% longer than a baseline that uses a fixed read reference voltage. Due to a reduction in raw bit error rate, ROR reduces the ECC decoding latency by 10.1% on average compared to the baseline, which is equivalent to a 2.4% reduction in overall flash read latency. Compared with the original read-retry technique, which we explain in detail in Section 4.1 of our HPCA 2015 paper [10], ROR reduces the read-retry operation count by 70.4%, and thus reduces the overall read latency by the same fraction. This reduction is due to two reasons: 1) ROR starts the read-retry process at a close-to-optimal starting read reference voltage that is estimated and recorded daily and upon power-on; and 2) ROR approaches OPT in a known, informed direction from this starting read reference voltage.
Section 4.4 of our HPCA 2015 paper [10] provides more results from our evaluation of ROR. In our HPCA 2015 paper, we show that the performance overhead of ROR's periodically triggered online pre-optimization algorithm can be largely hidden by executing the algorithm only when the SSD is idle, or in the background at a lower priority. This is because, even considering the worst-case scenario, we obtain an estimated pre-optimization latency of 3, 15, and 23 seconds for flash memory with a 1-day-, 7-day-, and 30-day-equivalent retention age, respectively. Since the flash pages within a block are programmed at similar times, the optimal read reference voltages of these pages are close. We therefore store one byte per block for each starting read reference voltage learned for the ER-P1 OPT, the P1-P2 OPT, and the P2-P3 OPT. We also show that ROR requires only 768 KB of storage overhead to store the entire read reference voltage table for an assumed 512 GB flash drive.
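The storage overhead figure can be checked with simple arithmetic. The sketch below assumes binary units and a block size of 2 MiB (for example, 256 pages of 8 KiB each); the block size is an inference that makes the numbers consistent, not a value stated in this summary, and `ror_table_bytes` is a hypothetical helper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def ror_table_bytes(capacity_bytes, block_bytes, bytes_per_block=3):
    """Size of a table holding one byte per block for each of the three OPTs."""
    num_blocks = capacity_bytes // block_bytes
    return num_blocks * bytes_per_block


# Assumed geometry: 512 GiB drive built from 2 MiB blocks.
overhead = ror_table_bytes(512 * GIB, 2 * MIB)
print(overhead // KIB, "KiB")  # 768 KiB
```

Under these assumptions the table covers 262,144 blocks at three bytes each, which is small enough to keep in the SSD's DRAM buffer.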
Retention Failure Recovery (RFR)
Even with ROR, the retention error rate will eventually exceed the ECC limit as retention age keeps increasing. At that point, some reads will have more raw errors than can be corrected by ECC, preventing the drive from returning the data to the user. Traditionally, this would be the point of data loss and thus the end of flash memory lifetime.
We show that retention failure is unavoidable under various circumstances. In Section 5.1 of our HPCA 2015 paper [10], we show that high temperature can significantly increase the number of retention errors in a short period of time, which leads to unexpected data loss. For example, if the required refresh period of the flash memory is one week at room temperature, uncorrectable errors may start to accumulate after a mere 36 minutes at high temperature. We also discuss why completely avoiding such retention failure is unrealistic. No previous technique can prevent data loss after retention failure happens.
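The scale of this temperature effect can be illustrated with the Arrhenius model, which is commonly used to relate retention time at two temperatures. The sketch below is a rough check under assumed parameters: an activation energy of 1.1 eV, a room temperature of 25 °C, and a high temperature of 70 °C. None of these values are stated in this section, and `arrhenius_acceleration` is a hypothetical helper.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev=1.1):
    """Factor by which retention loss speeds up at t_stress_c versus t_use_c."""
    t_use = t_use_c + 273.15        # convert Celsius to Kelvin
    t_stress = t_stress_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_stress)
    return math.exp(exponent)


week_minutes = 7 * 24 * 60
factor = arrhenius_acceleration(25.0, 70.0)  # assumed temperatures
equivalent_minutes = week_minutes / factor
```

With these assumed parameters the acceleration factor is a few hundred, so one week of room-temperature retention corresponds to well under an hour at the higher temperature, close to the 36-minute figure quoted above.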
We introduce Retention Failure Recovery (RFR), which enables us to recover data from a failed flash page offline after the number of errors in the page exceeds the total number of errors that ECC can correct. Due to process variation, different flash cells on the same chip can have different charge leakage speeds. We describe a technique to classify fast- and slow-leaking cells in just a few days, which enables RFR to probabilistically infer the original value stored in each flash cell. Our evaluation, based on data from real NAND flash chips, shows that RFR can reduce raw bit error rate by 50%, and thus ECC can then be used to recover a majority of the data in pages with retention failures.
Figure 5 shows how the threshold voltage of a retention-prone cell (i.e., a fast-leaking cell, labeled P in the figure) decreases over time (i.e., the cell shifts to the left) due to retention leakage, while the threshold voltage of a retention-resistant cell (i.e., a slow-leaking cell, labeled R in the figure) does not change significantly over time. Retention Failure Recovery (RFR) uses this classification of retention-prone versus retention-resistant cells to correct the data from the failed page without the assistance of ECC. Without loss of generality, let us assume that we are studying susceptible cells near the intersection of two threshold voltage distributions X and Y, where Y contains higher voltages than X. Figure 5 highlights the region of cells considered susceptible by RFR using a box, labeled Susceptible. A susceptible cell within the box that is retention-prone likely belongs to distribution Y, as a retention-prone cell shifts rapidly to a lower voltage (see the circled cell labeled P within the susceptible region in the figure). A retention-resistant cell in the same susceptible region likely belongs to distribution X (see the boxed cell labeled R within the susceptible region in the figure). RFR identifies fast- vs. slow-leaking cells, and uses selective bit flipping to correct retention failures, thus reducing RBER. With reduced raw bit errors, the read data may be reconstructed by ECC with a higher probability. RFR consists of the following four offline steps, which are triggered when an uncorrectable error is found:
• Step 1: Identify data with a retention failure. Once the flash controller fails to read a flash page, a retention failure is identified on that page.
• Step 2: Identify susceptible cells using three read operations. We read the failed page using three read reference voltages: OPT (the optimal read reference voltage) minus some margin δ (Step 2.1), OPT (Step 2.2), and OPT plus δ (Step 2.3). The value of δ is large enough to include the entire Susceptible region shown in Figure 5. Figure 6a illustrates the identification of susceptible (i.e., risky) cells, which are denoted as type 1
We evaluate RFR on data programmed to random values that has a 28-day-equivalent retention age. In Step 3, we introduce an additional 12 days' worth of equivalent retention age. Figure 7 shows the resulting raw bit error rate of RFR over a range of P/E cycles (compared to that of the baseline). This figure shows that RFR reduces the RBER by 50%, averaged across all evaluated wearout levels (P/E cycles). Thus, we expect the number of raw bit errors to be halved, increasing the chances that these errors are correctable by ECC.
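The classification idea behind RFR can be sketched as follows. This is a simplified model rather than the procedure from our HPCA 2015 paper: it assumes that the controller can observe each cell's threshold voltage before and after the additional retention period (a real controller only sees coarse information from reads at a few reference voltages), and `leak_threshold`, the shift above which a cell counts as retention-prone, is a hypothetical parameter.

```python
def rfr_classify(v_before, v_after, opt, delta, leak_threshold):
    """Infer whether each cell originally belonged to state X (lower) or Y (higher).

    v_before: threshold voltages when the retention failure is detected.
    v_after:  threshold voltages after an additional retention period.
    """
    inferred = []
    for vb, va in zip(v_before, v_after):
        if vb < opt - delta:
            inferred.append("X")   # safely below the susceptible region
        elif vb >= opt + delta:
            inferred.append("Y")   # safely above the susceptible region
        else:
            # Susceptible cell: a fast-leaking (retention-prone) cell has
            # probably drifted down from Y, while a slow-leaking cell has
            # probably been in X all along.
            retention_prone = (vb - va) >= leak_threshold
            inferred.append("Y" if retention_prone else "X")
    return inferred


# Toy page of six cells around OPT = 350 with a susceptible margin of 10.
v_before = [320, 345, 345, 355, 355, 380]
v_after = [319, 340, 344, 349, 354, 378]

plain_read = ["X" if v < 350 else "Y" for v in v_before]
recovered = rfr_classify(v_before, v_after, opt=350, delta=10, leak_threshold=3)
```

In this example the plain read at OPT and the RFR-style classification disagree on two susceptible cells: one fast-leaking cell just below OPT is reassigned to Y, and one slow-leaking cell just above OPT is reassigned to X. This corresponds to the selective bit flipping described above; because the inference is probabilistic, ECC is still applied to the result.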
Related Work
To our knowledge, our HPCA 2015 paper [10] is the first to 1) experimentally characterize and comprehensively analyze how the threshold voltage distribution changes over different retention ages, as well as the implication of these changes on the read reference voltage and lifetime, using real state-of-the-art 2Y-nm MLC NAND flash memory chips; and 2) propose two novel techniques to mitigate the impact of retention age online and to recover from data loss by exploiting retention behavior. In this section, we briefly discuss various related works.
Works on NAND Flash Memory
NAND Flash Memory Retention Error Characterization. Multiple prior works characterize NAND flash data retention, but mainly in terms of RBER [8, 12, 13, 80]. These works show that 1) retention errors are the dominant errors in NAND flash memory, and 2) the retention error rate increases with the retention age and the P/E cycle count. Papandreou et al. [91] characterize the retention effect on threshold voltage distributions under high-temperature baking, and find that the distribution shifts to lower voltage over retention time, and so does the optimal read reference voltage. In contrast, our HPCA 2015 paper [10] characterizes data retention under room temperature, which is closer to how NAND flash memories are typically used [10]. Our recent work characterizes how data retention affects the threshold voltage distribution for TLC NAND flash memory [3, 4, 5], making similar findings as our HPCA 2015 paper [10].
NAND Flash Memory Error Characterization. Prior works study different types of NAND flash memory errors in MLC, planar NAND flash memory, including P/E cycling errors [9, 71, 80, 91, 93], programming errors [6, 71, 93], cell-to-cell program interference errors [9, 11, 14], retention errors [9, 10, 12, 80, 91], and read disturb errors [15, 80, 91]. These works characterize how raw bit error rate and threshold voltage distributions change with various types of noise. Our recent work characterizes the same types of errors in planar TLC NAND flash memory and has similar findings [3, 4, 5]. Thus, we believe that most of the findings on MLC NAND flash memory can be generalized to any type of planar NAND flash memory device (e.g., SLC, MLC, TLC, or QLC). Recent works [77, 89, 101] have also studied SSD errors in the field, and have shown the system-level implications of these errors in large-scale data centers. Unlike our characterization, these in-the-field studies do not have access to the underlying NAND flash memory within the SSDs that they test, and, thus, are unable to show detailed data retention behavior.
3D NAND Flash Memory Error Characterization. Recently, manufacturers have begun to produce SSDs that contain three-dimensional (3D) NAND flash memory [36, 42, 78, 79, 92, 117]. In 3D NAND flash memory, multiple layers of flash cells are stacked vertically to increase the density and to improve the scalability of the memory [117]. In order to achieve this stacking, manufacturers have changed a number of underlying properties of the flash memory design. We refer readers to our prior work for a detailed comparison between 3D NAND flash memory and planar NAND flash memory [3, 4, 5]. Previous works [22, 82] compare the retention loss between 3D charge trap NAND flash memory and planar NAND flash memory through real device characterization, and find that 3D charge trap cells leak charge faster than planar NAND cells and thus experience the phenomenon of early retention loss. Our recent work [72] characterizes the impact of dwell time, i.e., the idle time between consecutive program cycles, and environmental temperature on the retention loss speed and program variation of 3D charge trap NAND flash memory, and proposes techniques to mitigate these issues to improve flash memory lifetime. Recent work [113] characterizes the latency and raw bit error rate of 3D NAND flash memory devices based on floating gate cells, and makes similar observations as those for planar NAND flash memory devices based on floating gate cells. Prior works have reported several differences between 3D NAND and planar NAND through circuit-level measurements, including the fact that 3D NAND flash cells exhibit 1) smaller program variation at high P/E cycle counts [92], 2) smaller program interference [92], and 3) early retention loss [22, 82]. The field (both academia and industry) is currently in much need of detailed, rigorous experimental characterization and analysis of state-of-the-art 3D NAND flash memory devices.
Retention Error Mitigation Using Periodic Refresh. Prior works [12, 13, 69, 90] propose to use periodic refresh to mitigate retention errors. Cai et al. [12, 13] introduce 1) remapping-based refresh, which periodically reads data from each valid flash block, corrects any data errors, and remaps the data to a different physical location, 2) in-place refresh, which incrementally replenishes the lost charge of each page at its current location, and 3) adaptive refresh, which allows the controller to adaptively adjust the rate at which the refresh mechanisms are invoked based on the wearout (i.e., the current P/E cycle count) of the NAND flash memory [12, 13] or the temperature of the SSD [8, 10]. However, these techniques 1) require the system to be consistently powered on, and 2) are unaware of the fact that the optimal read reference voltage changes with retention age. Note that these works always apply a fixed read reference voltage regardless of the retention age of the cell, which is suboptimal for reading flash blocks at different retention ages. In contrast, our ROR technique optimizes the read reference voltage of each flash block based on its retention age, leading to significant lifetime improvements. Several works [23, 70, 104] find that refresh operations consume a large number of P/E cycles, and propose techniques that exploit workload write-hotness to relax the guaranteed retention time of NAND flash memory without requiring refresh. For example, WARM [70] partitions write-hot and write-cold data using a lightweight mechanism designed for flash memory, and eliminates the need to refresh write-hot data, leading to significant lifetime improvements over existing periodic refresh mechanisms. Our techniques can be combined with such refresh elimination techniques for higher lifetime and performance.
Read Reference Voltage Optimization. A few works [11, 14, 91] propose optimizing the read reference voltage. Cai et al. [14] propose a technique to calculate the optimal read reference voltage from the mean and variance of the threshold voltage distributions, which are characterized by the read-retry technique [9]. The cost of such a technique is relatively high, as it requires periodically reading flash memory with all possible read reference voltages to discover the threshold voltage distributions. Papandreou et al. [91] propose to apply a per-block close-to-optimal read reference voltage by periodically sampling and averaging 6 OPTs within each block, learned by exhaustively trying all possible read reference voltages. In contrast, ROR can find the actual optimal read reference voltage at a much lower latency, thanks to the new findings and observations in our HPCA 2015 paper [10]. We show that ROR greatly outperforms naive read-retry. The latter is significantly simpler than the mechanism proposed in [91]. Recently, Luo et al. [71] propose to accurately predict the optimal read reference voltage using a flash channel model that is learned online for each chip. Cai et al. [15] propose a new technique called V_pass tuning, which tunes the pass-through voltage, i.e., a high reference voltage applied to turn on unread cells in a block, to mitigate read disturb errors. Du et al. [27] propose to tune the optimal read reference voltages for ECC soft decoding to improve the ECC correction capability (i.e., the maximum number of errors that ECC can correct). Fukami et al. [28] propose to use read-retry to improve the reliability of the chip-off forensic analysis of NAND flash memory devices. Our proposals are complementary to all these techniques.
Error Recovery. To our knowledge, our HPCA 2015 paper [10] proposes the first mechanism that can recover data even after ECC is unable to successfully correct all of the errors due to retention loss. One of our works [15] builds on our HPCA 2015 paper and adapts the RFR mechanism to opportunistically recover from read disturb errors instead of retention errors. FlashDefibrillator (FD) [39] improves upon RFR to recover from data retention errors online, by applying a sequence of diagnostic pulses that recharge the fast-leaking cells. This helps recover otherwise uncorrectable errors in two ways: (1) fast-leaking cells may be recharged back to the correct state, and (2) fast-leaking cells recharge faster than slow-leaking cells, so fast-leaking cells can be identified as the cells whose threshold voltages increase faster during the diagnostic pulses. These two more recent works [15, 39] directly build upon our HPCA 2015 paper.
Data Retention Errors in DRAM
DRAM uses the charge within a capacitor to represent one bit of data. Much like the oating gate within NAND ash memory, charge leaks from the DRAM capacitor over time, leading to data retention issues. Unlike a NAND ash cell, where leakage typically leads to data loss after several days to years of retention time, leakage from a DRAM cell leads to data loss after a retention time on the order of milliseconds to seconds [67] .
The retention time of a DRAM cell depends upon several factors [67] , including (1) manufacturing process variation and (2) temperature. Manufacturing process variation a ects the amount of current that leaks from each DRAM cell's capacitor and access transistor [67] . As a result, the retention time of the cells within a single DRAM chip vary signi cantly, resulting in strong cells that have high retention times and weak cells that have low retention times within each chip. The operating temperature a ects the rate at which charge leaks from the capacitor. As the operating temperature increases, the retention time of a DRAM cell decreases exponentially [29, 67] .
Due to the rapid charge leakage from DRAM cells, a DRAM controller periodically refreshes all DRAM cells in place [17, 38, 44, 67, 68, 94, 97] (similar to the periodic refresh techniques used in NAND ash memory, but at a much smaller time scale). DRAM standards require a DRAM cell to be refreshed once every 64 ms [38] . As the density of DRAM continues to increase over successive product generations (e.g., by 128x between 1999 and 2017 [16, 18] ), enabled by the scaling of DRAM to smaller manufacturing process technology nodes [73, 84, 85, 87] , the performance and energy overheads required to refresh an entire DRAM module have grown signi cantly [17, 68, 84, 85, 87] . It is expected that the refresh problem will get signi cantly worse and limit DRAM density scaling, as described in a recent work by Samsung and Intel [43] and by our group [68] . Prior analysis shows that when DRAM chip density reaches 64 Gbit, nearly 50% of the data throughput is lost due to the high amount of time spent on refreshing all of the rows in the chip, and nearly 50% of the DRAM chip power is spent on refresh operations [68] . Thus, data retention problems and refresh pose a clear challenge to DRAM scalability.
Various experimental studies of real DRAM chips (e.g., [32, 44, 45, 50, 62, 67, 68, 94, 97]) have examined the data retention time of DRAM cells in modern chips, and have shown that the vast majority of DRAM cells can retain data without loss for much longer than the 64 ms retention time specified by DRAM standards. A number of works take advantage of this variability in data retention time behavior across DRAM cells, by reducing the frequency at which the vast majority of DRAM rows within a module are refreshed (e.g., [2, 37, 44, 46, 67, 68, 94, 97, 110]), or by reducing the interference caused by refresh requests on demand requests (e.g., [17, 83, 108]).
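The idea of refreshing most rows less frequently can be sketched as follows. This is a simplified illustration, not the mechanism of any cited work: the per-row retention times, the set of candidate refresh intervals, and the function names are all hypothetical. Each row is assigned the longest candidate interval that does not exceed its retention time, and the resulting refresh rate is compared against refreshing every row every 64 ms.

```python
from collections import Counter


def assign_refresh_interval(retention_ms, intervals_ms=(64, 128, 256)):
    """Longest candidate interval that is still no longer than the row's retention time."""
    safe = [interval for interval in intervals_ms if interval <= retention_ms]
    if not safe:
        raise ValueError("row cannot retain data for the shortest refresh interval")
    return max(safe)


def refresh_ops_per_second(row_retention_ms, intervals_ms=(64, 128, 256)):
    """Total row refreshes per second when each row uses its assigned interval."""
    bins = Counter(assign_refresh_interval(r, intervals_ms) for r in row_retention_ms)
    return sum(count * 1000.0 / interval for interval, count in bins.items())


# Hypothetical retention times (ms) of the weakest cell in each of eight rows.
rows = [70, 130, 300, 1000, 5000, 260, 64, 4000]

baseline = refresh_ops_per_second(rows, intervals_ms=(64,))  # every row at 64 ms
multi_rate = refresh_ops_per_second(rows)                    # retention-aware bins
print(f"baseline: {baseline:.2f} refreshes/s, multi-rate: {multi_rate:.2f} refreshes/s")
```

A practical design would also need a safety margin on each bin, since retention time depends on operating temperature, as discussed above.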
More findings on the nature of DRAM data retention and associated errors, as well as relevant experimental data from modern DRAM chips, can be found in our prior works [16, 17, 32, 44, 45, 46, 47, 62, 67, 68, 84, 94, 97]. We also refer the readers to prior works on the design and operation of the underlying DRAM architecture [17, 18, 19, 20, 32, 33, 49, 51, 52, 53, 54, 55, 60, 61, 62, 63, 64, 67, 68, 94, 102, 103].
Errors in Emerging Nonvolatile Memory Technologies
DRAM operations are several orders of magnitude faster than SSD operations, but DRAM has two major disadvantages. First, DRAM offers orders of magnitude less storage density than NAND-flash-memory-based SSDs. Second, DRAM is volatile (i.e., the stored data is lost on a power outage). Emerging nonvolatile memories, such as phase-change memory (PCM) [57, 58, 59, 76, 98, 112, 115, 121], spin-transfer torque magnetic RAM (STT-RAM or STT-MRAM) [56, 88], metal-oxide resistive RAM (RRAM) [111], and memristors [26, 107], are expected to bridge the gap between DRAM and SSDs, providing DRAM-like access latency and energy, and at the same time SSD-like large capacity and nonvolatility (and hence SSD-like data persistence). These technologies are also expected to be used as part of hybrid memory systems (also called heterogeneous memory systems), where one part of the memory consists of DRAM modules and another part consists of modules of emerging technologies [21, 24, 25, 41, 65, 74, 75, 95, 98, 99, 100, 115, 116, 118, 119].
PCM-based devices are expected to have a limited lifetime, as PCM can only endure a certain number of writes [57, 98, 112], similar to the P/E cycling errors in NAND-flash-memory-based SSDs (though PCM's write endurance is higher than that of SSDs). PCM suffers from (1) resistance drift [35, 96, 112], where the resistance used to represent the value becomes higher over time (and eventually can introduce a bit error), similar to how charge leakage in NAND flash memory and DRAM leads to retention errors over time; and (2) write disturb [40], where the heat generated during the programming of one PCM cell dissipates into neighboring cells and can change the value that is stored within the neighboring cells. STT-RAM suffers from (1) retention failures, where the value stored for a single bit (as the magnetic orientation of the layer that stores the bit) can flip over time; and (2) read disturb (a conceptually different phenomenon from the read disturb in DRAM and flash memory), where reading a bit in STT-RAM can inadvertently induce a write to that same bit [88].
Due to the nascent nature of emerging nonvolatile memory technologies and the lack of availability of large-capacity devices built with them, extensive and dependable experimental studies have yet to be conducted on the reliability of real PCM, STT-RAM, RRAM, and memristor chips. However, we believe that error mechanisms conceptually or abstractly similar to those we discussed for flash memory and DRAM are likely to be prevalent in emerging technologies as well (as supported by some recent studies [1, 40, 48, 88, 105, 106, 120]), albeit with different underlying mechanisms and error rates. We expect that the ROR and RFR techniques we propose in our HPCA 2015 paper [10] can be easily adapted to NVM technologies.
Significance
Our HPCA 2015 paper [10] provides extensive characterization data and proposes novel mechanisms to mitigate retention errors in modern NAND flash memory and recover data when ECC fails. We believe that our characterization and mechanisms will have a significant impact on the community, as evidenced by multiple recent works directly building upon our HPCA 2015 paper [15, 39, 72].
Long-Term Impact
We believe our work will have long-term impact for the following three reasons. First, as NAND flash memory becomes denser in the future, data retention will become a bigger issue, and thus a better understanding of its implications and characteristics will be important to help maintain NAND flash reliability after scaling [3, 4, 5, 84]. Second, we propose an online technique that reduces flash read latency, and we give insights into the flash read-retry algorithm, thereby hopefully inspiring future works to further optimize flash read latency. Third, we propose an offline technique that leverages underlying flash characteristics to enable recovery from a retention failure even after the drive fails to correct it, thereby hopefully inspiring future works to look for more ways to prevent data loss.
Data Retention. Our work provides a comprehensive analysis of the retention loss effect on real NAND flash memory chips, which enhances the understanding of the retention loss effect in the research community. We hope that our analysis and solutions can inspire more works to handle data retention in better ways. As planar NAND flash memory becomes denser, each flash memory cell holds less charge and becomes more vulnerable to retention loss [8, 12]. Thus, in the future, we expect data retention to become a more important problem [3, 4, 5, 84], and expect that industry will be more open to adopting new solutions like our proposals, ROR and RFR. In fact, several flash-based SSDs currently use refresh as a solution to mitigate retention errors [31, 34, 114]. Our work shows that we can go significantly beyond refresh to tolerate the data retention problem in NAND flash memory.
Read Performance Optimization. The read performance advantage of flash memory over hard disk drives makes flash-based SSDs more appealing than hard disk drives. However, many existing solutions, such as read-retry [9, 28], trade off flash performance for reliability. Our HPCA 2015 paper [10] is the first to point out the read performance problem, and to provide a detailed analysis and new solution to this problem. We hope that our work can enhance the research community's understanding of flash read performance and bring more attention to flash read performance, which is critically important to overall system performance. Techniques developed to reduce read latency in DRAM [17, 18, 19, 20, 33, 51, 52, 53, 54, 60, 61, 62, 63, 64, 68, 94, 102, 103] can serve as inspiration for NAND flash memory.
Data Recovery. Prior to our work, an uncorrectable error caused by a retention failure, and the resulting data corruption, was considered unrecoverable, resulting in data loss. To our knowledge, our HPCA 2015 paper [10] is the first to show that it is actually possible to recover this data using our RFR mechanism. As the reliability of NAND flash memory decreases, and the popularity of flash-based SSDs increases, SSD failures are expected to increase, creating a greater need for recovery techniques that can retrieve previously-unrecoverable data. In light of this, recent works [15, 39] have directly built upon RFR to provide additional data recovery mechanisms. We hope that our work draws more attention to flash memory data recovery, and inspires further solutions to this important problem.
New Research Directions
Our HPCA 2015 paper [10] presents characterization results for data retention in real NAND flash chips. By making such data and knowledge available, we believe that the flash memory and SSD research communities can have a better understanding of data retention, and can therefore develop better solutions to tackle the retention problem in the future. We hope that our work will continue to inspire future works in flash memory that can provide a comprehensive characterization and analysis of other NAND flash memory behavior using real chips, such as program/erase cycling and cell-to-cell program disturbance. We also hope that our ROR and RFR techniques bring more attention to both the flash read performance problem and data recovery problem, and that they will inspire researchers from both academia and industry to develop and adopt new solutions.
Conclusion
Our HPCA 2015 paper [10] comprehensively characterizes and analyzes how the threshold voltage distribution and the optimal read reference voltages of state-of-the-art 2Y-nm MLC NAND flash memory change over different retention ages. Based on these analyses, the paper proposes two new techniques. Retention Optimized Reading (ROR) improves reliability, lifetime, and performance of MLC NAND flash memory at modest storage cost by optimizing the read reference voltage of each flash memory block based on its retention age. We demonstrate significant benefits with ROR in terms of reduced RBER, extended flash lifetime, and reduction in flash read latency. Retention Failure Recovery (RFR) recovers data with uncorrectable errors by identifying and probabilistically correcting flash cells with retention errors. We demonstrate large raw bit error rate reductions with RFR. We hope that our comprehensive characterization of data retention in flash memory will enable better understanding of flash retention errors and motivate other new techniques to overcome these errors. We believe the importance of our two new techniques (ROR and RFR) will grow as NAND flash memory scales to smaller feature sizes and becomes even less reliable in the future.
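The intuition behind optimizing the read reference voltage for a given retention age can be illustrated with a minimal sketch. This is not the ROR algorithm of [10]; it only shows why a shifted threshold voltage distribution implies a different optimal read reference voltage. We assume, purely for illustration, that two adjacent states are equally likely and Gaussian-distributed, that retention loss lowers and widens the higher-voltage state, and that the optimum can be found by a simple grid search. All numeric values and function names are hypothetical.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def raw_bit_error_rate(v_ref, low, high):
    """Misread probability for two equally likely states, each given as (mean, std)."""
    p_low_misread = 1.0 - gaussian_cdf(v_ref, *low)  # low state sensed above v_ref
    p_high_misread = gaussian_cdf(v_ref, *high)      # high state sensed below v_ref
    return 0.5 * (p_low_misread + p_high_misread)


def optimal_read_reference(low, high, steps=2000):
    """Grid search between the two state means for the lowest-RBER reference voltage."""
    lo, hi = low[0], high[0]
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v: raw_bit_error_rate(v, low, high))


# Hypothetical (mean, std) pairs in arbitrary voltage units.
fresh = ((1.0, 0.2), (3.0, 0.2))  # shortly after programming
aged = ((1.0, 0.2), (2.6, 0.3))   # assumed shape after a long retention age

v_fresh = optimal_read_reference(*fresh)
v_aged = optimal_read_reference(*aged)
print(f"optimal reference: fresh = {v_fresh:.3f}, aged = {v_aged:.3f}")
print(f"aged RBER with fresh reference: {raw_bit_error_rate(v_fresh, *aged):.2e}")
print(f"aged RBER with aged reference:  {raw_bit_error_rate(v_aged, *aged):.2e}")
```

In this toy model, reading aged data with the reference voltage tuned for fresh data yields a higher error rate than reading it with a reference voltage tuned for its retention age, which mirrors the observation that motivates ROR.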
