This paper presents a new reliability threat that affects 3D-NAND Flash memories when a read operation is performed upon exiting from an idle state. In particular, a temporary large increase of the fail bits count is reported for the layers read first after a sequence of program/verify and an idle retention phase. The phenomenon, hereafter called Temporary Read Errors (TRE), is not due to a permanent change of cell threshold voltage between the program verify and the following read operations, but to its transient instability occurring during the idle phase and the first read operations performed on a block. The experimental analysis has been performed on off-the-shelf gigabit-array products to characterize the dependence on the memory operating conditions. The TRE is found to be strongly dependent on the page read, on the read temperature, and on the time delay between the first and the second read after the idle state. To emphasize its negative impact at system level, we have evaluated the induced performance drop on Solid State Drive architectures.
I. INTRODUCTION
The 3D-NAND Flash memory technology is now a data storage stronghold in mobile and Solid State Drive (SSD) applications. So far, several integration options for 3D-NAND have reached a fabrication yield and a maturity level good enough for mass production, while outmatching planar technologies in terms of bit density and performance [1], [2], [3]. Concerning reliability, several benchmarks showed that 3D-NAND Flash memories are better than state-of-the-art 2D NAND Flash [4], [5], [6], although all the issues concerning cell threshold voltage (V_T) instabilities, such as Random Telegraph Noise (RTN), cycling, and retention failures, are inherited to a large extent [7].
Besides those intrinsic mechanisms, there are additional reliability detractors related to the structure of the 3D-NAND cells. The use of a poly-silicon (poly-Si) material for the cells' channel and their vertical integration in relatively long pillars expose the V_T distributions to unwanted broadening and shifts [1], [8], [9], [10], [11], [12].
In this work, for the first time in 3D-NAND Flash gigabit array products, we show the presence of a reversible phenomenon, hereafter denoted as Temporary Read Errors (TRE). The effect arises during read operations and impacts the first group of cells read in a memory block when exiting from an idle state after a program/verify operation. The TRE issue is due to a transient instability of the cell V_T occurring during the idle phase and the first read operations performed on a block.
II. EXPERIMENTAL SETUP AND METHODS
FIGURE 2. Automated test equipment exploited for fail bits extraction (3D-NAND Flash highlighted in blue).
FIGURE 3. Test schemes exploited for TRE assessment.
The electrical characterization of the TRE has been performed on off-the-shelf Triple Level Cell (TLC) 3D-NAND Flash products featuring M layers (M ≥ 32) with the array architecture and V_T distribution coding mapped in three TLC pages (i.e., Lower Significant Bit, LSB; Central Significant Bit, CSB; and Most Significant Bit, MSB) as described in Fig. 1. The number of layers actually represents the number of cells stacked in the vertical direction along the NAND string. To consider the typical manufacturing variability, the tests have been repeated on several memory blocks and dies of the same product. The measurements were taken by a custom automated test equipment that allows extracting the number of fail bits after a read operation in every memory location (see Fig. 2). To rule out topological artifacts and interference on the TRE, we have programmed the samples with TLC random patterns and we disabled any error correction functionality of the chip. No modifications were applied to the standard working voltages of the devices and no special test modes were exploited to filter the fail bits count distribution.
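As a minimal sketch of what counting fail bits amounts to once error correction is disabled, assume that the random pattern written to a page and the raw data read back from it are both available as byte buffers. The function name and the buffers are illustrative assumptions and do not describe the interface of the test equipment; the count is simply the Hamming distance between the two buffers.

```python
def count_fail_bits(written: bytes, read_back: bytes) -> int:
    """Return the number of bit positions where read_back differs from written.

    Hypothetical helper: the buffers stand for the programmed random pattern
    and the raw (uncorrected) page data returned by a read operation.
    """
    if len(written) != len(read_back):
        raise ValueError("buffers must have the same length")
    # XOR leaves a 1 in every mismatching bit position; count the set bits.
    return sum(bin(w ^ r).count("1") for w, r in zip(written, read_back))
```

Applying such a count page by page and layer by layer would give the per-layer fail bits statistics that the following tests rely on.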
The fail bits extraction has been performed for all physical layers (L_i) in the two following conditions (see Fig. 3): a) after 1000 program/erase cycles (N_P/E) and reading immediately at the end of the program operation, starting from L_0 and sequentially reaching L_{M−1}; b) after N_P/E = 1000 and an idle state (characterized by a retention time t_RET = 1 h and a temperature T_RET = 115 °C). The values of t_RET and T_RET are chosen according to the JEDEC test specifications for the NAND Flash enterprise qualification procedure [13]. The same standards are adopted for P/E cycling, which resulted in an equivalent 500-hour cycle time at a temperature of 85 °C for all tests. Particular attention has been paid to avoid any cross-temperature artifacts that could erroneously increase the fail bits count [14].
III. EXPERIMENTAL CHARACTERIZATION RESULTS
Fig. 4 compares the results of test a) with those of test b) by showing the fail bits count retrieved for each physical layer in a block after three consecutive block reads. In the former test there is no perceivable difference in terms of fail bits between the block reads, whereas in the latter we observe a high fail bits count in the first layer of the block, with a general trend exposing the first block read as the one with the highest fail bits count. This is evidence of the TRE phenomenon in 3D-NAND Flash memories when the read operation is performed after the block has spent a sufficiently long time in an idle state.
To exclude any topological dependency and to demonstrate the general presence of the phenomenon in all layers of 3D-NAND Flash blocks, we repeated three times the readout of five vertically adjacent pages in test case b) (i.e., after an idle state). The experiment is performed for different read starting positions (i.e., L_0, L_{M/2−1}, and L_{M−1}, respectively). The number of fail bits is normalized with respect to the maximum count retrieved for each experiment. A large difference is perceived between the first and the second read (1R, 2R), but not between the second and the third (2R, 3R), in all cases (see Fig. 5). This confirms that the TRE is visible mainly on the first read operation, thus exposing its temporary nature, while proving its independence of the topological location of the first physical layer read. Fig. 6 shows that the TRE is appreciable in all TLC pages, with a larger impact evidenced for the LSB. The normalization of the fail bits is performed on the maximum count retrieved per page. Fig. 7 shows, for cases a) and b), the boxplot of the fail bits count distribution per layer, normalized to the maximum number retrieved in the test case. Each boxplot refers to a data population of about 400 kbits spanning all TLC pages on multiple tested blocks and dies.
Fig. 7 highlights two fundamental observations for the explanation of the TRE: i) if the read operation is performed after the block has spent a sufficiently long time in an idle state (case b), we observe an increase of the fail bits count for the first layer read within a block (up to 1.67 times, considering the median value of the distribution, with respect to the next layer in the read sequence), gradually settling to a more stable condition; ii) the fail bits count in the first layer read in the sequence is comparable to the following ones if the read operation is performed immediately after the last program verify operation (case a, where the difference in the normalized fail bits count between group L_0 to L_5 and group L_{M−6} to L_{M−1} is an artifact introduced by the extremely low fail bits statistics).
Fig. 8 confirms that the TRE depends on the working conditions of the memory. By comparing the average fail bits count on the layer read first with that on the rest of the block, we note again that in case a) no difference is found, whereas in case b) two effects are evident: the TRE in the layer read first and a general fail bits count increase ascribed to the unavoidable V_T degradation induced by retention effects [15], [16], [17].
Finally, to monitor the TRE relationship with time and temperature, we performed an additional experiment on a single physical layer consisting of a set of two consecutive reads separated by a time delay t_delay. Three different read temperatures were considered: 25 °C, 55 °C, and 70 °C. Each point of Fig. 9 represents the fail bits count retrieved after the second read (R2) normalized with respect to that retrieved after the first read (R1), as a function of the t_delay between the two reads. For each experimental point, R1 is performed after an idle state (t_IDLE = 2 h), ensuring similar starting conditions for each pair of R1 and R2 reads.
The results show that the R2 fail bits count decreases as t_delay increases, reaching a temperature-dependent minimum in the [1 s; 10 s] region. We must note that, as t_delay increases above 100 s, the number of fail bits retrieved after R2 increases again: the TRE re-occurs because R2 is performed in conditions similar to those of R1 (i.e., after a sufficiently long idle state). The dependence on the TLC page is also exposed in Fig. 10. To consider the typical manufacturing variability, the tests have been repeated on several memory blocks and dies of the same product, with similar outcomes.
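The normalization used for the two-read experiment can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the measurement points passed to them are placeholders to be supplied by the reader, not data from this work. Each point maps a delay between the two reads (in seconds) to the pair of fail bits counts retrieved after R1 and R2.

```python
from typing import Mapping, Tuple


def normalized_r2(r1_fail_bits: int, r2_fail_bits: int) -> float:
    """Fail bits count after the second read, normalized to the first read."""
    if r1_fail_bits <= 0:
        raise ValueError("R1 fail bits count must be positive to normalize")
    return r2_fail_bits / r1_fail_bits


def delay_of_minimum(points: Mapping[float, Tuple[int, int]]) -> float:
    """Return the delay (s) at which the normalized R2 count is lowest.

    points maps a delay in seconds to an (R1 count, R2 count) pair,
    measured at a single read temperature.
    """
    if not points:
        raise ValueError("at least one measurement point is required")
    return min(points, key=lambda t: normalized_r2(*points[t]))
```

Running `delay_of_minimum` once per read temperature would locate the temperature-dependent minimum discussed above; a ratio returning towards 1 at long delays corresponds to the re-occurrence of the TRE.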
IV. DISCUSSION ON THE TRE NATURE
The characterization results show that the phenomenon is unlikely to be ascribable only to a general V_T instability occurring in the presence of stochastic charge trapping/de-trapping events in the tunnel oxide/interface, as for RTN [7], [18], [19], since the fail bits increase is localized in the first group of cells read (see Fig. 5). Indeed, RTN is an intrinsic characteristic common to all the cells in the array, and therefore it is a fundamental contributor to the fail bits count distribution for every layer. The TRE reversibility and reproducibility also allow ruling out any modification of the cells' trapped charge like that caused by early retention loss [20] or by program interference [21]. Given these observations, we suggest that the TRE is due to a transient phenomenon that occurs during the idle phase and the first read operations performed on a block and that alters the cell V_T.
Such a mismatch could be attributed to several physical mechanisms already debated in the literature [10], [11], [12] and related to the peculiar behavior of the poly-Si channel. Although all the tests in those works were performed either on single cells/test element groups or on TCAD simulations, we found some evident similarities, especially in the phenomenology of the transient current instabilities. The results obtained by the characterizations in [11], considering a cell structure similar to the one integrated in our 3D-NAND Flash samples (i.e., poly-Si channel and a floating body), show the presence of a current transient instability originated by grain-boundary (GB) traps in poly-Si. During the program operation, there is a strong GB trapping before the program verify that leads to a discharge transient different from what normally happens during read. In this case, a bit-line current (I_BL) mismatch is produced during sensing that affects the V_T distributions and the fail bits count. In [10], a similar explanation of the phenomenon is given, except that blocking dielectric traps are also included among the culprits of the I_BL instability. However, even if we cannot rule out the impact of those contributors, our experimental results are comparable to those obtained in the conditions where the instability measured in [10] is dominated by GB traps in the poly-Si channel. Another potential source of I_BL mismatch is discussed in [12], which reports the down-coupling phenomenon triggered by a temporary floating state of the channel during the program verify. Additional investigations on large 3D-NAND Flash array structures are needed to verify whether, besides those physical explanations, there are other extrinsic causes related to the time constants necessary to charge/discharge long poly-Si pillars when exiting from an idle state.
V. SYSTEM-LEVEL IMPACT OF TRE
To assess the impact of the TRE at system level, we have simulated an enterprise-class TLC 3D-NAND Flash-based SSD architecture (see Fig. 11) with the emulator described in [22]. The main features of the emulated drive are summarized in Table 1. The SSD controller implements a powerful Low-Density Parity Check (LDPC) Error Correction Code (ECC) that corrects up to 120 fail bits over 1 KB codewords and offers soft-decoding capabilities. The latter option is enabled when the number of fail bits is higher than the ECC correction strength. We must recall that soft-decoding should be avoided, as it is extremely time-consuming and therefore causes significant performance drops on the drive [23]. This calls for the evaluation of the soft-decoding trigger rate. Fig. 12 shows that, without accounting for the presence of the TRE on the 3D-NAND Flash memories inside the SSD, there would be a 3% soft-decoding trigger rate. However, this is an underestimate, since the presence of the TRE raises the rate up to 11%. Such a result indicates both a reliability loss and a possible performance loss of the drive.
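The soft-decoding trigger rate can be illustrated with a short sketch. It assumes that a list of fail bits counts, one per 1 KB codeword read, is available from a simulation trace; the function name and the input list are hypothetical and do not reflect the interface of the emulator in [22]. Only the correction strength of 120 fail bits is taken from the text.

```python
from typing import Iterable

# Correction strength stated in the text: up to 120 fail bits per 1 KB codeword.
ECC_CORRECTION_STRENGTH = 120


def soft_decoding_trigger_rate(
    fail_bits_per_codeword: Iterable[int],
    strength: int = ECC_CORRECTION_STRENGTH,
) -> float:
    """Fraction of codewords whose fail bits exceed the ECC correction strength.

    Soft decoding is assumed to be triggered only when the count is strictly
    higher than the correction strength, as described in the text.
    """
    counts = list(fail_bits_per_codeword)
    if not counts:
        raise ValueError("at least one codeword is required")
    triggered = sum(1 for count in counts if count > strength)
    return triggered / len(counts)
```

Evaluating the same function on traces generated with and without the TRE contribution would give the two rates being compared.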
To assess this performance loss, we evaluated the impact on the SSD latency figure of merit at different percentiles of the cumulative distribution function (CDF) retrieved after 1 million read and write accesses to the drive (see Fig. 13). If the 99th percentile of the CDF is considered, we can see a 2 ms latency difference depending on whether or not the TRE is accounted for in the simulations. In this case, we can conclude that the TRE will represent a severe limiter for fast storage applications and will pose serious challenges for device and system mitigation in order to keep the SSD quality-of-service at an acceptable level.
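A latency percentile of this kind can be computed from a list of per-access latencies, as in the following sketch. The nearest-rank definition of the percentile and the function names are assumptions made for illustration; the text does not state which percentile estimator the emulator uses.

```python
import math
from typing import Sequence


def latency_percentile(latencies_ms: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest latency covering p% of the accesses."""
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(latencies_ms)
    if not ordered:
        raise ValueError("at least one latency sample is required")
    # The rank is 1-based; the product is computed before dividing by 100.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def tail_latency_gap(with_tre_ms: Sequence[float],
                     without_tre_ms: Sequence[float],
                     p: int = 99) -> float:
    """Difference between the p-th percentile latencies of two simulations."""
    return latency_percentile(with_tre_ms, p) - latency_percentile(without_tre_ms, p)
```

Here `with_tre_ms` and `without_tre_ms` stand for the latency samples of two simulation runs, so that `tail_latency_gap` returns the kind of difference read off the CDF at a given percentile.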
VI. CONCLUSION
In this work we exposed the TRE phenomenon as a potential reliability issue in 3D-NAND Flash technology through an experimental characterization of the fail bits count statistics in different working corners. Its occurrence is attributed to a transient phenomenon that occurs during the idle phase and the first read operations performed on a block and that alters the cell V_T. The time and temperature dependency of the TRE suggests possible mitigations to be applied either at device or at system level. As an example, the first read operation could be repeated after a few seconds before considering the data as valid.
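A system-level version of this read-repeat mitigation might look like the sketch below. It is a hedged illustration only: `read_page` is a hypothetical callable standing for whatever routine the controller uses to read a page, and the default delay of 1 s is an assumption picked from the [1 s; 10 s] region reported above, not a recommended value.

```python
import time
from typing import Callable


def read_with_tre_mitigation(
    read_page: Callable[[], bytes],
    settle_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Read a page after an idle state, discarding the first read.

    The first read is issued only to take the block out of the idle
    condition; its data are not considered valid. The second read, issued
    after settle_delay_s seconds, is the one returned to the caller.
    """
    read_page()            # first read after idle: result discarded
    sleep(settle_delay_s)  # wait before repeating the read
    return read_page()     # repeated read: data considered valid
```

The `sleep` parameter is injectable so that the waiting policy can be replaced, for instance by a scheduler that serves other requests during the delay, since blocking for seconds on every read after idle would itself hurt latency.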
