Abstract-The scaling of high-density NOR Flash memory devices with multi-level cells (MLC) hits a reliability wall because of the relatively high intrinsic bit error rate (IBER). Chip makers offer two solutions to meet the output bit error rate (OBER) specification: either partial coverage with an error correction code (ECC), or data storage in single-level cells (SLC) with a significant increase in die cost. NOR Flash memory allows information to be written in small portions, so full error protection becomes costly because of the high redundancy required, e.g. ∼50%. This is very different from NAND Flash memory, which writes large chunks of information at once; NAND ECC requires just ∼10% redundancy. This paper analyzes a novel error protection scheme applicable to NOR storage of one byte. The method does not require any redundant cells, but assumes a 5th program level. The information is mapped to states in a 4-dimensional space separated by a minimal Manhattan distance of 2. This code preserves the information capacity: one byte occupies four memory cells. We demonstrate the OBER ∼ IBER$^{3/2}$ scaling law, where IBER is calculated for the 4-level MLC memory. As an example, a 4-level MLC with IBER ∼ $10^{-9}$, which is unacceptable for high-density products, can be converted to OBER ∼ $10^{-12}$. We assume that the IBER is determined by exponentially distributed read noise. This is the case for NOR Flash memory devices, since exponential tails are typical for random telegraph signal (RTS) noise and for most charge loss, charge gain, and charge sharing data losses.
Index Terms-NOR Flash memory, Error Correction, Manhattan metrics, Soft Sensing
I. INTRODUCTION
The Flash memory devices store information in an array of memory cells, with every cell (memory transistor) programmed to a certain level (value) of the threshold voltage $V_t$. Modern technology offers two types of Flash memory devices: NOR and NAND. NAND memory operates on large chunks of data stored with density ∼ $10^{11}$ bit/cm$^2$ and relatively high IBER ∼ $10^{-2}$. NOR memory allows WRITE/READ of a single bit/byte; the data is stored with density ∼ $10^{10}$ bit/cm$^2$ and relatively low IBER ∼ $10^{-9}$. State-of-the-art error correction codes (ECC) were developed for NAND memories, see e.g. [1]-[4]. These codes are capable of repairing multiple errors and reducing the OBER to ∼ $10^{-14}$. Modern ECCs are not applicable to NOR Flash memory devices, because the efficiency of an ECC increases with the amount of data written at once [2], [5], which is above a kilobyte for NAND but only a single bit/byte for NOR. For example, Hamming-code correction of a single error in 4 cells storing 1 byte (2 bits/cell) requires 2 more redundant cells and a 50% die size increase. Stringent OBER requirements limit the scaling of NOR Flash memory devices: the die size gain from scaling the cell size is wasted on accommodating redundant cells.
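The 50% figure quoted above can be checked with the standard Hamming condition for single-error correction, $2^r \ge k + r + 1$. The following is only an illustrative sketch: the function name is ours, and it assumes that the parity bits are also stored at 2 bits per cell.

```python
import math

def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r

data_bits = 8        # one byte
bits_per_cell = 2    # 4-level MLC (assumed for the parity cells as well)

parity_bits = hamming_parity_bits(data_bits)
data_cells = data_bits // bits_per_cell
parity_cells = math.ceil(parity_bits / bits_per_cell)
overhead = parity_cells / data_cells   # redundant cells relative to data cells

print(parity_bits, parity_cells, overhead)
```

Under these assumptions one byte needs 4 parity bits, i.e. 2 extra cells on top of 4 data cells.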
Angelo Visconti patented the idea of adding redundant program levels to the NOR Flash memory cell along with applying an error correction code [6] . In particular, he proposed to map the data to alphabet of size 5, write to 5 levels per cell, and protect the information by the relevant Hamming code. For example, 64 bits of the data are written into 28 cells, then 4 redundant cells allow correcting an error in any of 32 cells. This method preserves the density of 2 bits/cell, however it requires the WRITE operation of at least 8 bytes at once.
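The cell counts in this example can be reproduced with exact integer arithmetic. The sketch below only checks the capacity argument (how many 5-level cells hold 64 bits); it does not implement the Hamming code of [6], and the function name is ours.

```python
def min_cells(data_bits: int, levels: int) -> int:
    """Smallest n such that levels**n >= 2**data_bits."""
    n = 0
    while levels ** n < 2 ** data_bits:
        n += 1
    return n

data_cells = min_cells(64, 5)     # 5-level cells needed to hold 64 bits
total_cells = data_cells + 4      # plus the 4 redundant cells quoted from [6]
density = 64 / total_cells        # stored bits per cell

print(data_cells, total_cells, density)
```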
The READ error in a Flash memory device occurs because of the overlap of the $V_t$ distributions of neighboring program levels. An IBER ∼ $10^{-10}$ in NOR Flash memory devices means that the overlap is weak, and the IBER is determined by the tail of the $V_t$ distribution. The effect of trapping and de-trapping of charges is responsible for the exponential shape of these tails [7]. The exponential shape of the $V_t$ distribution is typical for RTS noise, cycling effects on charge retention [8], and cell interference [9]. The slope of the exponential distribution depends on many factors, including channel doping [10], the memory usage model, etc. This is quite the opposite of NAND, where the IBER is high and a Gaussian $V_t$ distribution is adequate.
This paper considers an alternative approach to error correction in NOR Flash memory devices. The idea is to add program levels similarly to [6], but to encode the information by maximizing the minimum Manhattan distance [11]. The Manhattan metric is optimal for systems with exponential noise, because the error probability becomes an exponential function of the Manhattan distance between neighboring states. The method is closely related to non-binary coding in the Lee metric [12]-[14]. Recent developments in polyomino (cross-polytope) tiling make the idea attractive for Flash memory design [15]-[17].
[Figure caption, partially recovered: "... is the random telegraph signal noise of the read current; these tails can also come from the charge loss, charge gain or charge sharing due to defects in the dielectric layers of the memory device."]
Below we present a calculation of the gain in storage reliability of eight bits in a system having 4 memory cells. As the reference we take the system with 4 program levels (two bits per cell, $B_0 = 2$). The additional 5th program level increases the information capacity of the system to $\log_2 5 \approx 2.3$ bits per cell. Let the word $x_1 \dots x_4$, where $x_j \in \{0, 1, 2, 3, 4\}$, describe the state of the Flash memory system; the $j$-th cell is programmed to the $x_j$-th program level. The parity condition
$$x_1 + x_2 + x_3 + x_4 = 0 \bmod 2, \qquad (1)$$
which separates the allowed words by a minimal Manhattan distance equal to two [11], will reduce the information capacity to
$$\frac{1}{4}\log_2\frac{5^4}{2} \approx 2.07 > 2. \qquad (2)$$
Therefore, the modified system with 5 levels per cell and non-binary coding will have enough information capacity to store 8 bits in 4 cells. The read of information from the $j$-th Flash memory cell is done by comparing the device threshold voltage with the reference value stored in the reference memory cell and determining $\hat{x}_j$. With high probability the parity is satisfied, $\hat{x}_1 + \hat{x}_2 + \hat{x}_3 + \hat{x}_4 = 0 \bmod 2$, and the READ is correct; otherwise there is an error, and soft sensing [18] is required for error correction. The periphery circuit measures the threshold voltages of all 4 cells and searches for the nearest (in Manhattan distance) word satisfying (1). The probability of wrong error correction is of the same order as the probability of two errors, see the calculations in Sec. III. Therefore, the logarithm of the inverse OBER can be increased by as much as 50% by adding the 5th program level to the 4-level memory cell. As an example, a 4-level MLC with IBER ∼ $10^{-9}$, which is unacceptable for high-density products, can be converted to OBER ∼ $10^{-12}$, see Table I.
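The counting argument for the even-parity code can be verified by brute force. The sketch below enumerates all 5-level words of 4 cells with even digit sum and checks their number, their minimum Manhattan distance, and the resulting capacity. The byte-to-codeword table `encode` is a hypothetical mapping chosen only for illustration; the text does not specify which 256 codewords are used.

```python
from itertools import product
from math import log2

LEVELS = 5   # program levels per cell
CELLS = 4    # cells per stored byte

# All words x1..x4 over {0..4} whose digit sum is even (the parity condition).
codebook = [w for w in product(range(LEVELS), repeat=CELLS) if sum(w) % 2 == 0]

def manhattan(u, v):
    """Manhattan (L1) distance between two words."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Smallest distance between any two distinct allowed words.
min_distance = min(
    manhattan(u, v)
    for i, u in enumerate(codebook)
    for v in codebook[i + 1:]
)
bits_per_cell = log2(len(codebook)) / CELLS

# Hypothetical byte mapping: the first 256 codewords in lexicographic order.
encode = {byte: codebook[byte] for byte in range(256)}

print(len(codebook), min_distance, round(bits_per_cell, 3))
```

There are more than 256 allowed words, so every byte value can be assigned its own codeword while keeping the minimum distance of two.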
II. THE BIT ERROR RATE OF AN UNPROTECTED SYSTEM
Consider $N$ memory cells in a given memory device, programmed to the predefined threshold voltage levels $\{L_0, \dots, L_3\}$, see Fig. 1(a). The program state $A$ of the memory system is
$$A = \{V_1, \dots, V_N\}, \qquad (3)$$
where the threshold voltage $V_j$ of the $j$-th memory cell is uniformly distributed around the $x_j$-th program level $L_{x_j}$.
The width $W$ of the program distribution is typically a function of the program speed. Faster programming leads to a wider program distribution and a larger $W$. The READ operation of the Flash memory device cannot reproduce the state $A$ exactly. The state is distorted by the read noise (typically the RTS of the read current) and by data retention issues, see Fig. 1(b). The periphery circuit reads the state
of the memory system. The distribution of the threshold voltage $V_j$ acquires the exponential tail
where $T$ is the fraction of the cells in the tail, and $1/2a$ is the slope of the distribution.
The READ operation is simply a comparison of the threshold voltage of the memory cell with the threshold voltages of the reference cells. The threshold voltages of the reference cells are positioned in the middle of the level-to-level spacing $\Delta_0$. The READ operation finds the values of the word $\{\hat{x}_j\}$ from $\{V_j\}$
$$\hat{x}_j = \mathrm{round}\!\left(\frac{V_j - L_0}{\Delta_0}\right), \qquad (7)$$
where rounding is performed to the nearest integer.
The probability of error in reading of N cells per bit of information (IBER) is therefore
The key parameter here is $a\Delta_0$, which is the level spacing times the exponent of the noise distribution. For systems with no error correction and large volumes of data storage, $E_0 \sim 10^{-12}$ is required.
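The margin-sensing READ of this section (reference cells placed midway between levels, so each cell is rounded to its nearest level) can be sketched as follows. This is a minimal illustration, assuming equally spaced levels $L_k = L_0 + k\Delta_0$; the function name, the default values, and the clamping of out-of-range voltages to the outermost levels are our assumptions.

```python
import math

def margin_sense(voltages, l0=0.0, spacing=1.0, levels=4):
    """Round each threshold voltage to the index of the nearest program level."""
    word = []
    for v in voltages:
        k = math.floor((v - l0) / spacing + 0.5)   # nearest level, ties round up
        word.append(min(max(k, 0), levels - 1))    # clamp to the valid level range
    return word

# Voltages in units of the level spacing; the third cell has drifted past a reference.
print(margin_sense([0.1, 1.4, 2.6, 2.9]))
```

A cell is misread as soon as its threshold voltage moves more than half a level spacing away from its programmed level, which is why the tail of the distribution beyond that point sets the IBER.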
III. THE BIT ERROR RATE OF FIVE LEVEL SYSTEM WITH NON-BINARY CODING
Let us program $N = 4$ memory cells to 5 predefined threshold voltage levels $\{L_0, \dots, L_4\}$, see Fig. 2(a). The program state $A$ of the memory system is
$$A = \{V_1, \dots, V_4\}, \qquad (10)$$
where the threshold voltage $V_j$ of the $j$-th memory cell is uniformly distributed around the $x_j$-th program level $L_{x_j}$, and the word $\{x_j\}$ satisfies (1). The READ operation differs from the margin sensing (7): it requires soft sensing for the error correction:
1) Assume that the system read the state $A = \{V_1, \dots, V_4\}$.
2) Read the encoded word $\{x_j\}$ by margin sense (7) to obtain $\{\hat{x}_j\}$.
3) If the parity condition (1) is satisfied for $\hat{x}_j$, then the READ operation was correct. The word $\{\hat{x}_j\}$ is the READ of the encoded word $\{x_j\}$.
4) If the parity condition (1) is not satisfied for $\hat{x}_j$, then error correction is required.
5) Take all allowed states $B = \{L_{y_1}, \dots, L_{y_4}\}$, such that $y_j = x_j \pm 1$, and define the distance
$$|AB| = \sum_{j=1}^{4} |V_j - L_{y_j}|. \qquad (11)$$
6) Find the state $B$ such that $|AB|$ is minimum.
7) The word $\{y_j\}$ is the READ of the encoded word $\{x_j\}$.
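The READ steps listed above can be sketched in code. This is a minimal illustration rather than the authors' implementation: it assumes equally spaced levels $L_k = L_0 + k\Delta$, and it reads step 5 as "candidate words that differ from the margin-sensed word in exactly one cell by one level" (each such candidate restores the parity). All function and parameter names are ours.

```python
import math

LEVELS = 5

def margin_sense(voltages, l0, spacing):
    """Hard read: nearest level index for each cell, clamped to 0..LEVELS-1."""
    return [min(max(math.floor((v - l0) / spacing + 0.5), 0), LEVELS - 1)
            for v in voltages]

def soft_read(voltages, l0=0.0, spacing=1.0):
    """Parity check, then the nearest single-cell +/-1 neighbour in Manhattan distance."""
    hard = margin_sense(voltages, l0, spacing)
    if sum(hard) % 2 == 0:           # parity satisfied: accept the hard read
        return hard
    best, best_dist = None, math.inf
    for j in range(len(hard)):       # candidates: one cell moved by one level
        for step in (-1, 1):
            y = hard[j] + step
            if not 0 <= y < LEVELS:
                continue
            cand = hard[:j] + [y] + hard[j + 1:]
            # Manhattan distance between the measured voltages and the candidate levels.
            dist = sum(abs(v - (l0 + k * spacing)) for v, k in zip(voltages, cand))
            if dist < best_dist:
                best, best_dist = cand, dist
    return best

# Stored word (1, 2, 0, 3); the second cell drifted up past the reference.
print(soft_read([1.05, 2.6, -0.1, 3.0]))
```

In this example the hard read has odd parity, and the candidate closest to the measured voltages moves the second cell back down by one level, recovering the stored word.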
This is essentially "soft-decision sensing", as described in [18]. Three types of errors occur in this model. If the threshold voltages of two cells are found more than $(\Delta + W)/2$ away from $L_x$, then the parity check will pass, and the error will not be detected. The probability of this error and the corresponding OBER are
If the threshold voltage of the $j$-th cell moves more than $(2\Delta + W)/2$ away from $L_{x_j}$, then the error correction algorithm will converge to $y_j = x_j \pm 2$, see Fig. 2(b). The probability of this error and the corresponding OBER are
The third type of error occurs when
In terms of the read threshold voltages the condition for the error is
which is derived from (11) . In the relevant range
it becomes simplified to
The probability of this event is better expressed in terms of the deviation variables
The sum over pairs of cells and the normalization per number of stored bits gives
The total BER of five level design with non-binary ECC becomes
This must be compared with the probability $E_0$ of the error in the 4-level system given by (9). Assuming that the 5th level was added without pushing out $L_0$ and $L_3$, as in Figs. 1 and 2, we get the condition
The generic scaling law for the OBER is (assuming $W \ll \Delta_0$)
$$\mathrm{OBER} \sim \mathrm{IBER}^{3/2}. \qquad (24)$$
In practical calculations the factors $T$ and $e^{-aW}$ are essential, see the next section.
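The exponent 3/2 stated in the abstract can be motivated by a short heuristic estimate. This is a sketch under simplifying assumptions that are ours and not the paper's exact expressions: all prefactors (including $T$) are dropped, $W \to 0$, the noise density decays as $e^{-2a|\delta|}$ with the deviation $\delta$ from the programmed level, and the outermost levels stay in place, so four spacings $\Delta$ of the 5-level system span three spacings $\Delta_0$ of the 4-level system. The symbol $E$ for the 5-level error rate is introduced here for illustration only.

```latex
\begin{align*}
E_0 &\sim e^{-2a(\Delta_0/2)} = e^{-a\Delta_0}
      && \text{(one cell crosses a reference in the 4-level system)} \\
4\Delta &= 3\Delta_0 \;\Rightarrow\; \Delta = \tfrac{3}{4}\Delta_0
      && \text{(assumed: same total window, } W \to 0\text{)} \\
E &\sim e^{-2a(\Delta/2)}\, e^{-2a(\Delta/2)} = e^{-2a\Delta}
      && \text{(two cells each cross a reference in the 5-level system)} \\
  &= e^{-\frac{3}{2} a \Delta_0} \sim E_0^{3/2}.
\end{align*}
```

The same exponent $2a\Delta$ appears when a single cell moves by a full level spacing, which is why uncorrectable events of different types can be expected to scale in the same way in this limit.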
It is quite common in high-reliability NOR Flash memory devices to have a very low fraction of cells suffering from data retention issues, $T \sim 10^{-6}$, $T \ll e^{-aW}$. In this case
(25)
The other possibility is to have relatively narrow program distributions for a better read window, $e^{-aW} \ll T$:
In (25), (26) the spacing $\Delta$ of the 5-level system is expressed in terms of $\Delta_0$ and $W$ by making use of (23). The proposed method yields a substantial gain in OBER performance for practical applications, see Table I. For estimation purposes we put $W \sim \Delta_0$ and the OBER/IBER ratio becomes
and use this formula for typical values of $T$ and $e^{-a\Delta_0}$. The 5-level non-binary coding allows the OBER to be reduced by 2-4 orders of magnitude. This method allows the technology to progress from gigabit NOR parts to tens-of-gigabit products. The one-byte non-binary ECC discussed in this paper can be combined with global ECC coverage in cases where the data is streamed in many bytes.
For systems with purely exponential read noise, $W = 0$, $T = 1$, the scaling law (24) becomes accurate and can be used for OBER calculation. In this case $E_0 \sim 10^{-10}$ is reduced to $E_2 \sim 10^{-15}$. For applications with relatively high IBER, the 5-level method does not gain enough OBER. The reliability improvement can be achieved by using more complicated parity rules [11] with higher Manhattan distance [12] and more levels per cell. For example, a minimum Manhattan distance equal to 3 can be achieved by packing cross-polytopes of radius 1. For the system of 4 memory cells, the volume of the 4-dimensional cross-polytope is 9. Therefore, 7 levels per cell are required to preserve the information capacity of the system: $(1/4)\log_2(7^4/9) = 2.014 > 2$.
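Both numbers in this paragraph are easy to reproduce. The sketch below evaluates the sphere-packing estimate $(1/4)\log_2(q^4/9)$ for increasing numbers of levels $q$ and applies the pure $3/2$ scaling to $E_0 = 10^{-10}$. The function name is ours, and the estimate assumes that the radius-1 cross-polytopes can be packed without gaps, as the text implies.

```python
from math import log2

CELLS = 4
BALL = 2 * CELLS + 1     # points in a radius-1 cross-polytope in 4 dimensions: 9

def capacity_bound(levels: int) -> float:
    """Sphere-packing estimate of bits per cell: (1/4) * log2(levels**4 / 9)."""
    return log2(levels ** CELLS / BALL) / CELLS

# Smallest number of levels per cell that still stores more than 2 bits per cell.
min_levels = next(q for q in range(2, 20) if capacity_bound(q) > 2)
print(min_levels, capacity_bound(min_levels))

# Pure scaling law OBER ~ IBER**1.5 (W = 0, T = 1) applied to E0 = 1e-10.
print(1e-10 ** 1.5)
```

With 6 levels the estimate falls below 2 bits per cell, so 7 is the smallest level count that preserves the capacity under this bound.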
