Deep Learning-Aided Dynamic Read Thresholds Design For Multi-Level-Cell
  Flash Memories by Mei, Zhen et al.
arXiv:1907.03938v1 [cs.IT] 9 Jul 2019
Deep Learning-Aided Dynamic Read Thresholds
Design For Multi-Level-Cell Flash Memories
Zhen Mei, Kui Cai, Senior Member, IEEE and Xuan He, Member, IEEE
Abstract—The practical NAND flash memory suffers from
various non-stationary noises that are difficult to predict.
Furthermore, the data retention noise induced channel offset
is unknown during the readback process. This severely affects
the data recovery from the memory cell. In this paper, we first
propose a novel recurrent neural network (RNN)-based detector
to effectively detect the data symbols stored in the multi-level-
cell (MLC) flash memory without any prior knowledge of the
channel. However, compared with the conventional threshold
detector, the proposed RNN detector introduces much longer read
latency and more power consumption. To tackle this problem,
we further propose an RNN-aided (RNNA) dynamic threshold
detector, whose detection thresholds can be derived based on the
outputs of the RNN detector. We thus only need to activate the
RNN detector periodically when the system is idle. Moreover,
to enable soft-decision decoding of error-correction codes, we
first show how to obtain more read thresholds based on the
hard-decision read thresholds derived from the RNN detector.
We then propose integer-based reliability mappings based on the
designed read thresholds, which can generate the soft information
of the channel. Finally, we propose to apply density evolution
(DE) combined with the differential evolution algorithm to optimize
the read thresholds for LDPC coded flash memory channels.
Computer simulation results demonstrate the effectiveness of our
RNNA dynamic read thresholds design, for both the uncoded
and LDPC-coded flash memory channels, without any prior
knowledge of the channel.
Index Terms—MLC NAND flash memory, read threshold,
recurrent neural network, LDPC code, density evolution
I. INTRODUCTION
NAND flash memory based solid-state drives have
revolutionized the data storage industry. Compared with
conventional hard disk drives, they offer lower power
consumption, faster write/read times, and higher reliability.
However, the reliability of NAND flash memory is severely affected
by various noises such as the programming noise, the random
telegraph noise (RTN), and the cell-to-cell interference (CCI)
[1]. Most of these noises are difficult to predict due to the
complexity of memory physics. Moreover, the data retention
noise caused by the charge leakage from the floating gate
over time will lead to a decrease of the threshold voltages
of the programmed cells [2]. The corresponding change of
the cell threshold voltages, referred to as the channel offset, is
unknown during the readback process, and hence will severely
degrade the memory sensing circuit’s performance if the read
thresholds still remain the same.
The material in this paper was presented in part at IEEE International
Conference on Communications (ICC), May 2019. Zhen Mei, Kui Cai and
Xuan He are with the Science and Math Cluster, Singapore University of
Technology and Design, Singapore 487372 (email: mei zhen@outlook.com;
cai kui@sutd.edu.sg; xuan he@sutd.edu.sg ).
To mitigate the memory cell errors caused by the various
noises and interference, the use of error correction codes
(ECCs) is essential. In modern solid-state drives,
Bose-Chaudhuri-Hocquenghem (BCH) codes with hard-decision decoding
(HDD) have been employed to correct multiple bit errors
[3, 4]. To further correct errors for the multi-level cell (MLC)
flash memory, which is more error-prone than the single-level
cell (SLC) flash memory, low-density parity-check (LDPC)
codes with either HDD or soft-decision decoding (SDD) have
been adopted at the cost of longer decoding latency [5]. For
SDD of ECCs, the decoding performance depends heavily on
the accuracy of the log-likelihood ratio (LLR) of the channel
coded bits, and generally more read thresholds will provide a
better estimation of the LLRs of the MLC flash memory.
In this paper, we consider the MLC NAND flash memory
which stores 2 bits of user data per memory cell. The stored
data can be differentiated by the threshold voltage of each
memory cell. Specifically, a memory cell is configured to
one of four threshold voltage levels with mean voltages of
Vs11 , Vs10 , Vs00 , Vs01 , corresponding to the stored symbols of
11, 10, 00, 01, respectively. To read the data stored in a cell, its
voltage is measured and typically compared to predetermined
fixed read thresholds by the memory sensing circuit. As shown
by Fig. 1 (a), to differentiate four voltage levels, at least three
read thresholds a1, a2, a3 are needed. This will generate hard
outputs of the MLC flash channel that can support HDD of
ECCs. In order to generate the channel LLR to support SDD
of ECCs such as the LDPC code, more read thresholds are
required.
In the literature, several techniques have been proposed to
design the read thresholds [6–10]. For example, a “constant-
ratio” (CR) non-uniform quantization method was proposed
by observing that more errors occur in the overlapped areas
of adjacent distribution functions [6]. A widely accepted read
thresholds design approach was reported in [7], where the
quantization levels are optimized by maximizing the mutual
information (MMI) of the quantized channel. Based on min-
imizing the bit error rate (BER) and MMI, a parameter esti-
mation method and a dynamic programming framework were
proposed to design the read thresholds [8]. However, all these
approaches of designing the read thresholds assume that accu-
rate threshold voltages of the memory cells can be obtained for
different program/erase (P/E) cycles through proper modeling
and estimation of the flash memory channel. Furthermore, they
did not consider the unknown offset of the channel caused by
the data retention noise. Apart from the model-based approach,
read-retry and disparity-based thresholds approximation were
also proposed to dynamically adjust the read thresholds such
that errors can be corrected by the ECC [11]. However, the
read-retry scheme needs to be performed multiple times online
when the ECC fails, which leads to a longer latency and
more power consumption. The disparity-based method cannot
ensure near-optimal performance and needs to be invoked
frequently (e.g. daily) when the channel offset caused by data
retention is accumulated to a large level.
Constrained coding techniques, such as the rank modulation
[12], balanced codes [13], and the constant composition codes
[14], have also been proposed which can mitigate the unknown
offset of the channel through sorting the channel readback
signals. By leveraging on the balanced codes and the constant
composition codes, the dynamic threshold schemes [15, 16]
were proposed for both the SLC and MLC flash memories.
However, a major problem with all these schemes is the high
code rate loss incurred by the corresponding constrained codes.
Recently, deep learning (DL) techniques have
developed rapidly and have shown superior performance
in many aspects of communication systems [17, 18]. In this
paper, we propose a novel DL-aided approach to design the
read thresholds dynamically for the MLC flash memories.
With this DL-based framework, all the unknown offset or
unpredictable variations of the flash memory channel can be
learned from the training data, thus avoiding the difficult task
of modeling of the practical flash memories.
In particular, we first propose a novel recurrent neural
network (RNN) detector to effectively detect data symbols
stored in the memory cells. Compared with the prior art
detection approaches, the RNN detector only requires the
training data rather than an accurate model of the flash memory
channel. In order to minimize the additional read latency and
power consumption incurred by the RNN detector, we further
propose an approach to derive the 3-level hard-decision read
thresholds a1, a2, a3 based on the output of the RNN detector.
In this way, the RNN detector only needs to be activated
periodically when the system is in the idle state. Once the
hard-decision thresholds are obtained, the RNN detector can be
terminated and the conventional threshold detector will still be
adopted by using the derived hard-decision thresholds until a
further adjustment of the read thresholds is needed. We name
such a detector the RNN-aided (RNNA) dynamic threshold
detector, which can support HDD of ECCs. Next, in order to
enable SDD of ECCs, we first show how to obtain more read
thresholds from our derived 3-level hard-decision read thresh-
olds. We then propose integer-based reliability mappings based
on the designed read thresholds, which can generate LLRs
of the flash memory channel without any prior knowledge of
the channel. Finally, to optimize the read thresholds in terms
of the decoding performance, we propose to apply density
evolution (DE) combined with differential evolution algorithm
for LDPC coded MLC flash memory channels. We remark that
our proposed RNNA dynamic read thresholds design can be
easily extended to the triple-level cell (TLC) or quad-level
cell (QLC) flash memories, with minor modifications of the
NN parameters and the update of the integer-based reliability
mappings.
The rest of the paper is organized as follows. In Section
II, we introduce the channel model for the MLC NAND flash
memory adopted by this work. It is only used to generate
data for training, testing, and evaluating the performance of
the NN-based detector. Our proposed RNN-aided (RNNA) dy-
namic read thresholds design does not have any knowledge of
the channel model. In Section III, we formulate the detection
of the flash memory channel as a machine learning problem
and propose a NN-based detector. In Section IV, we propose
novel RNNA dynamic read thresholds design, such that both
the hard and soft outputs (i.e. LLRs) of the flash memory
channel can be derived based on the outputs of the NN
detector. Extensive computer simulation results are illustrated
in Section V. Finally, Section VI concludes the paper.
II. CHANNEL MODEL
As extensively reported in the literature [6, 9, 19, 20], the
threshold voltage of the flash memory cell is mainly affected
by the programming noise, RTN, data retention noise, and
cell-to-cell interference (CCI).
A. Programming Noise
The threshold voltage of each memory cell can be programmed
by injecting a certain amount of charge into the
floating gate. The memory cell needs to be erased before it
can be programmed. Due to process variations, the threshold
voltage of erased cells is assumed to follow the Gaussian
distribution [21], given by
$$p_{s_{11}}(v) = \frac{1}{\sqrt{2\pi}\,\sigma_e}\, e^{-\frac{(v - V_{s_{11}})^2}{2\sigma_e^2}}, \tag{1}$$
where $\sigma_e^2$ is the variance of the erased cell's threshold voltage.
Here, $V_{s_{11}}$ denotes the mean threshold voltage of the erased
state, while $V_{s_{10}}, V_{s_{00}}, V_{s_{01}}$ denote those of the programmed
cells that store different symbols, respectively. To program
the memory cell to these programmed voltage levels, the
incremental-step-pulse programming (ISPP) [22] scheme is
used, which leads to a uniform distribution of the voltage
levels for programmed cells [6], given by
$$p_u(v) = \begin{cases} \dfrac{1}{\Delta V_{pp}}, & V_p \le v \le V_p + \Delta V_{pp}, \\ 0, & \text{otherwise}, \end{cases} \tag{2}$$
where $V_p \in \{V_{s_{10}}, V_{s_{00}}, V_{s_{01}}\}$, and $\Delta V_{pp}$ denotes the
incremental program step voltage. Moreover, the programmed
cells suffer from the programming noise, which is assumed
to be Gaussian distributed with zero mean and variance
$\sigma_p^2$ [19, 20, 23], with $\sigma_p^2 < \sigma_e^2$. Note that by using a small
voltage step that is realized by the programming control loop,
$\sigma_p^2$ is significantly smaller than $\sigma_e^2$ [24].
B. Random Telegraph Noise
In NAND flash memory, repeated P/E cycling leads to
the wear-out effect that damages the tunnel oxide of floating
gate transistors [25]. Such a wear-out effect can be modeled
by the RTN. According to [26], the RTN tends to widen
the voltage distributions of the memory cells and leads to
exponential tails. In this paper, similar to [27], we model it as
a Gaussian distribution for mathematical tractability, given by
$$p_w(v) = \frac{1}{\sqrt{2\pi}\,\sigma_w}\, e^{-\frac{v^2}{2\sigma_w^2}}, \tag{3}$$
Fig. 1. Threshold voltage distributions of MLC flash memory cells before
and after data retention noise: (a) original distributions, with read
thresholds $a_1, a_2, a_3$; (b) after retention.
where $\sigma_w = 0.00027\,N_{PE}^{0.62}$ [19] is a function of $N_{PE}$, which
denotes the number of P/E cycles.
C. Data Retention Noise
Data retention errors are the dominant errors in flash mem-
ory. They are caused by the leakage of the charges from
the floating gate over time, which leads to a decrease of the
threshold voltage. Following [20, 28, 29], the retention noise
is assumed to follow the Gaussian distribution, given by
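$$p_{r_s}(v) = \frac{1}{\sqrt{2\pi}\,\sigma_{r_s}}\, e^{-\frac{(v - \mu_{r_s})^2}{2\sigma_{r_s}^2}}, \tag{4}$$
where $\mu_{r_s}$ and $\sigma_{r_s}$ are state-dependent, given by [19]
$$\mu_{r_s} = (V_s - x_0) \cdot \left(A_t N_{PE}^{\alpha_i} + B_t N_{PE}^{\alpha_o}\right) \cdot \ln(1 + T), \tag{5}$$
$$\sigma_{r_s} = 0.3\,|\mu_{r_s}|, \tag{6}$$
where $s \in \{s_{11}, s_{10}, s_{00}, s_{01}\}$, $T$ is the retention time, and
the other parameters $(x_0, A_t, B_t, \alpha_i, \alpha_o)$ are constant. As the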
retention time increases, there are mainly three effects on the
threshold voltage distributions of the erased and programmed
states. First, the threshold voltage distributions of all states
become wider. Second, the threshold voltage distributions of
the programmed states shift toward that of the erased state. Third,
the shift of the higher-voltage states is larger than that of the
lower-voltage states.
Fig. 1 illustrates the threshold voltage distributions before
and after retention noise. Here, a1, a2, a3 are optimal hard-
decision read thresholds to differentiate different states with
respect to the original threshold voltage distributions. How-
ever, after data retention, it can be easily seen that these
read thresholds are no longer optimal and will lead to more
detection errors. Therefore, the read thresholds need to be
carefully designed in order to mitigate the data retention noise
induced errors.
D. Overall Threshold Voltage Distributions
The threshold voltage shift of one flash cell will influence
the threshold voltage of its adjacent cells within a block,
due to the parasitic capacitance-coupling effect [22]. Such
interference is referred to as the CCI, which is linearly added
to the threshold voltage of a victim cell. The threshold voltage
shift induced by CCI can typically be compensated by data
predistortion [30], hence we assume that the CCI has already
been removed in this work.
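The overall threshold voltage distribution is usually
approximated by the Gaussian mixture distribution [31, 32],
given by
$$p_{s_{10}\sim s_{01}}(v) = \frac{1}{\sqrt{2\pi}\,\sigma_{s_{10}\sim s_{01}}}\, e^{-\frac{(v - \mu_{s_{10}\sim s_{01}})^2}{2\sigma_{s_{10}\sim s_{01}}^2}}, \tag{7}$$
where
$$\begin{aligned}
\mu_{s_{11}} &= V_{s_{11}} - \mu_{r_{s_{11}}}, \\
\mu_{s_{10}\sim s_{01}} &= V_{s_{10}\sim s_{01}} + \frac{\Delta V_{pp}}{2} - \mu_{r_{s_{10}\sim s_{01}}}, \\
\sigma_{s_{11}}^2 &= \sigma_e^2 + \sigma_w^2 + \sigma_{r_{s_{11}}}^2, \\
\sigma_{s_{10}\sim s_{01}}^2 &= \sigma_p^2 + \sigma_w^2 + \sigma_{r_{s_{10}\sim s_{01}}}^2.
\end{aligned} \tag{8}$$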
In the simulations of this work, we adopt the flash memory pa-
rameters provided by [19] and assume: Vs11 = 1.4, Vs10 = 2.6,
Vs00 = 3.2, Vs01 = 3.93, △Vpp = 0.2, σe = 0.35, σp = 0.05,
x0 = 1.4, At = 0.000035, Bt = 0.000235, αi = 0.62, and
αo = 0.3. With the probability density function (PDF) given
in (7), we are able to generate the readback threshold voltages
for an arbitrarily long input data sequence. We remark that the
channel model described above is only used to generate data
for training and testing the NNs. Our subsequently proposed
NN-based detection does not have any knowledge of this
channel model.
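As an illustration of how training and test data could be generated from (5)-(8), the following is a minimal Python sketch. It uses the parameter values quoted above; the function names `state_stats` and `read_back` are hypothetical, and the retention time $T$ is assumed to be expressed in hours, as suggested by the caption of Fig. 3.

```python
import math
import random

# Parameter values quoted in Section II (taken from [19]).
V = {"s11": 1.4, "s10": 2.6, "s00": 3.2, "s01": 3.93}
DV_PP, SIGMA_E, SIGMA_P = 0.2, 0.35, 0.05
X0, A_T, B_T, ALPHA_I, ALPHA_O = 1.4, 0.000035, 0.000235, 0.62, 0.3
STATES = ["s11", "s10", "s00", "s01"]  # labels 0, 1, 2, 3


def state_stats(state, n_pe, t):
    """Mean and standard deviation of one state, following (5)-(8).

    n_pe is the number of P/E cycles; t is the retention time
    (assumed here to be in hours).
    """
    mu_r = (V[state] - X0) * (A_T * n_pe**ALPHA_I + B_T * n_pe**ALPHA_O) * math.log(1 + t)
    sigma_r = 0.3 * abs(mu_r)                 # (6)
    sigma_w = 0.00027 * n_pe**0.62            # RTN standard deviation
    if state == "s11":                        # erased state
        mean = V[state] - mu_r
        var = SIGMA_E**2 + sigma_w**2 + sigma_r**2
    else:                                     # programmed states
        mean = V[state] + DV_PP / 2 - mu_r
        var = SIGMA_P**2 + sigma_w**2 + sigma_r**2
    return mean, math.sqrt(var)


def read_back(symbols, n_pe, t, rng):
    """Draw one readback voltage per stored label in {0, 1, 2, 3}."""
    stats = [state_stats(s, n_pe, t) for s in STATES]
    return [rng.gauss(*stats[x]) for x in symbols]


rng = random.Random(0)
labels = [rng.randrange(4) for _ in range(8)]
voltages = read_back(labels, n_pe=10**4, t=100, rng=rng)
```

Because $x_0 = V_{s_{11}}$ in this parameter set, the erased state has zero retention shift in the sketch, while the shift grows with the nominal voltage of the programmed states.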
III. LEARNING TO DETECT
In this section, we formulate the detection of the flash
memory channel as a machine learning problem and propose a
NN-based detector. We denote the readback threshold voltage
of the $k$-th memory cell by $v_k$. The inputs to the NN are
$\mathbf{v} = \{v_1, v_2, \cdots, v_L\}$, where $L$ is the number of neurons in the
input layer of the NN. The outputs of the NN are the soft estimates
$\tilde{\mathbf{x}} = \{\tilde{x}_1, \tilde{x}_2, \cdots, \tilde{x}_L\}$ of the labels $\mathbf{x}$. For MLC flash
memory, there are four different states $V_{s_{11}}, V_{s_{10}}, V_{s_{00}}, V_{s_{01}}$,
which can be represented by labels $\{0, 1, 2, 3\}$, respectively.
Hence, for the $k$-th cell with $x_k \in \{0, 1, 2, 3\}$, the hard estimate
$\hat{x}_k \in \{0, 1, 2, 3\}$ can be obtained by rounding $\tilde{x}_k$ to the
nearest integer. Thereafter, the corresponding recovered bits $c_{2k-1}$ and
$c_{2k}$ can be obtained by using the mapping $\{0, 1, 2, 3\} \to
\{11, 10, 00, 01\}$.
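The rounding and bit-mapping step described above can be sketched as follows. The helper names are hypothetical, and clipping the rounded value to the range 0 to 3 is an added assumption, since a soft estimate above 3.5 would otherwise round to a label that does not exist.

```python
import math

# Label-to-bits mapping {0, 1, 2, 3} -> {11, 10, 00, 01} from the text.
LABEL_TO_BITS = {0: (1, 1), 1: (1, 0), 2: (0, 0), 3: (0, 1)}


def hard_estimates(soft):
    """Round each soft estimate to the nearest label (clipping is an assumption)."""
    return [min(3, max(0, math.floor(x + 0.5))) for x in soft]


def labels_to_bits(labels):
    """Return the recovered bit pairs (c_{2k-1}, c_{2k}) as one flat list."""
    bits = []
    for x in labels:
        bits.extend(LABEL_TO_BITS[x])
    return bits


hard = hard_estimates([0.2, 0.9, 2.4, 3.8])
recovered = labels_to_bits(hard)
```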
The outputs $\tilde{\mathbf{x}}$ of the NN can be considered as a function
of the NN's inputs $\mathbf{v}$ and the network's parameters $\theta$, given by
$\tilde{\mathbf{x}} = f(\mathbf{v}, \theta)$. Then, the NN is trained to find the best $\theta^*$ that
leads to a well-performing detector. To accomplish this task, a
loss function $\mathcal{L}$ is defined over the set of training data, such
that
$$\theta^* = \arg\min_{\theta} \mathcal{L}(\mathbf{x}, \tilde{\mathbf{x}}), \tag{9}$$
where $\mathcal{L}(\mathbf{x}, \tilde{\mathbf{x}})$ measures the loss between $\tilde{\mathbf{x}}$ and $\mathbf{x}$. By
using the gradient descent algorithm or its variants, combined
with the back propagation method, $\theta^*$ can be obtained by
minimizing $\mathcal{L}(\mathbf{x}, \tilde{\mathbf{x}})$ defined above over the training data set.
Fig. 2. Proposed RNN architecture for the NN-based detection (inputs
$v_1, \ldots, v_L$, stacked RNN cells, and a dense softplus output layer
producing $\tilde{x}_1, \ldots, \tilde{x}_L$).
Note that as a supervised learning approach, the labels $\mathbf{x}$
are known during the training. After the training process,
the network parameters $\theta^*$ are obtained and another data
set, named the validation set, is used to fine-tune the
model hyperparameters. After training and validation, the trained NN
detector is employed to detect the unknown flash memory
channel outputs.
A. Neural Network Architecture
In this paper, we propose to use a stacked RNN architecture
to perform the flash memory channel detection. The RNN is a
class of NNs with feedback connections. It is well suited
to time series tasks, since it can use memories to process
sequences of inputs. There are different types of RNN cells,
namely, the vanilla RNN, gated recurrent unit (GRU) and long
short-term memory (LSTM). The number of network parameters
in the vanilla RNN is significantly smaller than that of the GRU
and LSTM. However, it suffers from the vanishing/exploding
gradient problem, where the gradients may vanish to zero or
explode during the training process [33]. Hence, we employ
the GRU as the RNN cell, since it has fewer parameters than
the LSTM. Fig. 2 shows the proposed RNN architecture with
two GRU layers and one fully-connected output layer. For
each layer, an activation function is applied to introduce
non-linearity to the NN. For the two GRU hidden layers, the
rectified linear unit (ReLU) activation function is adopted,
given by $\sigma_{\text{relu}}(t) = \max\{0, t\}$ with $\sigma_{\text{relu}}(t) \in [0, \infty)$. For the
output layer, to obtain soft estimates of $\mathbf{x}$, we use the softplus
activation function, which is a smooth approximation of the
ReLU function, given by $\sigma_{\text{softplus}}(t) = \ln(1 + \exp(t))$ with
$\sigma_{\text{softplus}}(t) \in (0, \infty)$.
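For reference, the two activation functions can be written in a few lines of Python. This is only a sketch of the formulas, not of the GRU layers themselves; the rearranged softplus expression is a standard numerically stable form and is an implementation choice not taken from the paper.

```python
import math


def relu(t):
    """Rectified linear unit: max{0, t}."""
    return max(0.0, t)


def softplus(t):
    """ln(1 + exp(t)), rewritten as max(t, 0) + ln(1 + exp(-|t|)) to avoid overflow."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))
```

For large positive $t$ the softplus output approaches $t$, and for large negative $t$ it approaches zero from above, which matches the ReLU behavior it smooths.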
B. Training Approach
To train the RNN, the readback threshold voltages $\mathbf{v}$ can
be obtained by sensing the current memory cell. In our work,
we generate the readback threshold voltages by simulating the
flash channel model described in Section II. As a supervised
learning approach, the label $x_k$ of $v_k$ is also required for
training the RNN. To solve this problem, we use all codewords
that are decoded by the LDPC decoder as labels to train the
RNN until a desired number of labels is collected. This
method may lead to a performance degradation due to the
existence of codewords that are incorrectly decoded by the
LDPC decoder. Note that it is possible to collect more accurate
labels by checking the syndrome of the decoded codeword, but
at the cost of additional latency. The performance comparison
of the RNN detector with correct and error-corrupted labels
will be presented in Section V.
Fig. 3. The training SER of the RNN detector for each epoch at P/E cycles
$N_{PE} = 10^4$ and retention time $T = 10^2$ hours. (Horizontal axis: epoch,
1 to 8; vertical axis: SER, in units of $10^{-3}$.)
For the proposed RNN, to minimize the difference between
the NN output $\tilde{\mathbf{x}}$ and the labels $\mathbf{x}$, we define the loss function as the
mean square error (MSE) between $\tilde{\mathbf{x}}$ and $\mathbf{x}$, given by
$$\mathcal{L}(\mathbf{x}, \tilde{\mathbf{x}}) = \frac{1}{L} \sum_{k=1}^{L} (x_k - \tilde{x}_k)^2. \tag{10}$$
In our experiments, the number of neurons $L$ in the input
layer is set to be 50 and this number can be reduced to further
simplify the network. After many trials, it is found that $3 \times 10^6$
training symbols are sufficient for the RNN to achieve its best
performance. For example, for an LDPC code with codeword
length 8000 bits, 750 recovered codewords are enough for
training the RNN.
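The loss in (10) and the codeword count quoted above can be checked with a short Python sketch. The function name `mse_loss` is hypothetical; the symbol count uses the fact that each MLC cell stores 2 bits.

```python
def mse_loss(labels, soft):
    """Mean square error between labels x and soft estimates x~, as in (10)."""
    if len(labels) != len(soft):
        raise ValueError("labels and soft estimates must have the same length")
    return sum((x - y) ** 2 for x, y in zip(labels, soft)) / len(labels)


# 750 codewords of 8000 bits, at 2 bits per MLC cell.
BITS_PER_CELL = 2
training_symbols = 750 * 8000 // BITS_PER_CELL

loss = mse_loss([0, 1, 2, 3], [0.5, 1.0, 2.0, 2.5])
```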
The details of the RNN setting are summarized in Table
I. To illustrate the learning process, Fig. 3 shows the symbol
error rate (SER) of the RNN detector with correct labels for
each epoch during training. It is observed that the training SER
decreases sharply at the second epoch and converges from the
fourth epoch. It shows that the RNN can learn the channel
variations within only several epochs.
We remark that the proposed NN detection framework
is data-driven, which does not need the prior knowledge
of the channel. After training and validation, the proposed
RNN detector can successfully detect the readback signal v.
Furthermore, the above proposed NN detection framework can
be easily extended to the TLC or QLC flash memories, with
minor modifications to the NN parameters. For example, more
training data and hidden layers may be required to successfully
detect the readback signal.
TABLE I
NETWORK SETTINGS FOR THE PROPOSED RNN DETECTOR.
Setting             Value
Training Symbols    $3 \times 10^6$
Mini-batch Size     100
Loss Function       MSE
Initializer         Xavier uniform initializer
Optimizer           Adam optimizer
IV. NEURAL NETWORK-AIDED DYNAMIC READ
THRESHOLDS DESIGN
As will be illustrated in Section V, the proposed NN detector
can achieve near-optimal performance without any knowledge
of the channel. However, it needs to be activated for each
data block of length L for data detection. Although the
NN can be efficiently implemented in parallel with powerful
hardware such as the graphical processing units (GPU) or the
application-specific integrated circuit (ASIC), it will still lead
to substantial additional read latency and power consumption.
To avoid using the NN to detect every block of data, in
this section, we propose a novel NNA dynamic read thresholds
design, such that both the hard and soft outputs (i.e. LLRs) of
the flash memory channel can be derived based on the outputs
of the NN detector.
A. Read Thresholds Design for Generating Hard Channel
Outputs
For the MLC flash memory channel, in order to generate
hard outputs of the channel to support HDD of ECCs, 3 read
threshold levels $\{a_1, a_2, a_3\}$ need to be determined. For a
given $\mathbf{v}$ and with assumed $\{a_1, a_2, a_3\}$, we can obtain the hard
estimate $\bar{\mathbf{x}}$. Meanwhile, based on $\mathbf{v}$, the RNN outputs $\tilde{\mathbf{x}}$ and
hence $\hat{\mathbf{x}}$. Therefore, the adjusted thresholds $\{a_1^*, a_2^*, a_3^*\}$ can
be obtained by searching for the thresholds that minimize
the Hamming distance between $\bar{\mathbf{x}}$ and $\hat{\mathbf{x}}$, denoted by $d(\bar{\mathbf{x}}, \hat{\mathbf{x}})$.
We assume that $M$ RNN output sequences with length $L$
are involved in the search and denote them by $\hat{\mathcal{X}} =
\{\hat{\mathbf{x}}_1, \hat{\mathbf{x}}_2, \ldots, \hat{\mathbf{x}}_M\}$. We further have $\bar{\mathcal{X}} = \{\bar{\mathbf{x}}_1, \bar{\mathbf{x}}_2, \ldots, \bar{\mathbf{x}}_M\}$
and $\mathcal{V} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_M\}$. Hence, we can obtain
$$\{a_1^*, a_2^*, a_3^*\} = \arg\min_{\{a_1, a_2, a_3\}} d(\bar{\mathcal{X}}, \hat{\mathcal{X}}). \tag{11}$$
To get $\{a_1^*, a_2^*, a_3^*\}$, we first uniformly quantize the search
space into $m$ intervals (e.g. $m = 1000$), with boundaries
$b_0, b_1, \ldots, b_m$, where $b_0 = -\infty < b_1 = V_{s_{11}} < \cdots < b_{m-1} =
V_{s_{01}} < b_m = \infty$. Note that a larger $m$ will result in higher
precision, but at the cost of higher computational complexity.
Hence, we have $a_i \in \{b_0, b_1, \ldots, b_m\}$, $i = 0, 1, \ldots, n$, with
$a_0 = -\infty$ and $a_n = \infty$. In our case, we have $n = 4$, since
there are three read thresholds. To solve the problem in (11)
efficiently, a precomputation step is performed first. In this
step, the elements in $\mathcal{V}$ are first sorted in ascending order with
complexity $O(ML \log(ML))$. Based on the corresponding $\hat{\mathcal{X}}$
obtained at the output of the RNN, we can get the number of
cells detected as symbol $i$ whose voltages fall into $[b_j, b_k)$, denoted by $s(i, b_j, b_k)$,
with $i = 0, 1, \ldots, n - 1$ and $0 \le j < k \le m$. Since the Hamming
distance equals the total number of cells $ML$ minus the number of
cells on which $\bar{\mathcal{X}}$ and $\hat{\mathcal{X}}$ agree, (11) is
equivalent to
$$\{a_1^*, \ldots, a_{n-1}^*\} = \arg\min_{\{a_1, \ldots, a_{n-1}\}} \left( ML - \sum_{i=0}^{n-1} s(i, a_i, a_{i+1}) \right). \tag{12}$$
Note that (12) can be further simplified to
$$\{a_1^*, \ldots, a_{n-1}^*\} = \arg\max_{\{a_1, \ldots, a_{n-1}\}} \sum_{i=0}^{n-1} s(i, a_i, a_{i+1}). \tag{13}$$
Then, an exhaustive search method can be used to find
$a_1^*, \ldots, a_{n-1}^*$ from $b_1, b_2, \ldots, b_{m-1}$ with computational
complexity $O(m^{n-1})$. To further reduce the complexity, we
propose to apply dynamic programming (DP) [34] to solve (13)
through a method similar to [35]. That is, let $P(m', n')$
($1 < n' \le m' \le m$) denote the problem of finding
$a_1, a_2, \ldots, a_{n'-1}$ from $b_1, b_2, \ldots, b_{m'-1}$ (with $a_0 = b_0$ and
$a_{n'} = b_{m'}$) such that $C(m', n')$
is maximized, where $C(m', n') = \sum_{i=0}^{n'-1} s(i, a_i, a_{i+1})$ is
the objective function in (13). We denote by $C^*(m, n)$ the
objective function value of the optimal solution of $P(m, n)$. Hence,
$C^*(m, n)$ is given by
$$\begin{aligned}
C^*(m, n) &= \max_{n-1 \le \lambda_{n-1} < m} \left[ C^*(\lambda_{n-1}, n-1) + s(n-1, b_{\lambda_{n-1}}, b_m) \right] \\
&= C^*(\lambda_{n-1}^*, n-1) + s(n-1, b_{\lambda_{n-1}^*}, b_m),
\end{aligned} \tag{14}$$
where $\{\lambda_1^*, \ldots, \lambda_{n-1}^*\}$ collects the indices of the selected
boundaries among $b_1, \ldots, b_{m-1}$
and $\{b_{\lambda_1^*}, \ldots, b_{\lambda_{n-1}^*}\}$ is the optimal solution of $P(m, n)$. From
(14), the optimal solution of $P(m, n)$ can be obtained by
solving its subproblems $P(\lambda_{n-1}, n-1)$, where $n - 1 \le
\lambda_{n-1} < m$. Similarly, $P(\lambda_{n-1}, n-1)$ can also be solved by its
subproblems $P(\lambda_{n-2}, n-2)$, where $n - 2 \le \lambda_{n-2} < \lambda_{n-1}$. In
this way, the optimal solution of $P(m, n)$ can be calculated
in a recursive manner, such that DP can be employed [34].
According to (14), the complexity of DP is $O(m^2 n)$.
Since $n \ll m$, the complexity of DP is much lower than that of the
exhaustive search method. Note that when extending this work
to TLC or QLC flash memories, the exhaustive search method
is prohibitive since its complexity is $O(m^{n-1})$, with $n = 2^3$
and $n = 2^4$ for TLC and QLC flash memories, respectively.
On the other hand, DP is always feasible to solve this problem,
since its complexity is always proportional to $m^2$.
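The recursion in (14) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `search_thresholds` is hypothetical, `candidates` holds the finite boundaries $b_1, \ldots, b_{m-1}$ in ascending order, and the counts $s(i, b_j, b_k)$ are obtained by binning each voltage with a binary search and taking cumulative sums, rather than by the full sort described in the text.

```python
from bisect import bisect_right


def search_thresholds(voltages, labels, candidates, n_states=4):
    """Pick n_states - 1 thresholds from candidates that maximize the
    number of cells whose threshold decision agrees with the RNN label."""
    m = len(candidates) + 1          # b_0 = -inf and b_m = +inf are implicit
    n = n_states
    if m < n:
        raise ValueError("need at least n_states - 1 candidate boundaries")

    # hist[i][j]: cells with label i and voltage in [b_j, b_{j+1}).
    hist = [[0] * m for _ in range(n)]
    for v, x in zip(voltages, labels):
        hist[x][bisect_right(candidates, v)] += 1

    # cum[i][j]: cells with label i and voltage below b_j.
    cum = [[0] * (m + 1) for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cum[i][j + 1] = cum[i][j] + hist[i][j]

    def s(i, j, k):
        """Number of cells with label i in [b_j, b_k)."""
        return cum[i][k] - cum[i][j]

    # best[k][lam] = C*(lam, k); arg[k][lam] = index of the last threshold.
    best = [[-1] * (m + 1) for _ in range(n + 1)]
    arg = [[0] * (m + 1) for _ in range(n + 1)]
    for lam in range(1, m + 1):
        best[1][lam] = s(0, 0, lam)
    for k in range(2, n + 1):
        for lam in range(k, m + 1):
            for mu in range(k - 1, lam):
                value = best[k - 1][mu] + s(k - 1, mu, lam)
                if value > best[k][lam]:
                    best[k][lam] = value
                    arg[k][lam] = mu

    # Backtrack from P(m, n) to recover the selected boundary indices.
    indices = []
    lam = m
    for k in range(n, 1, -1):
        lam = arg[k][lam]
        indices.append(lam)
    indices.reverse()
    return [candidates[j - 1] for j in indices], best[n][m]
```

The three nested loops give the $O(m^2 n)$ cost stated above. The second returned value is the number of agreements, so the Hamming distance in (11) is the number of cells minus that value.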
We remark that the RNN detection and the subsequent
search of the read thresholds are only activated periodically
when the system is in the idle state, and will be terminated
once {a∗1, a∗2, a∗3} are obtained. Thereafter, the threshold de-
tector with the obtained read thresholds will still be adopted
until a further adjustment of the read thresholds is necessary.
For convenience, we name such a detector the RNNA dynamic
threshold detector. Hence, compared with the RNN detection
that is carried out for each data block, our proposed RNNA
dynamic threshold detector leads to a significant reduction of
the read latency and power consumption.
To evaluate the performance of the proposed RNN detector
and the RNNA dynamic threshold detector, we derive the
optimum symbol error probability (SEP) of the MLC flash
memory channel with the full channel knowledge, and use it
as the performance benchmark. Assume that the four voltage states
$\{V_{s_{11}}, V_{s_{10}}, V_{s_{00}}, V_{s_{01}}\}$ are stored equiprobably in the memory
cells. Given read thresholds $\{a_1, a_2, a_3\}$ and based on the
channel model described in Section II, the SEP is given by
$$\begin{aligned}
P_s &= \sum_i P(v = V_{s_i})\, P(e \mid v = V_{s_i}) \\
&= \frac{1}{4} \Big( P(v > a_1 \mid v = V_{s_{11}}) + P(v < a_1 \cup v > a_2 \mid v = V_{s_{10}}) \\
&\qquad + P(v < a_2 \cup v > a_3 \mid v = V_{s_{00}}) + P(v < a_3 \mid v = V_{s_{01}}) \Big) \\
&= \frac{1}{4} \bigg\{ 3 + Q\!\left(\frac{a_1 - \mu_{s_{11}}}{\sigma_{s_{11}}}\right) - Q\!\left(\frac{a_1 - \mu_{s_{10}}}{\sigma_{s_{10}}}\right) + Q\!\left(\frac{a_2 - \mu_{s_{10}}}{\sigma_{s_{10}}}\right) \\
&\qquad - Q\!\left(\frac{a_2 - \mu_{s_{00}}}{\sigma_{s_{00}}}\right) + Q\!\left(\frac{a_3 - \mu_{s_{00}}}{\sigma_{s_{00}}}\right) - Q\!\left(\frac{a_3 - \mu_{s_{01}}}{\sigma_{s_{01}}}\right) \bigg\},
\end{aligned} \tag{15}$$
where $e$ is a symbol error and $i \in \{11, 10, 00, 01\}$. When
Gray mapping is used, adjacent states differ in only 1
bit. Hence, the corresponding bit error probability (BEP) can
be approximated as $P_b \approx 0.5 P_s$.
Fig. 4. Defining the non-uniform read thresholds $b_1^*, b_2^*, \ldots, b_6^*$ based on
$a_1^*, a_2^*, a_3^*$.
The SEP in (15) is a function of {a1, a2, a3}, which is
a continuous function and is locally quasi-convex within
the range of our interest. Hence, we can find the optimum
read thresholds, and hence the minimum SEP by using the
Newton-Raphson method or other quasi-convex optimization
techniques [36]. The derived minimum SEP can serve as
the lower bound to evaluate the proposed detectors, whose
performance will be illustrated in Section V.
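The closed-form SEP in (15) is straightforward to evaluate numerically. The sketch below assumes the per-state means and standard deviations are supplied in the order $s_{11}, s_{10}, s_{00}, s_{01}$; the names `q_func` and `symbol_error_prob` are hypothetical, and the Gaussian tail function is expressed through the standard library's complementary error function.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def symbol_error_prob(a, mu, sigma):
    """Evaluate (15) for thresholds a = (a1, a2, a3).

    mu and sigma list the four states in the order s11, s10, s00, s01.
    """
    a1, a2, a3 = a
    total = 3.0
    total += q_func((a1 - mu[0]) / sigma[0]) - q_func((a1 - mu[1]) / sigma[1])
    total += q_func((a2 - mu[1]) / sigma[1]) - q_func((a2 - mu[2]) / sigma[2])
    total += q_func((a3 - mu[2]) / sigma[2]) - q_func((a3 - mu[3]) / sigma[3])
    return total / 4.0


p_s = symbol_error_prob((5.0, 15.0, 25.0), [0.0, 10.0, 20.0, 30.0], [1.0] * 4)
p_b = 0.5 * p_s  # Gray-mapping approximation of the bit error probability
```

Such a function could serve as the objective of a one-dimensional or grid-based minimizer when the channel statistics are known; the choice of optimizer is left open here.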
B. Read Thresholds Design for Generating Soft Channel Out-
puts
The 3-level hard-decision read thresholds derived above can
be used to differentiate four types of symbols stored in the
MLC flash memory cell. In order to perform SDD, more
read thresholds are needed. In this subsection, we first show
how to design more read thresholds to support SDD by using
the 3-level hard-decision read thresholds designed earlier. We
use the 6-level read thresholds as an example to generate the
soft channel outputs, although the proposed thresholds design
method can be generalized to a larger number of read thresholds.
We then propose integer-based reliability mappings based on
the designed read thresholds, which can generate LLRs of
the flash memory channel without any prior knowledge of the
channel. Moreover, to optimize the read thresholds in terms of
the decoding performance, we propose to apply DE combined
with differential evolution algorithm for LDPC coded MLC
flash memory channels.
1) Soft-Decision Read Thresholds Design: As illustrated by
Fig. 4, in flash memories, the dominant overlapping regions
of two adjacent distribution functions are around the hard-
decision read thresholds, where more errors will occur. Hence,
it is natural to sense this region with a higher precision and
the remaining region with a lower precision. Therefore, we
propose to adopt non-uniform read thresholds $\{b_1^*, b_2^*, \ldots, b_6^*\}$
based on the hard-decision read thresholds $\{a_1^*, a_2^*, a_3^*\}$ we
derived earlier. In particular, to find $\{b_1^*, b_2^*, \ldots, b_6^*\}$, we first
define three regions $R_1, R_2, R_3$, whose centers are $a_1^*, a_2^*, a_3^*$,
respectively. Assume the widths of $R_1, R_2, R_3$ are $W_1$, $W_2$,
$W_3$, respectively. Then, we propose to obtain the boundaries
$\{b_1^*, b_2^*, \ldots, b_6^*\}$ by
$$\begin{cases}
b_{2i-1}^* = a_i^* - W_i/2, \\
b_{2i}^* = a_i^* + W_i/2,
\end{cases} \tag{16}$$
where $a_i^*$ is the center of each interval $[b_{2i-1}^*, b_{2i}^*]$ with
$i = 1, 2, 3$. In this way, by using (16), the optimization of
$\{b_1^*, b_2^*, \ldots, b_6^*\}$ is converted to the optimization of
$\{W_1, W_2, W_3\}$, which is a much easier task. The brute-force
way to optimize $W_1$, $W_2$ and $W_3$ is an exhaustive computer
search through Monte-Carlo simulations so as to minimize the
error rate of the system. However, for specific
ECCs, such as the LDPC code, we can use theoretical analysis
to replace the error rate simulations for optimizing the values
of $W_1$, $W_2$ and $W_3$. The corresponding details are presented
in Section IV.B.3).
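Equation (16) amounts to placing a window of width $W_i$ around each hard-decision threshold. A minimal Python sketch follows; the function name `soft_thresholds` and the numerical values are illustrative only, and the overlap check is an added safeguard that the text does not mention.

```python
def soft_thresholds(a_star, widths):
    """Return [b1*, ..., b6*] from hard thresholds and region widths, as in (16)."""
    b = []
    for a, w in zip(a_star, widths):
        b.extend([a - w / 2, a + w / 2])
    # Added safeguard: neighbouring regions must not overlap.
    if any(lo >= hi for lo, hi in zip(b, b[1:])):
        raise ValueError("regions overlap; reduce the widths")
    return b


b_star = soft_thresholds([2.0, 3.0, 4.0], [0.5, 0.5, 1.0])
```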
2) Integer-based Reliability Mappings: With the designed
read thresholds, the next step is to derive LLRs for SDD.
First, as a reference, we show how to obtain the LLR with
the designed read thresholds $\{b_1^*, b_2^*, \ldots, b_6^*\}$, by using the
full knowledge of the channel. Each memory cell has four
possible voltage states and stores two bits; we refer to the left
bit as the most significant bit (MSB) and the right bit as the
least significant bit (LSB). Define $T_j = [b_j, b_{j+1})$ as the
$j$-th quantization interval, $j = 0, 1, \ldots, 6$, with $b_0 = -\infty$ and
$b_7 = \infty$. Then, for a given threshold voltage $v \in T_j$ and
assuming the channel PDF of (7) is known, the LLRs of the
MSB and the LSB are given by
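$$L_{\mathrm{MSB}} = \ln \frac{\Pr(v \in T_j \mid \mathrm{MSB} = 0)}{\Pr(v \in T_j \mid \mathrm{MSB} = 1)} = \ln \frac{\int_{T_j} \{p_{s_{00}}(v) + p_{s_{01}}(v)\}\,dv}{\int_{T_j} \{p_{s_{10}}(v) + p_{s_{11}}(v)\}\,dv} \tag{17}$$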
TABLE II
THE PROPOSED MAPPINGS BETWEEN THE QUANTIZATION LEVEL THAT THE CHANNEL READBACK SIGNAL BELONGS TO AND THE INTEGER-BASED
RELIABILITY
Interval              $[b_0^*, b_1^*)$  $[b_1^*, b_2^*)$  $[b_2^*, b_3^*)$  $[b_3^*, b_4^*)$  $[b_4^*, b_5^*)$  $[b_5^*, b_6^*)$  $[b_6^*, b_7^*)$
$\hat{L}_{\mathrm{MSB}}$    −3    −2    −1    0    1    2    3
$\hat{L}_{\mathrm{LSB}}$    −1    0    1    2    1    0    −1
and
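\[
L_{\mathrm{LSB}} = \ln \frac{\Pr(v \in T_j \mid \mathrm{LSB} = 0)}{\Pr(v \in T_j \mid \mathrm{LSB} = 1)}
= \ln \frac{\int_{T_j} \{p_{s_{00}}(v) + p_{s_{10}}(v)\}\,dv}{\int_{T_j} \{p_{s_{01}}(v) + p_{s_{11}}(v)\}\,dv}, \tag{18}
\]
where
\[
\int_{T_j} p_{s_{10} \sim s_{01}}(v)\,dv
= Q\!\left(\frac{b_j - \mu_{s_{10} \sim s_{01}}}{\sigma_{s_{10} \sim s_{01}}}\right)
- Q\!\left(\frac{b_{j+1} - \mu_{s_{10} \sim s_{01}}}{\sigma_{s_{10} \sim s_{01}}}\right). \tag{19}
\]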
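For reference, the full-knowledge LLRs of (17) and (18) can be sketched as follows. This is only an illustration under stated assumptions: all four states are modeled as Gaussian with equal prior probability (the actual channel PDF of (7) is not reproduced here), the state means and standard deviations are made-up numbers, and the function names are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def interval_prob(lo, hi, mu, sigma):
    """P(lo <= v < hi) for v ~ N(mu, sigma^2), in the form of (19)."""
    return q_func((lo - mu) / sigma) - q_func((hi - mu) / sigma)


def full_knowledge_llrs(bounds, states):
    """Return a list of (L_MSB, L_LSB), one pair per quantization interval.

    bounds: the finite read thresholds b1..b6 (sorted).
    states: dict mapping the two-bit label 'MSB LSB' (e.g. '10') to (mu, sigma).
    """
    edges = [-math.inf] + list(bounds) + [math.inf]
    tiny = 1e-300  # floor that avoids log(0) for intervals far in the tails
    out = []
    for lo, hi in zip(edges, edges[1:]):
        p = {s: interval_prob(lo, hi, mu, sg) for s, (mu, sg) in states.items()}
        l_msb = math.log(max(p["00"] + p["01"], tiny) / max(p["10"] + p["11"], tiny))
        l_lsb = math.log(max(p["00"] + p["10"], tiny) / max(p["01"] + p["11"], tiny))
        out.append((l_msb, l_lsb))
    return out


# Made-up Gaussian states, ordered 11, 10, 00, 01 from low to high voltage.
states = {"11": (1.0, 0.35), "10": (2.0, 0.25), "00": (3.0, 0.25), "01": (4.0, 0.25)}
bounds = [1.4, 1.6, 2.4, 2.6, 3.4, 3.6]
for j, (l_msb, l_lsb) in enumerate(full_knowledge_llrs(bounds, states)):
    print(j, round(l_msb, 2), round(l_lsb, 2))
```

The sign pattern of these LLRs across the seven intervals is what the integer-based mappings introduced next try to mimic without knowing the state PDFs.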
However, in practical flash memories, due to the various
noises/interference and especially the data retention noise, the
accurate PDF of each voltage state is not available to the
channel detector. Hence, accurate calculation of the LLRs
given by (17) and (18) is not possible. Nevertheless, it has
been shown that the reliability of a received symbol can be
measured by its magnitude, and that this measure can be quantized
to integers for the additive white Gaussian noise (AWGN)
channel [37] and the single-level-cell NVM channel [38]. In
this paper, we propose integer-based reliability mappings for
SDD of ECCs over the MLC flash memory channel.
In particular, as shown by Fig. 4, in the dominant overlapped
region R2, we are the least confident about whether the MSB
is a ‘0’ or a ‘1’. Hence, if the readback signal v ∈ R2, set
LˆMSB = 0. Similarly, if v ∈ R1 or v ∈ R3, set LˆLSB = 0. If
v is farther away from these dominant overlapped regions, we
are more confident about whether the MSB is a ‘0’ or a ‘1’.
A similar trend holds for the LSB. Based on these observations,
it is natural to give a mapping between the quantization level
that the channel readback signal belongs to and the integer-
based reliability LˆMSB (LˆLSB), as shown by Table II. Note that
the mappings we propose in Table II may not be optimal,
and better mappings could be found to further enhance the
performance.
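The lookup implied by Table II can be sketched in a few lines. The integer values are taken from Table II; the function name and the example thresholds are hypothetical.

```python
from bisect import bisect_right

# Integer reliabilities of Table II, indexed by quantization interval j = 0..6.
L_MSB = (-3, -2, -1, 0, 1, 2, 3)
L_LSB = (-1, 0, 1, 2, 1, 0, -1)


def integer_reliability(v, bounds):
    """Map a readback voltage v to (L_MSB, L_LSB) using intervals [b_j, b_{j+1})."""
    if len(bounds) != 6 or list(bounds) != sorted(bounds):
        raise ValueError("expected six sorted read thresholds b1..b6")
    # bisect_right counts the thresholds that are <= v, which is exactly the
    # index j of the half-open interval [b_j, b_{j+1}) containing v.
    j = bisect_right(bounds, v)
    return L_MSB[j], L_LSB[j]


# Made-up thresholds in volts.
bounds = [1.4, 1.6, 2.4, 2.6, 3.4, 3.6]
for v in (1.0, 1.5, 2.5, 3.0, 3.9):
    print(v, integer_reliability(v, bounds))
```

Only comparisons against the six thresholds are needed, so no knowledge of the state PDFs enters this step.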
3) LDPC Code-Specific Optimization of the Read Thresholds:
With the above proposed mappings, the SDD of ECCs
can then be performed. In this work, we consider the LDPC
codes, which have already been widely applied to MLC flash
memory channels [1]. Hence the above proposed integer-
based reliability measure can be fed into the decoder of the
LDPC codes. In this work, the normalized min-sum (NMS)
decoding algorithm is employed since it can closely approach
the performance of the sum-product algorithm (SPA) with a
much lower computational complexity.
In this subsection, to optimize the LDPC-coded performance
over the MLC flash memory channel, W1, W2 and W3 are
jointly optimized for the LDPC codes. To evaluate the decod-
ing performance, the DE analysis can be employed to derive
the decoding threshold of LDPC ensembles. The ensemble of
LDPC codes can be characterized by the degree distributions.
A regular LDPC ensemble is defined by (dv, dc), where dv and
dc are the number of edges connected to each variable node
and check node, respectively. An irregular LDPC ensemble
has non-uniform variable node and check node degrees, hence
it can be defined by edge degree distributions λ(x) and ρ(x),
given by
\[
\lambda(x) = \sum_{j \ge 2} \lambda_j x^{j-1}, \qquad
\rho(x) = \sum_{i \ge 2} \rho_i x^{i-1}, \tag{20}
\]
where λj and ρi are the fractions of edges that are connected
to variable and check nodes with degree j and i, respectively.
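As a small worked example of (20), the standard design-rate formula 1 − (Σ ρi/i)/(Σ λj/j) can be evaluated directly from the edge degree distributions. The sketch below uses a hypothetical helper name; the regular (5, 69) ensemble is the one employed later in the paper, while the irregular pair is made up for illustration.

```python
def design_rate(lam, rho):
    """Design rate of an LDPC ensemble from edge-perspective degree distributions.

    lam, rho: dicts mapping a node degree to the fraction of edges attached to
    variable nodes (lam) or check nodes (rho) of that degree.
    """
    for name, dist in (("lambda", lam), ("rho", rho)):
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            raise ValueError(name + " edge fractions must sum to 1")
    # Integral of lambda(x) over [0, 1] is sum_j lambda_j / j (same for rho).
    int_lam = sum(frac / deg for deg, frac in lam.items())
    int_rho = sum(frac / deg for deg, frac in rho.items())
    return 1.0 - int_rho / int_lam


# Regular (dv, dc) = (5, 69): lambda(x) = x^4 and rho(x) = x^68.
print(design_rate({5: 1.0}, {69: 1.0}))          # equals 1 - 5/69

# A made-up irregular ensemble.
print(design_rate({2: 0.5, 3: 0.5}, {6: 1.0}))
```

For a regular ensemble this reduces to 1 − dv/dc, so the (5, 69) ensemble has a design rate of about 0.9275.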
Since the MLC flash memory channel is asymmetric, to
enable DE, the channel symmetrizing method proposed in [39]
is employed. First, for a given set of widths W1, W2, W3 of
R1, R2, R3, the integer-based reliability LˆMSB (LˆLSB) of the
readback signal v can be obtained based on the mappings given
by Table II. We then define f0(Lˆ) and f1(Lˆ) as the PDFs of Lˆ
corresponding to the originally stored bit of c = 0 and c = 1,
respectively. Note that f0(Lˆ) and f1(Lˆ) can be obtained using
a histogram approach.
According to the channel symmetrizing method [39], we can
flip all the signs of LLRs with x = 0. Due to the symmetry of
the processing rules, the signs of messages that enter or exit the
variable nodes are also flipped. Hence, the DE for a particular
codeword is equivalent to that for the all-one codeword. The
L-density after channel symmetrizing is given by
\[
f_s(\hat{L}) = \frac{1}{2}\left( f_0(-\hat{L}) + f_1(\hat{L}) \right). \tag{21}
\]
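Because the integer reliabilities take only a few values, the histogram step and (21) reduce to simple counting. The sketch below follows (21) exactly as written; the sample lists are made up, and the helper names are hypothetical.

```python
from collections import Counter


def empirical_pmf(samples):
    """Histogram estimate of the PMF of integer reliabilities."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    return {k: c / n for k, c in Counter(samples).items()}


def symmetrize(f0, f1):
    """L-density after channel symmetrizing: f_s(L) = 0.5 * (f0(-L) + f1(L))."""
    support = {-k for k in f0} | set(f1)
    return {k: 0.5 * (f0.get(-k, 0.0) + f1.get(k, 0.0)) for k in sorted(support)}


# Made-up integer reliabilities observed for stored bits c = 0 and c = 1.
l_hat_given_0 = [3, 2, 2, 1, 0, -1, 3, 2]
l_hat_given_1 = [-3, -2, -3, -1, 0, 1, -2, -3]
f0 = empirical_pmf(l_hat_given_0)
f1 = empirical_pmf(l_hat_given_1)
print(symmetrize(f0, f1))
```

The result is a single PMF that can serve as the initial message density of a density evolution run.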
Then, the L-density fs(Lˆ) can be used to initialize the DE
algorithm. The DE of the NMS algorithm is an iterative
algorithm which consists of the evolution of the L-densities
for the check node and variable node updates. The original DE
of the NMS algorithm is presented in [40]. In this work, we
employ the discrete DE (DDE) [41] to reduce the complexity
of DE by quantizing all input and output messages during
decoding.

[Fig. 5. The MI with the MMI quantizer and the RNNA quantizer over different P/E cycles at T = 10^4 hours. Horizontal axis: P/E cycles (×10^4); vertical axis: mutual information (bits/cell). Curves: MMI and RNNA, each with 3-level and 6-level read thresholds.]

Let u(l) be the message from a degree-dc check
node to a variable node at the l-th iteration, and v(l) be
the output message of a degree-dv variable node at the l-th
iteration. Under the NMS algorithm with normalization factor
α, the check node update and variable node update are given
by
\[
\begin{aligned}
u^{(l)} &= \prod_{j=1}^{d_c-1} \operatorname{sign}\!\left(v_j^{(l-1)}\right) \cdot \alpha \cdot \min_{j \in \{1,2,\ldots,d_c-1\}} \left|v_j^{(l-1)}\right|, \\
v^{(l)} &= v^{(0)} + \sum_{i=1}^{d_v-1} u_i^{(l)},
\end{aligned} \tag{22}
\]
respectively, where $v_j^{(l-1)}$, j = 1, 2, . . . , dc − 1, are the
incoming messages from the neighbors of a degree-dc check node,
and $u_i^{(l)}$, i = 1, 2, . . . , dv − 1, are the incoming messages
from the neighbors of a degree-dv variable node at the l-th
iteration. The initial message of the algorithm is
$v^{(0)} = \hat{L}$. According to the quantized NMS
algorithm, u(l) and v(l) are quantized to u¯(l) and v¯(l). We
denote the probability mass functions (PMFs) of u¯(l) and v¯(l)
by $P_{\bar{u}}^{(l)}$ and $P_{\bar{v}}^{(l)}$, respectively; they
can be calculated using the DDE
approach given in [41]. According to the DDE algorithm, the
fraction of incorrect messages for the l-th iteration is given by
\[
P_e^{(l)} = \sum_{k<0} P_{\bar{v}}^{(l)}[k]. \tag{23}
\]
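The node updates in (22) and the error fraction in (23) can be illustrated for a single node with plain numeric messages. This is a sketch only: the quantization of messages and the PMF propagation of DDE [41] are not reproduced, and the function names and numbers are hypothetical.

```python
def check_node_update(incoming, alpha):
    """NMS check node output: product of signs times alpha times the minimum magnitude.

    incoming: the d_c - 1 messages from the other neighboring variable nodes.
    """
    if not incoming:
        raise ValueError("a check node needs at least one incoming message")
    sign = 1.0
    for v in incoming:
        if v < 0:
            sign = -sign
    return sign * alpha * min(abs(v) for v in incoming)


def variable_node_update(channel, incoming):
    """Variable node output: channel message plus the d_v - 1 incoming check messages."""
    return channel + sum(incoming)


def error_fraction(pmf):
    """Fraction of incorrect messages as in (23): total probability of negative values."""
    return sum(p for k, p in pmf.items() if k < 0)


u = check_node_update([2, -3, 1], alpha=0.5)
v = variable_node_update(1, [u, 2])
print(u, v)
print(error_fraction({-2: 0.1, -1: 0.2, 0: 0.3, 1: 0.4}))
```

In a density evolution run these two updates are applied to message distributions rather than to individual messages, and (23) is evaluated on the variable node output PMF after each iteration.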
As described earlier, the initial integer reliability Lˆ, and
hence the L-density, is determined by W1, W2, W3. Therefore,
the choice of W1, W2 and W3 directly affects $P_e^{(l)}$ given by
(23). Differential evolution is a global optimization algorithm
that does not rely on any assumption about the problem [42].
In this work, we apply the differential evolution algorithm
to find a set of W1, W2 and W3 that optimizes the
decoding performance. The cost function is $P_e^{(l)}$, and hence
the optimized {W∗1, W∗2, W∗3} are given by
\[
\{W_1^*, W_2^*, W_3^*\} = \operatorname*{arg\,min}_{\{W_1, W_2, W_3\}} P_e^{(l)}, \tag{24}
\]
with given channel parameters NPE and T, and a given number of
iterations (e.g., l = 10). The details of the differential evolution
optimization process are presented in [42].
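To make the search in (24) concrete, the following is a minimal differential evolution loop (the common rand/1/bin variant) written with the standard library only. It is a sketch under loud assumptions: the cost function is a made-up smooth bowl standing in for $P_e^{(l)}(W_1, W_2, W_3)$, which in the actual design would come from a density evolution run, and the population size, scale factor, crossover rate, and bounds are illustrative choices rather than values from the paper.

```python
import random


def differential_evolution_minimize(cost, bounds, pop_size=20, generations=200,
                                    f_scale=0.6, cross_rate=0.9, seed=0):
    """Minimize cost(x) over the box `bounds` with rand/1/bin differential evolution."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cross_rate:
                    x = pop[a][d] + f_scale * (pop[b][d] - pop[c][d])
                    x = min(max(x, lo), hi)  # clip to the search box
                else:
                    x = pop[i][d]
                trial.append(x)
            s = cost(trial)
            if s <= scores[i]:  # greedy selection keeps the better vector
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_cost(w):
    """Hypothetical stand-in for P_e^(l)(W1, W2, W3); minimum at (0.20, 0.10, 0.15)."""
    target = (0.20, 0.10, 0.15)
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


best_w, best_cost = differential_evolution_minimize(toy_cost, [(0.0, 0.5)] * 3)
print([round(w, 3) for w in best_w], best_cost)
```

Since the method only needs cost evaluations, the toy cost can be replaced by any routine that returns the error fraction of (23) for a given triple of widths.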
After deriving the optimized W∗1, W∗2, W∗3, the corresponding
read thresholds {b∗1, b∗2, . . . , b∗6} can be obtained according
to (16). We remark that the RNNA dynamic read thresholds
design method described above remains valid when extending
this work to TLC or QLC flash memories. The only difference
lies in the design of the mappings between the integer-based
reliability and the quantization level that the channel readback
signal belongs to, since more read thresholds are required for
TLC and QLC flash memories.

[Fig. 6. The test set BER using correct and error-corrupted labels, respectively, at NPE = 1.1 × 10^4 and T = 10^4 hours. Horizontal axis: BER of labels; vertical axis: test set BER. Curves: training with error-corrupted labels; training with correct labels.]

[Fig. 7. BER of the RNN detector and RNNA dynamic threshold detector at T = 10^4 hours. Horizontal axis: P/E cycles; vertical axis: BER. Curves: original threshold detector; RNN detector; RNNA dynamic threshold detector; optimum threshold detector.]
Fig. 5 compares the MI of the flash memory channel with
the MMI quantizer [7] and the RNNA quantizer proposed in
this subsection. It is observed that the MI with the proposed
RNNA quantizer is very close to that with the MMI quantizer
for both the 3-level and 6-level read thresholds. Note
that for the case with the MMI quantizer, it is assumed that the
threshold voltage distributions of the flash memory channel are
known, which is unrealistic, while for the case with the RNNA
quantizer, there is no prior knowledge of the channel at all,
which is consistent with practical flash memories.
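The MI of a quantized cell, in bits/cell, can be computed for any set of read thresholds once the state PDFs are fixed. The sketch below assumes four equiprobable Gaussian states with made-up parameters (not the paper's channel model) and a hypothetical function name; it only illustrates how 3-level and 6-level threshold sets are compared.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mutual_information(thresholds, states):
    """I(X; Y) in bits/cell, where X is the stored state (equiprobable) and
    Y is the index of the quantization interval that the readback voltage falls in.

    thresholds: finite read thresholds; states: list of (mu, sigma) per state.
    """
    edges = [-math.inf] + sorted(thresholds) + [math.inf]
    n = len(states)
    mi = 0.0
    for lo, hi in zip(edges, edges[1:]):
        cond = [q_func((lo - mu) / sg) - q_func((hi - mu) / sg) for mu, sg in states]
        p_y = sum(cond) / n
        for p in cond:
            if p > 0.0:
                mi += (p / n) * math.log2(p / p_y)
    return mi


# Made-up states (volts) and threshold sets.
states = [(1.0, 0.35), (2.0, 0.25), (3.0, 0.25), (4.0, 0.25)]
print(mutual_information([1.5, 2.5, 3.5], states))                    # 3-level
print(mutual_information([1.4, 1.6, 2.4, 2.6, 3.4, 3.6], states))    # 6-level
```

With four states the MI cannot exceed 2 bits/cell, which matches the scale of the vertical axis in Fig. 5.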
V. PERFORMANCE EVALUATIONS
In this work, the implementation and training of the RNN
are performed using the machine learning library Keras [43],
with TensorFlow [44] as the back-end. As described in Section
IV.A, to reduce the read latency and power consumption, the
RNN is trained periodically when the system is in the idle
state. However, there might be a mismatch of P/E cycles
and/or retention time between the training set and test
set. To measure the performance of the RNN detector and
RNNA dynamic threshold detector in the presence of training
and testing mismatch, we define the mismatch of P/E cycles
as $\triangle N_{PE} = N_{PE}^{test} - N_{PE}^{train}$, where
$N_{PE}^{train}$ and $N_{PE}^{test}$ are the
numbers of P/E cycles during training and testing, respectively.
Similarly, we define the mismatch of the retention time
as $\triangle T = T^{test} - T^{train}$. In the following subsections, the
uncoded/raw channel BER and LDPC-coded BER for MLC
flash memory channels with our proposed RNNA quantizers
are investigated.

[Fig. 8. BER comparison of different detectors. The RNNA dynamic threshold detector is trained with P/E cycle mismatch (△NPE = 1000, 2000, 3000), at T^test = 10^4 hours. Horizontal axis: P/E cycles (×10^4); vertical axis: BER. Curves: original threshold detector; RNNA dynamic threshold detector trained without mismatch and with △NPE = 1000, 2000, 3000 (△T = 0); optimum threshold detector.]

[Fig. 9. BER comparison of different detectors. The RNNA dynamic threshold detector is trained with both P/E cycle and retention time mismatch, with N^test_PE = 10^4. Horizontal axis: retention time; vertical axis: BER. Curves: original threshold detector; RNNA dynamic threshold detector trained without mismatch and with △NPE = 1000 and △T = 1 month, 3 months, 6 months; optimal threshold detector.]
A. Uncoded BER Performance
First, to evaluate the effect of error-corrupted labels on the
performance of the RNN detector, we consider two cases:
training with correct labels and training with error-corrupted
labels. As shown by Fig. 6, even with an LDPC decoding error
rate as high as 5 × 10^−3, the RNN detector can still approach the
performance of the RNN detector trained with correct labels.
This indicates that the RNN detector is robust to erroneous
labels.

[Fig. 10. BERs of LDPC codes with the RNNA quantizer, trained with different P/E cycle mismatches (△NPE = 1000, 2000, 3000), at T^test = 1 × 10^4 hours. Horizontal axis: P/E cycles; vertical axis: BER. Curves: RNNA quantizer trained without mismatch and with △NPE = 1000, 2000, 3000 (△T = 0); MMI quantizer with original thresholds; MMI quantizer with full channel knowledge.]
Fig. 7 shows the BER performance of the MLC flash memory
channel using the RNN detector and the RNNA dynamic threshold
detector with thresholds obtained using (11). The optimum
threshold detector we derived in Section IV.A is also included
as a reference. Observe that the BER performance with the
RNN detector is much better than that with the detector using
the original thresholds (for NPE = 0, T = 0), and can closely
approach the BERs of the optimum threshold detector. Furthermore,
the proposed RNNA dynamic threshold detector even
slightly outperforms the RNN detector and almost achieves
the performance of the optimum detector. Therefore, in the
following simulations, we only employ the RNNA dynamic
threshold detector, instead of the RNN detector.
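The gap between the original and the adjusted thresholds can be reproduced qualitatively with a small closed-form calculation. The sketch below assumes four equiprobable Gaussian states with made-up parameters and a Gray label order 11, 10, 00, 01 from low to high voltage; neither the numbers nor the function name come from the paper.

```python
import math

# Assumed Gray labels (MSB, LSB) from the lowest to the highest voltage state.
GRAY_LABELS = ("11", "10", "00", "01")


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def uncoded_ber(thresholds, states):
    """Average bit error rate of a hard-decision detector with three read thresholds.

    thresholds: three sorted read thresholds.
    states: list of (mu, sigma) in the same order as GRAY_LABELS.
    """
    edges = [-math.inf] + list(thresholds) + [math.inf]
    total = 0.0
    for stored, (mu, sigma) in zip(GRAY_LABELS, states):
        for detected, lo, hi in zip(GRAY_LABELS, edges, edges[1:]):
            # Probability that a cell storing `stored` is read as `detected`.
            p = q_func((lo - mu) / sigma) - q_func((hi - mu) / sigma)
            wrong_bits = sum(a != b for a, b in zip(stored, detected))
            total += p * wrong_bits
    return total / (2 * len(GRAY_LABELS))  # two bits per cell, equiprobable states


# Made-up "aged" states whose programmed levels have drifted downward.
aged = [(0.0, 0.30), (0.8, 0.35), (1.7, 0.35), (2.6, 0.35)]
print(uncoded_ber([0.5, 1.5, 2.5], aged))     # thresholds tuned for a fresh cell
print(uncoded_ber([0.4, 1.25, 2.15], aged))   # thresholds re-centered between states
```

Moving the thresholds back between the drifted state means is what a dynamic threshold detector aims to do, here without requiring the state parameters to be known in practice.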
Fig. 8 depicts the BERs of the RNNA dynamic threshold
detector, where the RNN is trained with different amounts of
mismatch of P/E cycles between the training set and test set.
Observe that when the mismatch △NPE is less than 1000, the
BER performance of the RNNA dynamic threshold detector
can still closely approach the optimum performance. As the
mismatch further increases, the performance gap between the
RNNA dynamic threshold detector and the optimum threshold
detector becomes larger. In particular, this gap becomes significant
in the low-BER region when △NPE = 3000. This indicates
that we only need to activate the RNN detector, as well as
the search for the adjusted read thresholds, when the P/E cycle
mismatch is greater than 3000.
To investigate the performance of the RNNA dynamic
threshold detector in the presence of different retention time
mismatch, we fix the mismatch of P/E cycles at 1000 and
vary the mismatch of retention time. As illustrated by Fig.
9, when the retention time mismatch is 1 month, the RNNA
dynamic threshold detector approaches the performance of
the optimum detector. When the retention time mismatch
increases to 3 months, the RNNA dynamic threshold detector
only shows slight performance degradation. However, as the
mismatch further increases to 6 months, a severe error floor
occurs at T = 5 × 10^3 hours. This indicates that we only
need to activate the RNN detector, as well as the search for the
adjusted read thresholds, every 3 months to maintain
near-optimum BER performance.

[Fig. 11. BERs of LDPC codes with the RNNA quantizer, trained with both P/E cycle and retention time mismatch (△NPE = 1000, △T = 1 week, 1 month and 3 months), at N^test_PE = 1.1 × 10^4. Horizontal axis: retention time (hours, ×10^4); vertical axis: BER. Curves: RNNA quantizer trained without mismatch and with △NPE = 1000 and △T = 1 week, 1 month, 3 months; MMI quantizer with original thresholds; MMI quantizer with full channel knowledge.]
B. LDPC-coded BER Performance
In this subsection, we investigate the LDPC-coded perfor-
mance using our proposed RNNA quantizer and the MMI
quantizer. For cases with the MMI quantizer, we make the
unrealistic assumption that the channel PDFs are known, and
use the corresponding BERs as the performance benchmark.
In the practical flash memories with the RNNA quantizer,
the channel is unknown and the LLR is calculated using our
proposed integer-based reliability mappings given in Table II.
A rate-0.93 regular (5, 69) LDPC code is employed, with
codeword length N = 8832 bits and information length
K = 8196 bits. The LDPC code is constructed
by the progressive edge-growth (PEG) algorithm and decoded
using the NMS decoding algorithm with normalization
factor α = 0.5. The maximum number of decoding iterations
is 10.
Fig. 10 shows the BER comparison of LDPC codes with the
RNNA quantizer, trained with different P/E cycle mismatches,
at T^test = 1 × 10^4 hours. We first observe that the
BER performance using the MMI quantizer with the original
thresholds (for NPE = 0, T = 0) is much worse than that
using the other quantizers. This implies that it is essential to
update the read thresholds to maintain the error correction
capability. When training without mismatch, the LDPC-coded
performance with the RNNA quantizer approaches that
with the MMI quantizer designed using the full channel
knowledge. For cases with P/E cycle mismatch, the BERs
with the RNNA quantizer are close to those with the MMI
quantizer, up to a P/E cycle mismatch of △NPE = 2000.
This demonstrates the robustness of the LDPC-coded system
with the RNNA quantizer with respect to the training P/E
cycle mismatch. However, when the mismatch continues to
increase, more performance degradation occurs. In particular,
when △NPE increases to 3000, the LDPC-coded performance
is about 1000 P/E cycles worse than that without training
mismatch. In addition, it is also observed that the LDPC-coded
performance shown by Fig. 10 is consistent with the threshold
detectors’ performance illustrated by Fig. 8.
Next, we further vary the retention time and consider the
case where mismatches in both the P/E cycles and the retention
time exist between the training set and test set in the LDPC-coded
system. Observe from Fig. 11 that the BER performance of
LDPC codes with the RNNA quantizer can also approach
that with the MMI quantizer over a wide range of retention
time. For the RNNA quantizer, with a P/E cycle mismatch of
△NPE = 1000, the LDPC code’s performance degradation is
small. However, when the retention time mismatch increases
to 3 months, the BER performance is significantly degraded
when the retention time is less than 5000 hours. Again, we
observe that the LDPC coded performance shown by Fig. 11 is
consistent with the threshold detectors’ performance illustrated
by Fig. 9.
VI. CONCLUSIONS
We have tackled the various non-stationary
noises and the unknown offset of NAND flash memory
channels by using DL techniques. In particular, we
have first proposed a novel RNN-based detector, which can
effectively detect the data symbols stored in the memory
cell without any prior knowledge of the channel. However,
compared with the conventional threshold detector, the RNN
detector will lead to much longer read latency and more power
consumption. To tackle this problem, we have proposed an
RNNA dynamic threshold detector, whose detection thresholds
can be derived based on the outputs of the RNN detector.
In this way, we only need to activate the RNN detector
periodically when the system is in the idle state. To enable
SDD of ECCs, more read thresholds are required to generate
the LLRs of the channel. In this work, we have first shown how
to obtain more read thresholds based on the hard-decision read
thresholds derived from the RNN detector. We then proposed
integer-based reliability mappings based on the designed read
thresholds, which can generate LLRs of the flash memory
channel. Finally, to optimize the read thresholds in terms of
the decoding performance, we have proposed to apply DE
combined with differential evolution algorithm for the LDPC
coded flash memory channels. Simulation results have shown
that the BER performance of our proposed RNNA dynamic
threshold detector can almost achieve that of the optimum
threshold detector designed with the full knowledge of the
channel. The BERs of the LDPC-coded system with the proposed
RNNA dynamic read thresholds can closely approach
those obtained with the MMI quantizer, which requires full knowledge
of the channel. We have also found that the proposed RNNA
dynamic read thresholds design is robust to the training set
and test set mismatch. Moreover, it only needs to be activated
every few months. Hence it shows a high potential for practical
applications.
REFERENCES
[1] Y. Cai, S. Ghose, E. F. Haratsch, Y. Luo, and O. Mutlu, “Error
characterization, mitigation, and recovery in flash-memory-
based solid-state drives,” Proceedings of the IEEE, vol. 105,
no. 9, pp. 1666–1704, 2017.
[2] Y. Cai, E. F. Haratsch, O. Mutlu, and K. Mai, “Error patterns
in MLC NAND flash memory: Measurement, characterization, and
analysis,” in Proc. DATE, 2012.
[3] Y. Lee, H. Yoo, I. Yoo, and I.-C. Park, “6.4 Gb/s multi-threaded
BCH encoder and decoder for multi-channel SSD controllers,” in
Proc. ISSCC, 2012.
[4] ——, “High-throughput and low-complexity BCH decoding
architecture for solid-state drives,” IEEE Trans. Very Large
Scale Integr. (VLSI) Syst., vol. 22, no. 5, pp. 1183–1187, 2014.
[5] K. Zhao, W. Zhao, H. Sun, X. Zhang, N. Zheng, and T. Zhang,
“LDPC-in-SSD: Making advanced error correction codes work
effectively in solid state drives,” in Proc. FAST, 2013.
[6] G. Dong, N. Xie, and T. Zhang, “On the use of soft-decision
error-correction codes in NAND flash memory,” IEEE Trans.
Circuits Syst. I, Reg. Papers, vol. 58, no. 2, pp. 429–439, 2011.
[7] J. Wang, K. Vakilinia, T.-Y. Chen, T. Courtade, G. Dong,
T. Zhang, H. Shankar, and R. Wesel, “Enhanced precision
through multiple reads for LDPC decoding in flash memories,”
IEEE J. Sel. Areas Commun., vol. 32, no. 5, pp. 880–891, May
2014.
[8] B. Peleato, R. Agarwal, J. M. Cioffi, M. Qin, and P. H.
Siegel, “Adaptive read thresholds for NAND flash,” IEEE Trans.
Commun., vol. 63, no. 9, pp. 3069–3081, 2015.
[9] C. A. Aslam, Y. L. Guan, and K. Cai, “Read and write
voltage signal optimization for multi-level-cell (MLC) NAND flash
memory,” IEEE Trans. Commun., vol. 64, no. 4, pp. 1613–1623,
2016.
[10] Z. Mei, K. Cai, and L. Shi, “Information theoretic bounds based
channel quantization design for emerging memories,” in Proc.
IEEE ITW, Nov. 2018.
[11] Y. Cai, Y. Luo, E. F. Haratsch, K. Mai, and O. Mutlu, “Data
retention in MLC NAND flash memory: Characterization, op-
timization, and recovery,” in Proc. IEEE HPCA, 2015.
[12] A. Jiang, R. Mateescu, M. Schwartz, and J. Bruck, “Rank mod-
ulation for flash memories,” IEEE Trans. Inf. Theory, vol. 55,
no. 6, pp. 2659–2673, 2009.
[13] L. G. Tallini, R. M. Capocelli, and B. Bose, “Design of some
new efficient balanced codes,” IEEE Trans. Inf. Theory, vol. 42,
no. 3, pp. 790–802, 1996.
[14] K. A. S. Immink and K. Cai, “Composition check codes,” IEEE
Trans. Inf. Theory, vol. 64, no. 1, pp. 249–256, 2017.
[15] H. Zhou, A. Jiang, and J. Bruck, “Error-correcting schemes with
dynamic thresholds in nonvolatile memories,” in Proc. IEEE
ISIT, 2011.
[16] F. Sala, R. Gabrys, and L. Dolecek, “Dynamic threshold
schemes for multi-level non-volatile memories,” IEEE Trans.
Commun., vol. 61, no. 7, pp. 2624–2634, 2013.
[17] T. O’Shea and J. Hoydis, “An introduction to deep learning for
the physical layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3,
no. 4, pp. 563–575, 2017.
[18] T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On deep
learning-based channel decoding,” in Proc. IEEE CISS, Mar.
2017.
[19] G. Dong, N. Xie, and T. Zhang, “Enabling NAND flash memory
use soft-decision error correction codes at minimal read latency
overhead,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60,
no. 9, pp. 2412–2421, 2013.
[20] H. Wang, T.-Y. Chen, and R. D. Wesel, “Histogram-based flash
channel estimation,” in Proc. IEEE ICC, Jun. 2015.
[21] K. Takeuchi, T. Tanaka, and H. Nakamura, “A double-level-Vth
select gate array architecture for multilevel NAND flash
memories,” IEEE J. Solid-State Circuits, vol. 31, no. 4, pp. 602–
609, 1996.
[22] J.-D. Lee, S.-H. Hur, and J.-D. Choi, “Effects of floating-gate
interference on NAND flash memory cell operation,” IEEE
Electron Device Lett., vol. 23, no. 5, pp. 264–266, 2002.
[23] G. Dong, Y. Pan, and T. Zhang, “Using lifetime-aware progres-
sive programming to improve SLC NAND flash memory write
endurance,” IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 22, no. 6, pp. 1270–1280, 2014.
[24] C. M. Compagnoni, A. Spinelli, R. Gusmeroli, A. L. Lacaita,
S. Beltrami, A. Ghetti, and A. Visconti, “First evidence for
injection statistics accuracy limitations in NAND flash constant-
current Fowler-Nordheim programming,” in Proc. IEEE Int.
Electron Devices Meeting (IEDM), 2007.
[25] N. Mielke, H. Belgal, I. Kalastirsky, P. Kalavade, A. Kurtz,
Q. Meng, N. Righos, and J. Wu, “Flash EEPROM threshold in-
stabilities due to charge trapping during program/erase cycling,”
IEEE Trans. Device Mater. Rel., vol. 4, no. 3, pp. 335–344,
2004.
[26] C. M. Compagnoni, M. Ghidotti, A. L. Lacaita, A. S. Spinelli,
and A. Visconti, “Random telegraph noise effect on the pro-
grammed threshold-voltage distribution of flash memories,”
IEEE Electron Device Lett., vol. 30, no. 9, pp. 984–986, 2009.
[27] C. A. Aslam, Y. L. Guan, and K. Cai, “Decision-directed
retention-failure recovery with channel update for MLC NAND
flash memory,” IEEE Trans. Circuits Syst. I, Reg. Papers,
vol. 65, no. 1, pp. 353–365, 2018.
[28] T.-Y. Chen, A. R. Williamson, and R. D. Wesel, “Increasing
flash memory lifetime by dynamic voltage allocation for con-
stant mutual information,” in Proc. Inf. Theory Appl. Workshop,
Feb. 2014.
[29] G. Dong, Y. Pan, N. Xie, C. Varanasi, and T. Zhang, “Estimating
information-theoretical NAND flash memory storage capacity and
its implication to memory system design space exploration,”
IEEE Trans. VLSI Syst., vol. 20, no. 9, pp. 1705–1714, 2012.
[30] G. Dong, S. Li, and T. Zhang, “Using data postcompensation
and predistortion to tolerate cell-to-cell interference in MLC
NAND flash memory,” IEEE Trans. Circuits Syst. I, Reg. Papers,
vol. 57, no. 10, pp. 2718–2728, 2010.
[31] Y. Cai, E. F. Haratsch, O. Mutlu, and K. Mai, “Threshold
voltage distribution in MLC NAND flash memory: Characterization,
analysis, and modeling,” in Proc. DATE, Mar. 2013.
[32] D.-h. Lee and W. Sung, “Estimation of NAND flash memory
threshold voltage distribution for optimum soft-decision error
correction,” IEEE Trans. Signal Process., vol. 61, no. 2, pp.
440–449, 2013.
[33] I. Goodfellow, Y. Bengio, and A. Courville, Deep
Learning. MIT Press, 2016.
[34] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein,
Introduction to Algorithms, 3rd ed. The MIT
Press, 2009.
[35] X. He, K. Cai, W. Song, and Z. Mei, “Dynamic programming
for discrete memoryless channel quantization,” arXiv preprint
arXiv:1901.01659, 2019.
[36] S. Boyd and L. Vandenberghe, Convex Optimization.
Cambridge University Press, 2004.
[37] H. Chen, K. Zhang, X. Ma, and B. Bai, “Comparisons between
reliability-based iterative min-sum and majority-logic decoding
algorithms for LDPC codes,” IEEE Trans. Commun., vol. 59,
no. 7, pp. 1766–1771, 2011.
[38] K. Cai, Z. Qin, and B. Chen, “Channel capacity and soft-
decision decoding of LDPC codes for spin-torque transfer mag-
netic random access memory (STT-MRAM),” in Proc. IEEE
ICNC, Jan. 2013.
[39] T. Richardson and R. Urbanke, Modern Coding Theory.
Cambridge University Press, 2008.
[40] J. Chen, A. Dholakia, E. Eleftheriou, M. P. Fossorier, and X.-Y.
Hu, “Reduced-complexity decoding of LDPC codes,” IEEE Trans.
Commun., vol. 53, no. 8, pp. 1288–1299, 2005.
[41] S.-Y. Chung, G. D. Forney, T. J. Richardson, and R. Urbanke,
“On the design of low-density parity-check codes within 0.0045
dB of the Shannon limit,” IEEE Commun. Lett., vol. 5, no. 2, pp.
58–60, 2001.
[42] R. Storn and K. Price, “Differential evolution–a simple and effi-
cient heuristic for global optimization over continuous spaces,”
Journal of Global Optimization, vol. 11, no. 4, pp. 341–359,
1997.
[43] F. Chollet, “Keras,” https://github.com/keras-team/keras, 2015.
[44] M. Abadi et al., “TensorFlow: Large-scale machine learn-
ing on heterogeneous systems,” 2015. [Online]. Available:
http://tensorflow.org/.
