Design of Highly Reliable Solid-State Storage System (高信頼ソリッド・ステート・ストレージシステムに関する研究) by Tanakamaru Shuhei (田中丸 周平)
Summary of the Dissertation
 
 
 
Dissertation title:    Design of Highly Reliable Solid-State Storage System 
(高信頼ソリッド・ステート・ストレージシステムに関する研究) 
 
           Name:    Tanakamaru Shuhei (田中丸 周平) 
 
Demand for solid-state storage is escalating as the industry and its applications expand, from 
SD cards to data centers. Solid-state storage is mainly composed of NAND flash memories, 
which are the most cost-effective non-volatile memory devices. This is because their memory cell 
transistor achieves the smallest size and multi-level cell technology has been realized. To further 
reduce the chip cost of NAND flash memory, the memory cell is continuously being scaled down. 
However, the reliability of NAND flash memory degrades with scaling. 
As the memory cells move closer together, interference between them, such 
as cell-to-cell coupling, becomes stronger. Moreover, fewer electrons are stored in the 
floating gate of a scaled NAND flash memory cell. Therefore, the reliability of 
NAND flash memory must be improved.  
Meanwhile, storage-class memories (SCMs) are attracting attention because they are 
non-volatile and offer higher performance than NAND flash memory. Storage 
performance can therefore be improved by combining low-cost NAND flash memories with high-speed 
SCMs. Phase-change random access memory (PRAM), magnetic RAM (MRAM), and resistive 
RAM (ReRAM) are candidate SCMs. As with NAND flash memory, the reliability of 
these SCMs degrades with write/erase cycling, so it 
also needs to be improved.  
This dissertation discusses the design of highly reliable solid-state storage, starting from 
three problems: the reliability degradation of NAND flash memory, the co-optimization of NAND flash 
memory and SCM, and the reliability of SCM. As solutions, chip-level and system-level 
reliability enhancement techniques are proposed for NAND flash memories, chip-level 
techniques are proposed for ReRAMs (SCMs), and the system performance is evaluated. 
The error-prediction low-density parity-check (EP-LDPC) scheme, page-RAID, error-masking 
(EM), and bits/cell optimization (BCO) are proposed as chip-level reliability enhancement 
techniques. Bose-Chaudhuri-Hocquenghem (BCH) error-correcting codes (ECCs) are widely 
used to correct bit errors in NAND flash memories. As reliability degrades 
with scaling, however, LDPC codes are considered a 
promising candidate for the next-generation ECC because of their extremely high error-correction 
capability. Their serious drawback is that they require 
multiple reads from the NAND flash memory to obtain precise threshold-voltage (VTH) information, 
resulting in long read latency. The proposed EP-LDPC scheme achieves higher error-correction 
efficiency than BCH codes while eliminating the multiple reads. 
Page-RAID creates a vertical parity in addition to the horizontal parity (ECC). As a result, 
even if ECC fails for a page, the page can be recovered from the vertical parity. 
EM corrects errors that occurred in the previous read. In every read, the error locations are 
stored in an error-location table, and the errors are then corrected by flipping the bits at the 
locations recorded in the table. Both page-RAID and EM can effectively use ReRAM as a parity buffer to 
improve reliability with no performance degradation. In BCO, triple-level cell (TLC) 
NAND flash memory is changed to multi-level cell (MLC) and single-level cell (SLC) operation 
according to the number of write/erase cycles. 
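The vertical-parity and error-location-table ideas above can be illustrated with a minimal Python sketch. It assumes the vertical parity is a plain bytewise XOR across pages and that the error-location table is a list of bit positions; neither detail is specified in this summary, and all function names are hypothetical.

```python
from functools import reduce


def xor_pages(pages):
    """Bytewise XOR of equally sized pages (assumed form of the vertical parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def rebuild_page(surviving_pages, parity_page):
    """Recover the single page whose ECC failed from the other pages plus the parity."""
    return xor_pages(list(surviving_pages) + [parity_page])


def find_error_locations(raw_page, corrected_page):
    """Bit positions where the raw read differs from the ECC-corrected data."""
    return [
        8 * i + b
        for i, (r, c) in enumerate(zip(raw_page, corrected_page))
        for b in range(8)
        if (r ^ c) >> b & 1
    ]


def apply_error_mask(raw_page, error_locations):
    """Flip the bits recorded in the (hypothetical) error-location table."""
    data = bytearray(raw_page)
    for bit in error_locations:
        data[bit // 8] ^= 1 << (bit % 8)
    return bytes(data)


# Page-RAID: three example pages protected by one vertical parity page.
pages = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_pages(pages)
recovered = rebuild_page([pages[0], pages[2]], parity)

# EM: remember where the previous read was wrong, then pre-correct the next read.
stored = b"\x0f\xf0"
raw_read = b"\x0e\xf0"  # bit 0 of byte 0 was read incorrectly
table = find_error_locations(raw_read, stored)
masked = apply_error_mask(raw_read, table)
```

In this sketch the parity page and the error-location table would live in the ReRAM buffer mentioned above; here they are ordinary Python objects.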
Reverse-mirroring (RM), shift-mirroring (SM), error-reduction synthesis (ERS), and balanced 
RAID-5/6 are proposed as system-level techniques for NAND flash memories. RM, SM, and 
ERS improve the system-level reliability of a mirrored system (RAID-1). RM and SM 
allocate data based on the error characteristics of the 
NAND flash memory to improve reliability. RM also uses ReRAM as a non-volatile buffer. ERS reduces 
bit errors by taking the error characteristics into account. Balanced RAID-5/6 improves the reliability of 
RAID-5/6 systems. RAID-5/6 is a more cost-effective way than mirroring (RAID-1) to improve 
system-level reliability because its storage overhead is significantly smaller.  
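The storage-overhead comparison above can be made concrete with a short calculation. The sketch below assumes two-way mirroring for RAID-1, one parity drive per stripe for RAID-5, and two for RAID-6; the eight-drive array is a hypothetical example, not a configuration taken from the dissertation.

```python
from fractions import Fraction


def storage_overhead(level, n_drives):
    """Redundant capacity divided by user-data capacity for one array."""
    if level == 1:
        # Two-way mirroring: one full extra copy of all user data.
        return Fraction(1)
    parity_drives = {5: 1, 6: 2}[level]
    if n_drives <= parity_drives:
        raise ValueError("array too small for this RAID level")
    return Fraction(parity_drives, n_drives - parity_drives)


# Hypothetical 8-drive array.
overheads = {level: storage_overhead(level, 8) for level in (1, 5, 6)}
```

With eight drives, mirroring costs as much redundant capacity as user capacity, whereas RAID-5 costs one seventh and RAID-6 one third of the user capacity.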
Flexible RRef (FR), adaptive asymmetric coding (AAC), and verify trials reduction (VTR) are 
proposed as chip-level reliability enhancement techniques for ReRAM. FR tracks the 
changing resistance of the ReRAM to reduce the bit error rate (BER) during set/reset cycling. It is found that 
the dominant error direction (1→0 or 0→1 error) changes from 0→1 to 1→0 during 
set/reset cycling. AAC reduces the BER based on the dominant error direction. VTR reduces the number of 
verify trials and improves the set/reset latency by leaving the remaining errors to be corrected by ECC.  
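One way to exploit a dominant error direction is sketched below. This summary does not describe how AAC encodes data, so the flag-bit inversion scheme here is an assumption chosen for illustration: a word is inverted whenever that leaves fewer cells holding the value that is currently more error-prone. The function names and the 8-bit word are hypothetical.

```python
def aac_encode(word_bits, vulnerable_value):
    """Return (flag, stored_bits); flag = 1 means the word was inverted.

    vulnerable_value is the stored bit value that currently fails more often:
    0 while 0->1 errors dominate, 1 once 1->0 errors dominate.
    """
    vulnerable_count = sum(1 for b in word_bits if b == vulnerable_value)
    if vulnerable_count > len(word_bits) - vulnerable_count:
        return 1, [1 - b for b in word_bits]
    return 0, list(word_bits)


def aac_decode(flag, stored_bits):
    """Undo the inversion indicated by the flag bit."""
    return [1 - b for b in stored_bits] if flag else list(stored_bits)


word = [1, 1, 1, 0, 1, 0, 1, 1]

# Early in cycling (0->1 errors dominate): stored 0s are vulnerable, word kept as-is.
early_flag, early_stored = aac_encode(word, vulnerable_value=0)

# Later in cycling (1->0 errors dominate): stored 1s are vulnerable, word inverted.
late_flag, late_stored = aac_encode(word, vulnerable_value=1)
```

The adaptive part in this sketch is only the choice of `vulnerable_value`, which would follow the observed shift in the dominant error direction during set/reset cycling.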
 
