RowHammer: A Retrospective
This retrospective paper describes the RowHammer problem in Dynamic Random
Access Memory (DRAM), which was initially introduced by Kim et al. at the ISCA
2014 conference. RowHammer is a prime (and perhaps
the first) example of how a circuit-level failure mechanism can cause a
practical and widespread system security vulnerability. It is the phenomenon
that repeatedly accessing a row in a modern DRAM chip causes bit flips in
physically-adjacent rows at consistently predictable bit locations. RowHammer
is caused by a hardware failure mechanism called DRAM disturbance errors,
which is a manifestation of circuit-level cell-to-cell interference in a scaled
memory technology.
Researchers from Google Project Zero demonstrated in 2015 that this hardware
failure mechanism can be effectively exploited by user-level programs to gain
kernel privileges on real systems. Many other follow-up works demonstrated
other practical attacks exploiting RowHammer. In this article, we
comprehensively survey the scientific literature on RowHammer-based attacks as
well as mitigation techniques to prevent RowHammer. We also discuss what other
related vulnerabilities may be lurking in DRAM and other types of memories,
e.g., NAND flash memory or Phase Change Memory, that can potentially threaten
the foundations of secure systems, as the memory technologies scale to higher
densities. We conclude by describing and advocating a principled approach to
memory reliability and security research that can enable us to better
anticipate and prevent such vulnerabilities.
Comment: A version of this work is to appear in IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems (TCAD) Special Issue
on Top Picks in Hardware and Embedded Security, 2019. arXiv admin note:
substantial text overlap with arXiv:1703.00626, arXiv:1903.1105
Heterogeneous-Reliability Memory: Exploiting Application-Level Memory Error Tolerance
This paper summarizes our work on characterizing application memory error
vulnerability to optimize datacenter cost via Heterogeneous-Reliability Memory
(HRM), which was published in DSN 2014, and examines the work's significance
and future potential. Memory devices represent a key component of datacenter
total cost of ownership (TCO), and techniques used to reduce errors that occur
on these devices increase this cost. Existing approaches to providing
reliability for memory devices pessimistically treat all data as equally
vulnerable to memory errors. Our key insight is that there exists a diverse
spectrum of tolerance to memory errors in new data-intensive applications, and
that traditional one-size-fits-all memory reliability techniques are
inefficient in terms of cost. This presents an opportunity to greatly reduce
server hardware cost by provisioning the right amount of memory reliability for
different applications.
Toward this end, in our DSN 2014 paper, we make three main contributions to
enable highly-reliable servers at low datacenter cost. First, we develop a new
methodology to quantify the tolerance of applications to memory errors. Second,
using our methodology, we perform a case study of three new data-intensive
workloads (an interactive web search application, an in-memory key-value
store, and a graph mining framework) to identify new insights into the nature
of application memory error vulnerability. Third, based on our insights, we
propose several new hardware/software heterogeneous-reliability memory system
designs to lower datacenter cost while achieving high reliability and discuss
their trade-offs. We show that our new techniques can reduce server hardware
cost by 4.7% while achieving 99.90% single server availability.
Comment: 4 pages, 4 figures, summary report for DSN 2014 paper:
"Characterizing Application Memory Error Vulnerability to Optimize Datacenter
Cost via Heterogeneous-Reliability Memory"
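The first contribution above, a methodology to quantify application tolerance to memory errors, can be illustrated with a toy error-injection loop. Everything here is a simplified stand-in: the paper's methodology runs real applications and measures their output quality, whereas this sketch models "tolerated" as a corrupted value that still agrees with the original under a coarse comparison.

```python
import random

def flip_bit(value: int, bit: int) -> int:
    """Flip one bit of a 32-bit value to emulate a single-bit memory error."""
    return value ^ (1 << bit)

def tolerance_under_injection(data: dict, trials: int, seed: int = 0) -> float:
    """Estimate the fraction of injected single-bit errors that the
    application tolerates. 'Tolerated' is modeled crudely here as the
    corrupted value matching the original in its top byte; a real study
    would re-run the application and compare its output quality."""
    rng = random.Random(seed)
    keys = list(data)
    tolerated = 0
    for _ in range(trials):
        key = rng.choice(keys)
        corrupted = flip_bit(data[key], rng.randrange(32))
        if corrupted >> 24 == data[key] >> 24:   # coarse outcome unchanged
            tolerated += 1
    return tolerated / trials
```

In an HRM-style design, an application whose tolerated fraction is high would be a candidate for placement in a less-reliable, cheaper memory tier.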
Exploiting Row-Level Temporal Locality in DRAM to Reduce the Memory Access Latency
This paper summarizes the idea of ChargeCache, which was published in HPCA
2016 [51], and examines the work's significance and future potential. DRAM
latency continues to be a critical bottleneck for system performance. In this
work, we develop a low-cost mechanism, called ChargeCache, that enables faster
access to recently-accessed rows in DRAM, with no modifications to DRAM chips.
Our mechanism is based on the key observation that a recently-accessed row has
more charge and thus the following access to the same row can be performed
faster. To exploit this observation, we propose to track the addresses of
recently-accessed rows in a table in the memory controller. If a later DRAM
request hits in that table, the memory controller uses lower timing parameters,
leading to reduced DRAM latency. Row addresses are removed from the table after
a specified duration to ensure rows that have leaked too much charge are not
accessed with lower latency. We evaluate ChargeCache on a wide variety of
workloads and show that it provides significant performance and energy benefits
for both single-core and multi-core systems.
Comment: arXiv admin note: substantial text overlap with arXiv:1609.0723
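The table-based mechanism described above can be sketched as follows. This is an illustrative model only: the table size, expiry window, and the "reduced timing" label are placeholder assumptions, not the parameters evaluated in the HPCA 2016 paper.

```python
from collections import OrderedDict

class ChargeCacheSketch:
    """Tracks recently-activated DRAM row addresses in the memory
    controller. A hit means the row was activated recently, still holds
    extra charge, and can therefore be accessed with lowered timing
    parameters. Entries expire after `window` cycles so that rows which
    have leaked too much charge fall back to the default, safe timings."""

    def __init__(self, entries=128, window=1_000_000):
        self.entries = entries          # table capacity (assumed)
        self.window = window            # validity duration in cycles (assumed)
        self.table = OrderedDict()      # row address -> expiry cycle

    def access(self, row: int, now: int) -> str:
        # Drop entries whose charge benefit has expired.
        for r in [r for r, exp in self.table.items() if exp <= now]:
            del self.table[r]
        hit = row in self.table
        # (Re)insert the row: this activation recharges its cells.
        self.table[row] = now + self.window
        self.table.move_to_end(row)
        if len(self.table) > self.entries:
            self.table.popitem(last=False)  # evict the oldest entry
        return "reduced timing" if hit else "default timing"
```

A second access to the same row within the window hits in the table and is served with the reduced timings; once the window elapses, the entry is dropped and the default timings apply again.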
Tiered-Latency DRAM (TL-DRAM)
This paper summarizes the idea of Tiered-Latency DRAM, which was published in
HPCA 2013. The key goal of TL-DRAM is to provide low DRAM latency at low cost,
a critical problem in modern memory systems. To this end, TL-DRAM introduces
heterogeneity into the design of a DRAM subarray by segmenting the bitlines,
thereby creating a low-latency, low-energy, low-capacity portion in the
subarray (called the near segment), which is close to the sense amplifiers, and
a high-latency, high-energy, high-capacity portion, which is farther away from
the sense amplifiers. Thus, DRAM becomes heterogeneous with a small portion
having lower latency and a large portion having higher latency. Various
techniques can be employed to take advantage of the low-latency near segment
and this new heterogeneous DRAM substrate, including hardware-based caching and
software-based caching and memory allocation of frequently used data in the
near segment. Evaluations with such simple techniques show significant
performance and energy-efficiency benefits.
Comment: This is a summary of the original paper, entitled "Tiered-Latency
DRAM: A Low Latency and Low Cost DRAM Architecture", which appears in HPCA
2013.
Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery
NAND flash memory is ubiquitous in everyday life today because its capacity
has continuously increased and cost has continuously decreased over decades.
This positive growth is a result of two key trends: (1) effective process
technology scaling; and (2) multi-level (e.g., MLC, TLC) cell data coding.
Unfortunately, the reliability of raw data stored in flash memory has also
continued to become more difficult to ensure, because these two trends lead to
(1) fewer electrons in the flash memory cell floating gate to represent the
data; and (2) larger cell-to-cell interference and disturbance effects. Without
mitigation, worsening reliability can reduce the lifetime of NAND flash memory.
As a result, flash memory controllers in solid-state drives (SSDs) have become
much more sophisticated: they incorporate many effective techniques to ensure
the correct interpretation of noisy data stored in flash memory cells.
In this chapter, we review recent advances in SSD error characterization,
mitigation, and data recovery techniques for reliability and lifetime
improvement. We provide rigorous experimental data from state-of-the-art MLC
and TLC NAND flash devices on various types of flash memory errors, to motivate
the need for such techniques. Based on the understanding developed by the
experimental characterization, we describe several mitigation and recovery
techniques, including (1) cell-to-cell interference mitigation; (2) optimal
multi-level cell sensing; (3) error correction using state-of-the-art
algorithms and methods; and (4) data recovery when error correction fails. We
quantify the reliability improvement provided by each of these techniques.
Looking forward, we briefly discuss how flash memory and these techniques could
evolve into the future.
Comment: arXiv admin note: substantial text overlap with arXiv:1706.0864
Flexible-Latency DRAM: Understanding and Exploiting Latency Variation in Modern DRAM Chips
This article summarizes key results of our work on experimental
characterization and analysis of latency variation and latency-reliability
trade-offs in modern DRAM chips, which was published in SIGMETRICS 2016, and
examines the work's significance and future potential.
The goal of this work is to (i) experimentally characterize and understand
the latency variation across cells within a DRAM chip for three fundamental
DRAM operations (activation, precharge, and restoration), and (ii) develop new
mechanisms that exploit our
understanding of the latency variation to reliably improve performance. To this
end, we comprehensively characterize 240 DRAM chips from three major vendors,
and make six major new observations about latency variation within DRAM.
Notably, we find that (i) there is large latency variation across the cells for
each of the three operations; (ii) variation characteristics exhibit
significant spatial locality: slower cells are clustered in certain regions of
a DRAM chip; and (iii) the three fundamental operations exhibit different
reliability characteristics when the latency of each operation is reduced.
Based on our observations, we propose Flexible-LatencY DRAM (FLY-DRAM), a
mechanism that exploits latency variation across DRAM cells within a DRAM chip
to improve system performance. The key idea of FLY-DRAM is to exploit the
spatial locality of slower cells within DRAM, and access the faster DRAM
regions with reduced latencies for the fundamental operations. Our evaluations
show that FLY-DRAM improves the performance of a wide range of applications by
13.3%, 17.6%, and 19.5%, on average, for each of the three different vendors'
real DRAM chips, in a simulated 8-core system.
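The key idea, consulting a one-time profile of slow-cell locations to pick per-request timings, can be sketched as follows. The region granularity and the timing values are made-up placeholders, not the numbers from the paper.

```python
def fly_dram_timings(row: int, slow_regions: set,
                     region_rows: int = 512) -> dict:
    """Choose DRAM timing parameters for a request based on a one-time
    profile of where slow cells cluster. Rows in regions known to contain
    slow cells keep the default, reliable timings; rows in fast regions
    are accessed with reduced latency. Values are DRAM clock cycles and
    are illustrative only."""
    region = row // region_rows
    if region in slow_regions:
        return {"tRCD": 14, "tRP": 14}   # default, reliable timings
    return {"tRCD": 10, "tRP": 10}       # reduced timings for fast regions
```

Because slow cells exhibit spatial locality, a coarse region-granularity profile like this captures most of the latency-reduction opportunity without per-cell bookkeeping.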
Recent Advances in DRAM and Flash Memory Architectures
This article features extended summaries and retrospectives of some of the
recent research done by our group, SAFARI, on (1) understanding,
characterizing, and modeling various critical properties of modern DRAM and
NAND flash memory, the dominant memory and storage technologies, respectively;
and (2) several new mechanisms we have proposed based on our observations from
these analyses, characterization, and modeling, to tackle various key
challenges in memory and storage scaling. In order to understand the sources of
various bottlenecks of the dominant memory and storage technologies, these
works perform rigorous studies of device-level and application-level behavior,
using a combination of detailed simulation and experimental characterization of
real memory and storage devices.
Comment: arXiv admin note: substantial text overlap with arXiv:1805.0640
Characterizing, Exploiting, and Mitigating Vulnerabilities in MLC NAND Flash Memory Programming
This paper summarizes our work on experimentally analyzing, exploiting, and
addressing vulnerabilities in multi-level cell NAND flash memory programming,
which was published in the industrial session of HPCA 2017, and examines the
work's significance and future potential. Modern NAND flash memory chips use
multi-level cells (MLC), which store two bits of data in each cell, to improve
chip density. As MLC NAND flash memory scaled down to smaller manufacturing
process technologies, manufacturers adopted a two-step programming method to
improve reliability. In two-step programming, the two bits of a multi-level
cell are programmed using two separate steps, in order to minimize the amount
of cell-to-cell program interference induced on neighboring flash cells.
In this work, we demonstrate that two-step programming exposes new
reliability and security vulnerabilities in state-of-the-art MLC NAND flash
memory. We experimentally characterize contemporary 1X-nm (i.e., 15-19nm)
flash memory chips, and find that a partially-programmed flash cell (i.e., a
cell where the second programming step has not yet been performed) is much more
vulnerable to cell-to-cell interference and read disturb than a
fully-programmed cell. We show that it is possible to exploit these
vulnerabilities on solid-state drives (SSDs) to alter the partially-programmed
data, causing (potentially malicious) data corruption. Based on our
observations, we propose several new mechanisms that eliminate or mitigate
these vulnerabilities in partially-programmed cells, and at the same time
increase flash memory lifetime by 16%.
Tiered-Latency DRAM: Enabling Low-Latency Main Memory at Low Cost
This paper summarizes the idea of Tiered-Latency DRAM (TL-DRAM), which was
published in HPCA 2013, and examines the work's significance and future
potential. The capacity and cost-per-bit of DRAM have historically scaled to
satisfy the needs of increasingly large and complex computer systems. However,
DRAM latency has remained almost constant, making memory latency the
performance bottleneck in today's systems. We observe that the high access
latency is not intrinsic to DRAM, but a trade-off is made to decrease the cost
per bit. To mitigate the high area overhead of DRAM sensing structures,
commodity DRAMs connect many DRAM cells to each sense amplifier through a wire
called a bitline. These bitlines have a high parasitic capacitance due to
their long length, and this bitline capacitance is the dominant source of DRAM
latency. Specialized low-latency DRAMs use shorter bitlines with fewer cells,
but have a higher cost-per-bit due to greater sense amplifier area overhead. To
achieve both low latency and low cost per bit, we introduce Tiered-Latency DRAM
(TL-DRAM). In TL-DRAM, each long bitline is split into two shorter segments by
an isolation transistor, allowing one of the two segments to be accessed with
the latency of a short-bitline DRAM without incurring a high cost per bit. We
propose mechanisms that use the low-latency segment as a hardware-managed or
software-managed cache. Our evaluations show that our proposed mechanisms
improve both performance and energy efficiency for both single-core and
multiprogrammed workloads. Tiered-Latency DRAM has inspired several other works
on reducing DRAM latency with little to no architectural modification.
Comment: arXiv admin note: substantial text overlap with arXiv:1601.0690
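One of the proposed uses of the near segment, a hardware-managed cache for far-segment rows, can be sketched as follows. Segment sizes, latencies, and the LRU policy are illustrative assumptions, not the paper's exact configuration.

```python
class TLDRAMSubarraySketch:
    """Models a TL-DRAM subarray whose bitlines are split by an isolation
    transistor into a small, fast near segment and a large, slower far
    segment. Here the near segment acts as a cache for far-segment rows."""

    NEAR_LATENCY = 8    # cycles; illustrative
    FAR_LATENCY = 14    # cycles; illustrative

    def __init__(self, near_rows=32):
        self.near_rows = near_rows
        self.cached = []  # far-segment rows held in the near segment (LRU order)

    def access(self, far_row: int) -> int:
        """Access a far-segment row and return the latency it experiences."""
        if far_row in self.cached:
            self.cached.remove(far_row)
            self.cached.append(far_row)   # refresh LRU position
            return self.NEAR_LATENCY
        # Miss: serve from the far segment, then cache the row in the
        # near segment, evicting the least-recently-used row if full.
        if len(self.cached) >= self.near_rows:
            self.cached.pop(0)
        self.cached.append(far_row)
        return self.FAR_LATENCY
```

Repeated accesses to a hot row are thus served at near-segment latency, which is where the performance and energy benefits of the heterogeneous substrate come from.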
Experimental Characterization, Optimization, and Recovery of Data Retention Errors in MLC NAND Flash Memory
This paper summarizes our work on experimentally characterizing, mitigating,
and recovering data retention errors in multi-level cell (MLC) NAND flash
memory, which was published in HPCA 2015, and examines the work's significance
and future potential. Retention errors, caused by charge leakage over time, are
the dominant source of flash memory errors. Understanding, characterizing, and
reducing retention errors can significantly improve NAND flash memory
reliability and endurance. In this work, we first characterize, with real 2Y-nm
MLC NAND flash chips, how the threshold voltage distribution of flash memory
changes with different retention ages -- the length of time since a flash cell
was programmed. We observe from our characterization results that 1) the
optimal read reference voltage of a flash cell, using which the data can be
read with the lowest raw bit error rate (RBER), systematically changes with its
retention age, and 2) different regions of flash memory can have different
retention ages, and hence different optimal read reference voltages.
Based on our findings, we propose two new techniques. First, Retention
Optimized Reading (ROR) adaptively learns and applies the optimal read
reference voltage for each flash memory block online. The key idea of ROR is to
periodically learn a tight upper bound of the optimal read reference voltage,
and from there approach the optimal read reference voltage. Our evaluations
show that ROR can extend flash memory lifetime by 64% and reduce average error
correction latency by 10.1%. Second, Retention Failure Recovery (RFR) recovers
data with uncorrectable errors offline by identifying and probabilistically
correcting flash cells with retention errors. Our evaluation shows that RFR
essentially doubles the error correction capability.
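The key idea of ROR, descending from a learned upper bound toward the voltage with the lowest raw bit error rate, can be sketched as follows. The `read_at` callback is a hypothetical interface standing in for reading a flash block at a given reference voltage; the real mechanism operates on flash blocks online and periodically re-learns the bound as retention age grows.

```python
def retention_optimized_read(read_at, v_upper: int, step: int = 1):
    """Starting from `v_upper`, a learned tight upper bound on the optimal
    read reference voltage, lower the voltage step by step while the raw
    bit error rate (RBER) keeps improving, and stop at the voltage with
    the lowest observed RBER. Assumes RBER is roughly convex in the read
    reference voltage, as the characterization suggests."""
    best_v, best_rber = v_upper, read_at(v_upper)
    v = v_upper - step
    while True:
        rber = read_at(v)
        if rber > best_rber:          # passed the minimum; stop searching
            return best_v, best_rber
        best_v, best_rber = v, rber
        v -= step
```

Reading at the voltage found this way minimizes the RBER that the error-correcting code must handle, which is what extends lifetime and reduces average correction latency.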