Architectural Techniques for Disturbance Mitigation in Future Memory Systems
With recent advances in CMOS technology, scaling down the feature size has improved memory capacity, power, performance and cost. However, this rapid progress has made precise control of the manufacturing process below 22nm increasingly difficult. Despite the benefits of scaling, the technology road map predicts significant cell-to-cell process variation. It also predicts electromagnetic disturbances among memory cells that can easily push their circuit characteristics away from design goals and threaten reliability, energy efficiency and security.
This dissertation proposes simple, energy-efficient and low-overhead techniques that combat the challenges resulting from technology scaling in future memory systems. Specifically, this dissertation investigates solutions tuned to particular types of disturbance challenges, such as inter-cell or intra-cell disturbance, that are energy efficient while guaranteeing memory reliability.
The contribution of this dissertation is threefold. First, it uses a deterministic counter-based approach to target the root of inter-cell disturbances in dynamic random-access memory (DRAM), mitigating them deterministically while also reducing overall energy consumption. Second, it uses Markov chains to reason about the reliability of Spin-Transfer Torque Magnetic Random-Access Memory (STT-RAM), which suffers from intra-cell disturbances, and then investigates on-demand refresh policies to recover from the persistent effect of such disturbances. Third, it leverages an encoding technique integrated with a novel word-level compression scheme to reduce the vulnerability of cells to inter-cell write disturbances in Phase Change Memory (PCM). However, mitigating inter-cell write disturbances while also minimizing the write energy may increase the number of updated PCM cells and degrade endurance. Hence, it uses multi-objective optimization to balance write energy and endurance in PCM cells while mitigating inter-cell disturbances.
The work in this dissertation provides important insights into how to tackle the critical reliability challenges that high-density memory systems confront in deeply scaled technology nodes. It advocates for various memory technologies to guarantee reliability of future memory systems while incurring nominal costs in terms of energy, area and performance.
A Case for Self-Managing DRAM Chips: Improving Performance, Efficiency, Reliability, and Security via Autonomous in-DRAM Maintenance Operations
The memory controller is in charge of managing DRAM maintenance operations
(e.g., refresh, RowHammer protection, memory scrubbing) in current DRAM chips.
Implementing new maintenance operations often necessitates modifications in the
DRAM interface, memory controller, and potentially other system components.
Such modifications are only possible with a new DRAM standard, which takes a
long time to develop, leading to slow progress in DRAM systems.
In this paper, our goal is to 1) ease, and thus accelerate, the process of
enabling new DRAM maintenance operations and 2) enable more efficient in-DRAM
maintenance operations. Our idea is to set the memory controller free from
managing DRAM maintenance. To this end, we propose Self-Managing DRAM (SMD), a
new low-cost DRAM architecture that enables implementing new in-DRAM
maintenance mechanisms (or modifying old ones) with no further changes in the
DRAM interface, memory controller, or other system components. We use SMD to
implement new in-DRAM maintenance mechanisms for three use cases: 1) periodic
refresh, 2) RowHammer protection, and 3) memory scrubbing. We show that SMD
enables easy adoption of efficient maintenance mechanisms that significantly
improve the system performance and energy efficiency while providing higher
reliability compared to conventional DDR4 DRAM. A combination of SMD-based
maintenance mechanisms that perform refresh, RowHammer protection, and memory
scrubbing achieves a 7.6% speedup and consumes 5.2% less DRAM energy on average
across 20 memory-intensive four-core workloads. We make SMD source code openly
and freely available at [128].
RAMPART: RowHammer Mitigation and Repair for Server Memory Systems
RowHammer attacks are a growing security and reliability concern for DRAMs
and computer systems as they can induce many bit errors that overwhelm error
detection and correction capabilities. System-level solutions are needed as
process technology and circuit improvements alone are unlikely to provide
complete protection against RowHammer attacks in the future. This paper
introduces RAMPART, a novel approach to mitigating RowHammer attacks and
improving server memory system reliability by remapping addresses in each DRAM
in a way that confines RowHammer bit flips to a single device for any victim
row address. When RAMPART is paired with Single Device Data Correction (SDDC)
and patrol scrub, error detection and correction methods in use today, the
system can detect and correct bit flips from a successful attack, allowing the
memory system to heal itself. RAMPART is compatible with DDR5 RowHammer
mitigation features, as well as a wide variety of algorithmic and probabilistic
tracking methods. We also introduce BRC-VL, a variation of DDR5 Bounded Refresh
Configuration (BRC) that improves system performance by reducing mitigation
overhead, and show that it works well with probabilistic sampling methods to
combat traditional and victim-focused mitigation attacks like Half-Double. The
combination of RAMPART, SDDC, and scrubbing enables stronger RowHammer
resistance by correcting bit flips from one successful attack. Uncorrectable
errors are much less likely, requiring two successful attacks before the memory
system is scrubbed.
Comment: 16 pages, 13 figures. A version of this paper will appear in the
Proceedings of MEMSYS2
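The per-device remapping idea can be illustrated with a toy model. The sketch below is not RAMPART's actual mapping: the multiplicative remap, the `MULTIPLIERS` values, the 64-row bank, and the wrap-around adjacency at the bank edges are all assumptions chosen for illustration. It only shows the property the abstract describes: when each device maps row addresses differently, the rows adjacent to a given aggressor correspond to different victim row addresses in each device, so any one victim address is disturbed in at most one device.

```python
NUM_ROWS = 64                 # assumed toy bank size (power of two)
MULTIPLIERS = [1, 3, 5, 7]    # hypothetical per-device odd multipliers


def physical_row(logical_row, device):
    """Map a logical row address to a device-specific physical row."""
    return (logical_row * MULTIPLIERS[device]) % NUM_ROWS


def victim_addresses(aggressor, device):
    """Logical row addresses that sit physically next to the aggressor
    inside one device (adjacency wraps around for simplicity)."""
    inverse = pow(MULTIPLIERS[device], -1, NUM_ROWS)  # Python 3.8+
    phys = physical_row(aggressor, device)
    return {((phys + step) % NUM_ROWS) * inverse % NUM_ROWS
            for step in (-1, 1)}


def devices_hit(aggressor, victim):
    """Devices in which `victim` is physically adjacent to `aggressor`."""
    return [d for d in range(len(MULTIPLIERS))
            if victim in victim_addresses(aggressor, d)]
```

With an identity mapping in every device, hammering one row would disturb the same victim address in all devices at once. With the remap above, `devices_hit` never returns more than one device, which is the situation a single-device correction code such as SDDC is designed to handle.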
DRAM Bender: An Extensible and Versatile FPGA-based Infrastructure to Easily Test State-of-the-art DRAM Chips
To understand and improve DRAM performance, reliability, security and energy
efficiency, prior works study characteristics of commodity DRAM chips.
Unfortunately, state-of-the-art open source infrastructures capable of
conducting such studies are obsolete, poorly supported, or difficult to use, or
their inflexibility limits the types of studies they can conduct.
We propose DRAM Bender, a new FPGA-based infrastructure that enables
experimental studies on state-of-the-art DRAM chips. DRAM Bender offers three
key features at the same time. First, DRAM Bender enables directly interfacing
with a DRAM chip through its low-level interface. This allows users to issue
DRAM commands in arbitrary order and with finer-grained time intervals compared
to other open source infrastructures. Second, DRAM Bender exposes easy-to-use
C++ and Python programming interfaces, allowing users to quickly and easily
develop different types of DRAM experiments. Third, DRAM Bender is easily
extensible. The modular design of DRAM Bender allows extending it to (i)
support existing and emerging DRAM interfaces, and (ii) run on new commercial
or custom FPGA boards with little effort.
To demonstrate that DRAM Bender is a versatile infrastructure, we conduct
three case studies, two of which lead to new observations about the DRAM
RowHammer vulnerability. In particular, we show that data patterns supported by
DRAM Bender uncover a larger set of bit-flips on a victim row compared to the
data patterns commonly used by prior work. We demonstrate the extensibility of
DRAM Bender by implementing it on five different FPGAs with DDR4 and DDR3
support. DRAM Bender is freely and openly available at
https://github.com/CMU-SAFARI/DRAM-Bender.
Comment: To appear in TCAD 202
Preventing Row-Hammering and Improving Main Memory Performance Using Time Window Counters
Thesis (Ph.D.) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Convergence Science (Intelligent Convergence Systems major), 2020. 8. Jung Ho Ahn.
Computer systems using DRAM are exposed to row-hammer (RH) attacks, which can flip data in a DRAM row without directly accessing that row, by frequently activating its adjacent rows. There have been a number of proposals to prevent RH, including both probabilistic and deterministic solutions. However, the probabilistic solutions provide protection with no capability to detect attacks and have a non-zero probability of missing protection. Meanwhile, counter-based deterministic solutions either incur large area overhead or suffer a noticeable performance drop on adversarial memory access patterns.
To overcome these challenges, we propose a new counter-based RH prevention solution named Time Window Counter (TWiCe) based row refresh, which accurately detects potential RH attacks using only a small number of counters and with minimal performance impact. We first make a key observation that the number of rows that can cause RH is limited by the maximum values of row activation frequency and DRAM cell retention time. We calculate the maximum number of required counter entries per DRAM bank, with which TWiCe prevents RH with a strong deterministic guarantee. TWiCe incurs no performance overhead on normal DRAM operations and less than 0.7% area and energy overheads over contemporary DRAM devices. Our evaluation shows that TWiCe makes no more than 0.006% of additional DRAM row activations for adversarial memory access patterns, including RH attack scenarios.
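The idea of tracking only rows that are activated often enough to matter can be sketched as follows. This is a simplified illustration, not the thesis's implementation: the class name, the two thresholds `th_rh` and `th_pi`, the per-interval pruning rule, and the choice of rows `row - 1` and `row + 1` as neighbours are assumptions made for the example (bank-edge rows are not handled).

```python
class TimeWindowCounterTable:
    """Toy counter table: rows that are not activated fast enough to ever
    reach the row-hammer threshold are pruned, so few entries are needed."""

    def __init__(self, th_rh, th_pi):
        self.th_rh = th_rh    # assumed: activations that trigger a refresh
        self.th_pi = th_pi    # assumed: minimum average activations per interval
        self.entries = {}     # row -> [activation count, intervals alive]

    def on_activate(self, row):
        """Count one activation; return the neighbour rows to refresh, if any."""
        entry = self.entries.setdefault(row, [0, 1])
        entry[0] += 1
        if entry[0] >= self.th_rh:
            del self.entries[row]
            return [row - 1, row + 1]
        return []

    def on_pruning_interval(self):
        """Drop entries whose average activation rate is below th_pi."""
        survivors = {}
        for row, (act_cnt, life) in self.entries.items():
            if act_cnt >= self.th_pi * life:
                survivors[row] = [act_cnt, life + 1]
        self.entries = survivors
```

In this sketch, a row that is touched only occasionally disappears at the next pruning step, while a row hammered at a sustained rate stays in the table until its count reaches `th_rh`, at which point its neighbours are reported for refresh.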
To reduce the area and energy overhead further, we propose the threshold-adjusted rank-level TWiCe. We first introduce pseudo-associative TWiCe (pa-TWiCe), which can search hundreds of TWiCe table entries energy-efficiently. In addition, by exploiting the pa-TWiCe structure, we propose rank-level TWiCe, which further reduces the number of required entries by managing the table entries at the rank level. We also adjust the thresholds of TWiCe to reduce the number of entries without increasing false-positive detections on general workloads.
Finally, we propose extending TWiCe as a hot-page detector to improve main-memory performance. The TWiCe table contains the row addresses that have been frequently activated recently, and they are likely to be activated again due to temporal locality in memory accesses. We show how the hot-page detection in TWiCe can be combined with a DRAM page swap methodology to reduce the DRAM latency for the hot pages. Our evaluation shows that low-latency DRAM using TWiCe achieves up to 12.2% IPC improvement over a baseline DDR4 device for a multi-threaded workload.
Introduction
1.1 Time Window Counter Based Row Refresh to Prevent Row-hammering
1.2 Optimizing Time Window Counter
1.3 Using Time Window Counters to Improve Main Memory Performance
1.4 Outline
Background of DRAM and Row-hammering
2.1 DRAM Device Organization
2.2 Sparing DRAM Rows to Combat Reliability Challenges
2.3 Main Memory Subsystem Organization and Operation
2.4 Row-hammering (RH)
2.5 Previous RH Prevention Solutions
2.6 Limitations of the Previous RH Solutions
TWiCe: Time Window Counter based RH Prevention
3.1 TWiCe: Time Window Counter
3.2 Proof of RH Prevention
3.3 Counter Table Size
3.4 Architecting TWiCe
3.4.1 Location of TWiCe Table
3.4.2 Augmenting DRAM Interface with a New Adjacent Row Refresh (ARR) Command
3.5 Analysis
3.6 Evaluation
Optimizing TWiCe to Reduce Implementation Cost
4.1 Pseudo-associative TWiCe
4.2 Rank-level TWiCe
4.3 Adjusting Threshold to Reduce Table Size
4.4 Analysis
4.5 Evaluation
Augmenting TWiCe for Hot-page Detection
5.1 Necessity of Counters for Detecting Hot Pages
5.2 Previous Studies on Migration for Asymmetric Low-latency DRAM
5.3 Extending TWiCe for Dynamic Hot-page Detection
5.4 Additional Components and Methodology
5.5 Analysis and Evaluation
5.5.1 Overhead Analysis
5.5.2 Evaluation
Conclusion
6.1 Future work
Bibliography
Abstract in Korean
Securing in-memory processors against Row Hammering Attacks
Modern applications on general-purpose processors require both fast and power-efficient computing and memory components. As applications continue to improve, the demand for high-speed computation, fast-access memory, and a secure platform increases. Traditional von Neumann architectures separate the computing and memory units, causing both latency and high power consumption; hence, a hybrid memory processing system known as in-memory processing has been proposed. In-memory processing alleviates the delay of computation and minimizes power consumption; reported improvements include a 14x speedup, 87% lower power consumption, and appropriately linear scaling of performance. Applications of in-memory processing include data-driven workloads such as Artificial Intelligence (AI) and Convolutional and Deep Neural Networks (CNNs/DNNs). However, processing-in-memory can also suffer from a security and reliability issue known as the Row Hammer security bug. This exploit flips bits within memory without accessing them, leading to error injection, system crashes, privilege escalation, and total hijack of a system. It can also degrade the accuracy of CNNs and DNNs by flipping the bits of stored weight values without direct access. Weights of neural networks are stored in a variety of data patterns: solid (all 1s or all 0s), checkered (alternating 1s and 0s in both rows and columns), row-stripe (alternating 1s and 0s in rows), or column-stripe (alternating 1s and 0s in columns). The row-stripe data pattern exhibits the largest likelihood of a Row Hammer attack, causing the accuracy of neural networks to drop by over 30%. A row-stripe avoidance coding scheme is proposed to reduce the probability of a Row Hammer attack occurring within neural networks. The coding scheme encodes the binary portion of a weight in a CNN or DNN to reduce the chance of row-stripe data patterns, reducing the likelihood of a Row Hammer attack and improving the overall security of the in-memory processing system.
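The four data patterns named above can be made concrete with a small generator. This is only an illustrative sketch: the function name and the list-of-lists layout are assumptions, and "row-stripe" is read here in the usual sense of whole rows alternating between all 0s and all 1s (with "column-stripe" the same along columns).

```python
def make_pattern(name, rows, cols):
    """Build a rows x cols bit matrix for one of the named data patterns."""
    cell = {
        "solid": lambda r, c: 1,                # every cell holds the same value
        "checkered": lambda r, c: (r + c) % 2,  # alternates along rows and columns
        "row-stripe": lambda r, c: r % 2,       # whole rows alternate 0s and 1s
        "column-stripe": lambda r, c: c % 2,    # whole columns alternate 0s and 1s
    }[name]
    return [[cell(r, c) for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    for name in ("solid", "checkered", "row-stripe", "column-stripe"):
        print(name)
        for row in make_pattern(name, 4, 8):
            print("  " + "".join(str(bit) for bit in row))
```

In the row-stripe layout every cell stores the opposite value of the cells directly above and below it in the neighbouring rows, which is the arrangement the abstract identifies as most vulnerable.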
Scalable and Secure Row-Swap: Efficient and Safe Row Hammer Mitigation in Memory Systems
As Dynamic Random Access Memories (DRAM) scale, they are becoming
increasingly susceptible to Row Hammer. By rapidly activating rows of DRAM
cells (aggressor rows), attackers can exploit inter-cell interference through
Row Hammer to flip bits in neighboring rows (victim rows). A recent work,
called Randomized Row-Swap (RRS), proposed proactively swapping aggressor rows
with randomly selected rows before an aggressor row can cause Row Hammer.
Our paper observes that RRS is neither secure nor scalable. We first propose
the 'Juggernaut attack pattern' that breaks RRS in under 1 day. Juggernaut
exploits the fact that the mitigative action of RRS, a swap operation, can
itself induce additional target row activations, defeating such a defense.
Second, this paper proposes a new defense mechanism, Secure Row-Swap, that avoids
the additional activations from swap (and unswap) operations and protects
against Juggernaut. Furthermore, this paper extends Secure Row-Swap with attack
detection to defend against even future attacks. While this provides better
security, it also allows for securely reducing the frequency of swaps, thereby
enabling Scalable and Secure Row-Swap. The Scalable and Secure Row-Swap
mechanism provides years of Row Hammer protection with 3.3X lower storage
overheads as compared to the RRS design. It incurs only a 0.7% slowdown as
compared to a non-secure baseline for a Row Hammer threshold of 1200.
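The baseline row-swap idea that this abstract builds on can be sketched with a row indirection table. The sketch below is an idealized illustration only: the class name, the per-row counters, and the `swap_threshold` parameter are assumptions, and it models neither the extra activations caused by a swap (the weakness that Juggernaut exploits) nor the Secure Row-Swap changes that avoid them.

```python
import random


class RowSwapTable:
    """Toy indirection table that relocates a heavily activated row."""

    def __init__(self, num_rows, swap_threshold, seed=0):
        self.logical_to_physical = list(range(num_rows))
        self.act_count = [0] * num_rows       # idealized per-row tracker
        self.swap_threshold = swap_threshold  # assumed: well below the RH threshold
        self.rng = random.Random(seed)

    def activate(self, logical_row):
        """Return the physical row that was activated, then swap if needed."""
        physical = self.logical_to_physical[logical_row]
        self.act_count[logical_row] += 1
        if self.act_count[logical_row] >= self.swap_threshold:
            self._swap_with_random_row(logical_row)
            self.act_count[logical_row] = 0
        return physical

    def _swap_with_random_row(self, logical_row):
        # Pick a partner uniformly among all other rows.
        partner = self.rng.randrange(len(self.logical_to_physical) - 1)
        if partner >= logical_row:
            partner += 1
        table = self.logical_to_physical
        table[logical_row], table[partner] = table[partner], table[logical_row]
```

After `swap_threshold` activations the hammered logical row resides at a different, randomly chosen physical location, so an attacker can no longer rely on knowing which physical rows neighbour it.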
Randomized Line-to-Row Mapping for Low-Overhead Rowhammer Mitigations
Modern systems mitigate Rowhammer using victim refresh, which refreshes the
two neighbours of an aggressor row when it encounters a specified number of
activations. Unfortunately, complex attack patterns like Half-Double break
victim-refresh, rendering current systems vulnerable. Instead, recently
proposed secure Rowhammer mitigations rely on performing mitigative action on
the aggressor rather than the victims. Such schemes employ mitigative actions
such as row-migration or access-control and include AQUA, SRS, and Blockhammer.
While these schemes incur only modest slowdowns at Rowhammer thresholds of a few
thousand, they incur prohibitive slowdowns (15%-600%) for lower thresholds that
are likely in the near future. The goal of our paper is to make secure
Rowhammer mitigations practical at such low thresholds.
Our paper provides the key insight that benign applications encounter
thousands of hot rows (receiving more activations than the threshold) due to
the memory mapping, which places spatially proximate lines in the same row to
maximize row-buffer hit rate. Unfortunately, this causes a row to receive
activations for many frequently used lines. We propose Rubix, which breaks the
spatial correlation in the line-to-row mapping by using an encrypted address to
access the memory, reducing the likelihood of hot rows by 2 to 3 orders of
magnitude. To aid row-buffer hits, Rubix randomizes a group of 1-4 lines. We
also propose Rubix-D, which dynamically changes the line-to-row mapping.
Rubix-D minimizes hot-rows and makes it much harder for an adversary to learn
the spatial neighbourhood of a row. Rubix reduces the slowdown of AQUA (from
15% to 1%), SRS (from 60% to 2%), and Blockhammer (from 600% to 3%) while
incurring a storage overhead of less than 1 kilobyte.
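The effect of randomizing the line-to-row mapping can be illustrated with a toy address permutation. This sketch is not Rubix's design: the small Feistel network built on SHA-256 is a stand-in for the address cipher, and the key string, the 128 lines per row, the 4096-group memory, and the group size of 4 are all assumptions chosen for the example.

```python
import hashlib

LINES_PER_ROW = 128   # assumed: 8 KB row holding 64 B lines
GROUP_SIZE = 4        # lines kept together to retain some row-buffer hits
GROUP_BITS = 12       # assumed toy memory of 4096 line groups


def _round_fn(half, key, rnd, bits):
    """Keyed pseudo-random function used inside the Feistel rounds."""
    digest = hashlib.sha256(f"{key}:{rnd}:{half}".encode()).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << bits) - 1)


def permute_group(group, key, bits=GROUP_BITS, rounds=4):
    """Balanced Feistel network: a keyed bijection on group addresses."""
    half = bits // 2
    mask = (1 << half) - 1
    left, right = group >> half, group & mask
    for rnd in range(rounds):
        left, right = right, left ^ _round_fn(right, key, rnd, half)
    return (left << half) | right


def conventional_row(line):
    """Baseline mapping: consecutive lines share a row."""
    return line // LINES_PER_ROW


def randomized_row(line, key):
    """Scatter groups of GROUP_SIZE consecutive lines across rows."""
    group, offset = divmod(line, GROUP_SIZE)
    mapped_line = permute_group(group, key) * GROUP_SIZE + offset
    return mapped_line // LINES_PER_ROW
```

Under the conventional mapping, 128 consecutive lines all land in one row, so a hot region of memory concentrates its activations there. Under the randomized mapping, the same 128 lines are spread over many rows, while the lines within each group of four still share a row.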