Using Intelligent Prefetching to Reduce the Energy Consumption of a Large-scale Storage System
Many high-performance large-scale storage systems will experience significant workload increases as their user base and content availability grow over time. The U.S. Geological Survey (USGS) Earth Resources Observation and Science (EROS) center hosts one such system, which has recently undergone a period of rapid growth, with its user population growing nearly 400% in about three years. When administrators of these massive storage systems face the challenge of meeting the demands of an ever-increasing number of requests, the easiest solution is to integrate more advanced hardware into existing systems. However, additional investment in hardware may significantly increase the system cost as well as daily power consumption. In this paper, we present evidence that well-selected software-level optimization is capable of achieving comparable levels of performance without the cost and power consumption overhead caused by physically expanding the system. Specifically, we develop intelligent prefetching algorithms that are suitable for the unique workloads and user behaviors of the world's largest satellite image distribution system, managed by USGS EROS. Our experimental results, derived from real-world traces with over five million requests sent by users around the globe, show that the EROS hybrid storage system could maintain the same performance with energy savings of over 30% by utilizing our proposed prefetching algorithms, compared to the alternative solution of doubling the size of the current FTP server farm.
CacheZoom: How SGX Amplifies The Power of Cache Attacks
In modern computing environments, hardware resources are commonly shared, and
parallel computation is widely used. Parallel tasks can cause privacy and
security problems if proper isolation is not enforced. Intel proposed SGX to
create a trusted execution environment within the processor. SGX relies on the
hardware, and claims runtime protection even if the OS and other software
components are malicious. However, SGX disregards side-channel attacks. We
introduce a powerful cache side-channel attack that provides system adversaries
a high-resolution channel. Our attack tool, CacheZoom, is able to track virtually
all memory accesses of SGX enclaves with high spatial and temporal
precision. As proof of concept, we demonstrate AES key recovery attacks on
commonly used implementations including those that were believed to be
resistant in previous scenarios. Our results show that SGX cannot protect
critical data-sensitive computations, and efficient AES key recovery is
possible in a practical environment. In contrast to previous works which
require hundreds of measurements, this is the first cache side-channel attack
on a real system that can recover AES keys with a minimal number of
measurements. We can successfully recover AES keys from T-Table based
implementations with as few as ten measurements.
Comment: Accepted at Conference on Cryptographic Hardware and Embedded Systems
(CHES '17)
An Intelligent Framework for Oversubscription Management in CPU-GPU Unified Memory
This paper proposes a novel intelligent framework for oversubscription
management in CPU-GPU unified virtual memory (UVM). We analyze the current rule-based methods of GPU
memory oversubscription with unified memory, and the current learning-based
methods for other computer architectural components. We then identify the
performance gap between the existing rule-based methods and the theoretical
upper bound. We also identify the advantages of applying machine intelligence
and the limitations of the existing learning-based methods. The proposed
framework consists of an access pattern classifier followed by a pattern-specific
Transformer-based model using a novel loss function aimed at reducing page
thrashing. A policy engine is designed to leverage the model's result to
perform accurate page prefetching and pre-eviction. We evaluate our intelligent
framework on a set of 11 memory-intensive benchmarks from popular benchmark
suites. Our solution outperforms the state-of-the-art (SOTA) methods for
oversubscription management, reducing the number of pages thrashed by 64.4%
under 125% memory oversubscription compared to the baseline, while the SOTA
method reduces the number of pages thrashed by 17.3%. Our solution achieves an
average IPC improvement of 1.52X under 125% memory oversubscription and
3.66X under 150% memory oversubscription. It also outperforms the existing
learning-based methods for page address prediction, improving top-1 accuracy
by 6.45% on average (up to 41.2%) for a single GPGPU workload and by 10.2% on
average (up to 30.2%) for multiple concurrent GPGPU workloads.
Comment: arXiv admin note: text overlap with arXiv:2203.1267