Scrooge Attack: Undervolting ARM Processors for Profit
Latest ARM processors are approaching the computational power of x86
architectures while consuming much less energy. Consequently, supply follows
demand with Amazon EC2, Equinix Metal and Microsoft Azure offering ARM-based
instances, while Oracle Cloud Infrastructure is about to add such support. We
expect this trend to continue, with an increasing number of cloud providers
offering ARM-based cloud instances.
ARM processors are more energy-efficient, leading to substantial electricity
savings for cloud providers. However, a malicious cloud provider could
intentionally reduce the CPU voltage to further lower its costs. Running
applications malfunction when the voltage drops below critical thresholds.
By avoiding these critical voltage regions, a cloud provider can run undervolted
instances in a stealthy manner.
This practical experience report describes a novel attack scenario: an attack
launched by the cloud provider against its users to aggressively reduce the
processor voltage for saving energy to the last penny. We call it the Scrooge
Attack and show how it could be executed using ARM-based computing instances.
We mimic ARM-based cloud instances by deploying our own ARM-based devices using
different generations of Raspberry Pi. Using realistic and synthetic workloads,
we demonstrate up to which degree of aggressiveness the attack remains relevant. The
attack is unnoticeable by our detection method up to an offset of -50 mV. We
show that the attack may even remain completely stealthy for certain workloads.
Finally, we propose a set of client-based detection methods that can identify
undervolted instances. We support experimental reproducibility and provide
instructions to reproduce our results.
Comment: European Commission Project: LEGaTO - Low Energy Toolset for
Heterogeneous Computing (EC-H2020-780681)
Embedded System Optimization of Radar Post-processing in an ARM CPU Core
Algorithms executed on the radar processor system contribute to a significant performance bottleneck in the overall radar system. One key performance concern is
the latency in target detection when dealing with hard-deadline systems. Research has shown software optimization to be one major contributor to radar system performance
improvements. This thesis applies manual and automatic software optimizations and analyzes the results to inform future decisions
when working with an ARM processor system. To ascertain an optimized implementation, the question put forward was whether the algorithms on the ARM
processor could work with a 6-antenna implementation without a decline in performance. An answer would also help project how many additional
algorithms can still be added without a performance decline.
The manual optimization was based on a quantitative analysis of the software execution time. It used a vectorization
strategy based on the NEON vector registers of the ARM CPU to reimplement the initial Constant False Alarm Rate (CFAR) detection algorithm. An additional
optimization eliminated redundant loops while iterating over the range gates and Doppler filters. To determine the best compiler for automatic
code optimization of the radar algorithms on the ARM processor, the GCC and Clang compilers were used to compile both the initial algorithms and the optimized
implementation of the radar post-processing stage.
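As a rough illustration of the kind of redundant-loop elimination mentioned above, the following sketch shows a one-dimensional cell-averaging CFAR detector in plain Python. It is not the thesis's implementation: the abstract does not say which CFAR variant was used, the code does not use NEON, and the names `ca_cfar_naive`, `ca_cfar_sliding`, `num_train`, `num_guard` and `scale` are hypothetical. The first function re-sums the training cells for every cell under test; the second keeps running sums and updates them as the window slides.

```python
def ca_cfar_naive(power, num_train, num_guard, scale):
    """Cell-averaging CFAR that re-sums the training cells for every cell."""
    half = num_train + num_guard
    detections = []
    for i in range(half, len(power) - half):
        # Training cells lie on both sides of the cell under test,
        # separated from it by num_guard guard cells.
        lead = power[i - half : i - num_guard]
        lag = power[i + num_guard + 1 : i + half + 1]
        noise = (sum(lead) + sum(lag)) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections


def ca_cfar_sliding(power, num_train, num_guard, scale):
    """Same detector, but the inner summation loops are replaced by running sums."""
    n = len(power)
    half = num_train + num_guard
    detections = []
    if n < 2 * half + 1:
        return detections  # not enough cells for a single full window
    i = half
    lead_sum = sum(power[i - half : i - num_guard])
    lag_sum = sum(power[i + num_guard + 1 : i + half + 1])
    while True:
        noise = (lead_sum + lag_sum) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
        if i + 1 >= n - half:
            break
        # Slide both training windows one cell to the right:
        # add the cell entering each window, drop the cell leaving it.
        lead_sum += power[i - num_guard] - power[i - half]
        lag_sum += power[i + half + 1] - power[i + num_guard + 1]
        i += 1
    return detections


# Hypothetical test signal: flat noise floor with one strong target.
signal = [1] * 20
signal[10] = 50
print(ca_cfar_naive(signal, num_train=3, num_guard=1, scale=4))
print(ca_cfar_sliding(signal, num_train=3, num_guard=1, scale=4))
```

Both functions return the same indices; the sliding version does constant work per cell regardless of the training window size, which is the same idea a vectorized implementation would build on.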
Analysis of the optimization results showed that it is possible to run the radar post-processing algorithms on the ARM processor in the 6-antenna implementation
without system load stress. In addition, the results show an excellent headroom margin for the defined scenario. The analysis further revealed that the
effect of dynamic memory allocation should not be underestimated in situations where performance is a significant concern. The results also showed
that the GCC and Clang compilers each have their strengths and weaknesses when used for compilation. One limiting factor of the optimization using the
NEON registers is the effect of the sample size on the optimized implementation. Although it fits the test samples used in the defined scenario, results
may vary with different window cell sizes, and the optimization might not necessarily improve timing in those cases.