4,551 research outputs found
Less is More: Exploiting the Standard Compiler Optimization Levels for Better Performance and Energy Consumption
This paper presents the interesting observation that by performing fewer of
the optimizations available in a standard compiler optimization level such as
-O2, while preserving their original ordering, significant savings can be
achieved in both execution time and energy consumption. This observation has
been validated on two embedded processors, namely the ARM Cortex-M0 and the ARM
Cortex-M3, using two different versions of the LLVM compilation framework; v3.8
and v5.0. Experimental evaluation with 71 embedded benchmarks demonstrated
performance gains for at least half of the benchmarks for both processors. An
average execution time reduction of 2.4% and 5.3% was achieved across all the
benchmarks for the Cortex-M0 and Cortex-M3 processors, respectively, with
execution time improvements ranging from 1% up to 90% over the -O2. The savings
that can be achieved are in the same range as what can be achieved by the
state-of-the-art compilation approaches that use iterative compilation or
machine learning to select flags or to determine phase orderings that result in
more efficient code. In contrast to these time consuming and expensive to apply
techniques, our approach only needs to test a limited number of optimization
configurations, less than 64, to obtain similar or even better savings.
Furthermore, our approach can support multi-criteria optimization as it targets
execution time, energy consumption and code size at the same time.Comment: 15 pages, 3 figures, 71 benchmarks used for evaluatio
Lost in translation: Exposing hidden compiler optimization opportunities
Existing iterative compilation and machine-learning-based optimization
techniques have been proven very successful in achieving better optimizations
than the standard optimization levels of a compiler. However, they were not
engineered to support the tuning of a compiler's optimizer as part of the
compiler's daily development cycle. In this paper, we first establish the
required properties which a technique must exhibit to enable such tuning. We
then introduce an enhancement to the classic nightly routine testing of
compilers which exhibits all the required properties, and thus, is capable of
driving the improvement and tuning of the compiler's common optimizer. This is
achieved by leveraging resource usage and compilation information collected
while systematically exploiting prefixes of the transformations applied at
standard optimization levels. Experimental evaluation using the LLVM v6.0.1
compiler demonstrated that the new approach was able to reveal hidden
cross-architecture and architecture-dependent potential optimizations on two
popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom
BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the
insights from our approach enabled us to identify and remove a significant
shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.Comment: 31 pages, 7 figures, 2 table. arXiv admin note: text overlap with
arXiv:1802.0984
Chaotic Compilation for Encrypted Computing: Obfuscation but Not in Name
An `obfuscation' for encrypted computing is quantified exactly here, leading
to an argument that security against polynomial-time attacks has been achieved
for user data via the deliberately `chaotic' compilation required for security
properties in that environment. Encrypted computing is the emerging science and
technology of processors that take encrypted inputs to encrypted outputs via
encrypted intermediate values (at nearly conventional speeds). The aim is to
make user data in general-purpose computing secure against the operator and
operating system as potential adversaries. A stumbling block has always been
that memory addresses are data and good encryption means the encrypted value
varies randomly, and that makes hitting any target in memory problematic
without address decryption, yet decryption anywhere on the memory path would
open up many easily exploitable vulnerabilities. This paper `solves (chaotic)
compilation' for processors without address decryption, covering all of ANSI C
while satisfying the required security properties and opening up the field for
the standard software tool-chain and infrastructure. That produces the argument
referred to above, which may also hold without encryption.Comment: 31 pages. Version update adds "Chaotic" in title and throughout
paper, and recasts abstract and Intro and other sections of the text for
better access by cryptologists. To the same end it introduces the polynomial
time defense argument explicitly in the final section, having now set that
denouement out in the abstract and intr
- …