4,551 research outputs found

    Less is More: Exploiting the Standard Compiler Optimization Levels for Better Performance and Energy Consumption

    Get PDF
    This paper presents the interesting observation that by performing fewer of the optimizations available in a standard compiler optimization level such as -O2, while preserving their original ordering, significant savings can be achieved in both execution time and energy consumption. This observation has been validated on two embedded processors, namely the ARM Cortex-M0 and the ARM Cortex-M3, using two different versions of the LLVM compilation framework; v3.8 and v5.0. Experimental evaluation with 71 embedded benchmarks demonstrated performance gains for at least half of the benchmarks for both processors. An average execution time reduction of 2.4% and 5.3% was achieved across all the benchmarks for the Cortex-M0 and Cortex-M3 processors, respectively, with execution time improvements ranging from 1% up to 90% over the -O2. The savings that can be achieved are in the same range as what can be achieved by the state-of-the-art compilation approaches that use iterative compilation or machine learning to select flags or to determine phase orderings that result in more efficient code. In contrast to these time consuming and expensive to apply techniques, our approach only needs to test a limited number of optimization configurations, less than 64, to obtain similar or even better savings. Furthermore, our approach can support multi-criteria optimization as it targets execution time, energy consumption and code size at the same time.Comment: 15 pages, 3 figures, 71 benchmarks used for evaluatio

    Lost in translation: Exposing hidden compiler optimization opportunities

    Get PDF
    Existing iterative compilation and machine-learning-based optimization techniques have been proven very successful in achieving better optimizations than the standard optimization levels of a compiler. However, they were not engineered to support the tuning of a compiler's optimizer as part of the compiler's daily development cycle. In this paper, we first establish the required properties which a technique must exhibit to enable such tuning. We then introduce an enhancement to the classic nightly routine testing of compilers which exhibits all the required properties, and thus, is capable of driving the improvement and tuning of the compiler's common optimizer. This is achieved by leveraging resource usage and compilation information collected while systematically exploiting prefixes of the transformations applied at standard optimization levels. Experimental evaluation using the LLVM v6.0.1 compiler demonstrated that the new approach was able to reveal hidden cross-architecture and architecture-dependent potential optimizations on two popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the insights from our approach enabled us to identify and remove a significant shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.Comment: 31 pages, 7 figures, 2 table. arXiv admin note: text overlap with arXiv:1802.0984

    Inductive Logic Programming for Compiler Tuning

    Get PDF

    Chaotic Compilation for Encrypted Computing: Obfuscation but Not in Name

    Get PDF
    An `obfuscation' for encrypted computing is quantified exactly here, leading to an argument that security against polynomial-time attacks has been achieved for user data via the deliberately `chaotic' compilation required for security properties in that environment. Encrypted computing is the emerging science and technology of processors that take encrypted inputs to encrypted outputs via encrypted intermediate values (at nearly conventional speeds). The aim is to make user data in general-purpose computing secure against the operator and operating system as potential adversaries. A stumbling block has always been that memory addresses are data and good encryption means the encrypted value varies randomly, and that makes hitting any target in memory problematic without address decryption, yet decryption anywhere on the memory path would open up many easily exploitable vulnerabilities. This paper `solves (chaotic) compilation' for processors without address decryption, covering all of ANSI C while satisfying the required security properties and opening up the field for the standard software tool-chain and infrastructure. That produces the argument referred to above, which may also hold without encryption.Comment: 31 pages. Version update adds "Chaotic" in title and throughout paper, and recasts abstract and Intro and other sections of the text for better access by cryptologists. To the same end it introduces the polynomial time defense argument explicitly in the final section, having now set that denouement out in the abstract and intr
    corecore