11,311 research outputs found

    Embedded Program Annotations for WCET Analysis

    Get PDF
    We present __builtin_ais_annot(), a user-friendly, versatile way to transfer annotations (also known as flow facts) written on the source code level to the machine code level. To do so, we couple two tools often used during the development of safety-critical hard real-time systems, the formally verified C compiler CompCert and the static WCET analyzer aiT. CompCert stores the AIS annotations given via __builtin_ais_annot() in a special section of the ELF binary, which can later be extracted automatically by aiT

    ARM2GC: Succinct Garbled Processor for Secure Computation

    Get PDF
    We present ARM2GC, a novel secure computation framework based on Yao's Garbled Circuit (GC) protocol and the ARM processor. It allows users to develop privacy-preserving applications using standard high-level programming languages (e.g., C) and compile them using off-the-shelf ARM compilers (e.g., gcc-arm). The main enabler of this framework is the introduction of SkipGate, an algorithm that dynamically omits the communication and encryption cost of the gates whose outputs are independent of the private data. SkipGate greatly enhances the performance of ARM2GC by omitting costs of the gates associated with the instructions of the compiled binary, which is known by both parties involved in the computation. Our evaluation on benchmark functions demonstrates that ARM2GC not only outperforms the current GC frameworks that support high-level languages, it also achieves efficiency comparable to the best prior solutions based on hardware description languages. Moreover, in contrast to previous high-level frameworks with domain-specific languages and customized compilers, ARM2GC relies on standard ARM compiler which is rigorously verified and supports programs written in the standard syntax.Comment: 13 page

    Evaluating kernels on Xeon Phi to accelerate Gysela application

    Get PDF
    This work describes the challenges presented by porting parts ofthe Gysela code to the Intel Xeon Phi coprocessor, as well as techniques used for optimization, vectorization and tuning that can be applied to other applications. We evaluate the performance of somegeneric micro-benchmark on Phi versus Intel Sandy Bridge. Several interpolation kernels useful for the Gysela application are analyzed and the performance are shown. Some memory-bound and compute-bound kernels are accelerated by a factor 2 on the Phi device compared to Sandy architecture. Nevertheless, it is hard, if not impossible, to reach a large fraction of the peek performance on the Phi device,especially for real-life applications as Gysela. A collateral benefit of this optimization and tuning work is that the execution time of Gysela (using 4D advections) has decreased on a standard architecture such as Intel Sandy Bridge.Comment: submitted to ESAIM proceedings for CEMRACS 2014 summer school version reviewe

    Practical Sparse Matrices in C++ with Hybrid Storage and Template-Based Expression Optimisation

    Get PDF
    Despite the importance of sparse matrices in numerous fields of science, software implementations remain difficult to use for non-expert users, generally requiring the understanding of underlying details of the chosen sparse matrix storage format. In addition, to achieve good performance, several formats may need to be used in one program, requiring explicit selection and conversion between the formats. This can be both tedious and error-prone, especially for non-expert users. Motivated by these issues, we present a user-friendly and open-source sparse matrix class for the C++ language, with a high-level application programming interface deliberately similar to the widely used MATLAB language. This facilitates prototyping directly in C++ and aids the conversion of research code into production environments. The class internally uses two main approaches to achieve efficient execution: (i) a hybrid storage framework, which automatically and seamlessly switches between three underlying storage formats (compressed sparse column, Red-Black tree, coordinate list) depending on which format is best suited and/or available for specific operations, and (ii) a template-based meta-programming framework to automatically detect and optimise execution of common expression patterns. Empirical evaluations on large sparse matrices with various densities of non-zero elements demonstrate the advantages of the hybrid storage framework and the expression optimisation mechanism.Comment: extended and revised version of an earlier conference paper arXiv:1805.0338
    • …
    corecore