944 research outputs found

    Mitigating Software-Instrumentation Cache Effects in Measurement-Based Timing Analysis

    Get PDF
    Measurement-based timing analysis (MBTA) is often used to determine the timing behaviour of software programs embedded in safety-aware real-time systems deployed in various industrial domains including automotive and railway. MBTA methods rely on some form of instrumentation, either at hardware or software level, of the target program or fragments thereof to collect execution-time measurement data. A known drawback of software-level instrumentation is that instrumentation itself does affect the timing and functional behaviour of a program, resulting in the so-called probe effect: leaving the instrumentation code in the final executable can negatively affect average performance and could not be even admissible under stringent industrial qualification and certification standards; removing it before operation jeopardizes the results of timing analysis as the WCET estimates on the instrumented version of the program cannot be valid any more due, for example, to the timing effects incurred by different cache alignments. In this paper, we present a novel approach to mitigate the impact of instrumentation code on cache behaviour by reducing the instrumentation overhead while at the same time preserving and consolidating the results of timing analysis

    A Survey of Techniques for Improving Security of GPUs

    Full text link
    Graphics processing unit (GPU), although a powerful performance-booster, also has many security vulnerabilities. Due to these, the GPU can act as a safe-haven for stealthy malware and the weakest `link' in the security `chain'. In this paper, we present a survey of techniques for analyzing and improving GPU security. We classify the works on key attributes to highlight their similarities and differences. More than informing users and researchers about GPU security techniques, this survey aims to increase their awareness about GPU security vulnerabilities and potential countermeasures

    EPC Enacted: Integration in an Industrial Toolbox and Use against a Railway Application

    Get PDF
    Measurement-based timing analysis approaches are increasingly making their way into several industrial domains on account of their good cost-benefit ratio. The trustworthiness of those methods, however, suffers from the limitation that their results are only valid for the particular paths and execution conditions that the user is able to explore with the available input vectors. It is generally not possible to guarantee that the collected measurements are fully representative of the worst-case timing behaviour. In the context of measurement-based probabilistic timing analysis, the Extended Path Coverage (EPC) approach has been recently proposed as a means to extend the representativeness of measurement observations, to obtain the same effect of full path coverage. At the time of its first publication, EPC had not reached an implementation maturity that could be trialled industrially. In this work we analyze the practical implications of using EPC with real-world applications, and discuss the challenges in integrating it in an industrial-quality toolchain. We show that we were able to meet EPC requirements and successfully evaluate the technique on a real Railway application, on top of a commercial toolchain and full execution stack.This work has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement 611085 (PROXIMA, www.proxima-project.eu). This work has also been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. The authors are grateful to Antoine Colin from Rapita Ltd. for his precious support.Peer ReviewedPostprint (author's final draft

    Shining Light On Shadow Stacks

    Full text link
    Control-Flow Hijacking attacks are the dominant attack vector against C/C++ programs. Control-Flow Integrity (CFI) solutions mitigate these attacks on the forward edge,i.e., indirect calls through function pointers and virtual calls. Protecting the backward edge is left to stack canaries, which are easily bypassed through information leaks. Shadow Stacks are a fully precise mechanism for protecting backwards edges, and should be deployed with CFI mitigations. We present a comprehensive analysis of all possible shadow stack mechanisms along three axes: performance, compatibility, and security. For performance comparisons we use SPEC CPU2006, while security and compatibility are qualitatively analyzed. Based on our study, we renew calls for a shadow stack design that leverages a dedicated register, resulting in low performance overhead, and minimal memory overhead, but sacrifices compatibility. We present case studies of our implementation of such a design, Shadesmar, on Phoronix and Apache to demonstrate the feasibility of dedicating a general purpose register to a security monitor on modern architectures, and the deployability of Shadesmar. Our comprehensive analysis, including detailed case studies for our novel design, allows compiler designers and practitioners to select the correct shadow stack design for different usage scenarios.Comment: To Appear in IEEE Security and Privacy 201

    Tracing Hardware Monitors in the GR712RC Multicore Platform: Challenges and Lessons Learnt from a Space Case Study

    Get PDF
    The demand for increased computing performance is driving industry in critical-embedded systems (CES) domains, e.g. space, towards the use of multicores processors. Multicores, however, pose several challenges that must be addressed before their safe adoption in critical embedded domains. One of the prominent challenges is software timing analysis, a fundamental step in the verification and validation process. Monitoring and profiling solutions, traditionally used for debugging and optimization, are increasingly exploited for software timing in multicores. In particular, hardware event monitors related to requests to shared hardware resources are building block to assess and restraining multicore interference. Modern timing analysis techniques build on event monitors to track and control the contention tasks can generate each other in a multicore platform. In this paper we look into the hardware profiling problem from an industrial perspective and address both methodological and practical problems when monitoring a multicore application. We assess pros and cons of several profiling and tracing solutions, showing that several aspects need to be taken into account while considering the appropriate mechanism to collect and extract the profiling information from a multicore COTS platform. We address the profiling problem on a representative COTS platform for the aerospace domain to find that the availability of directly-accessible hardware counters is not a given, and it may be necessary to the develop specific tools that capture the needs of both the user’s and the timing analysis technique requirements. We report challenges in developing an event monitor tracing tool that works for bare-metal and RTEMS configurations and show the accuracy of the developed tool-set in profiling a real aerospace application. We also show how the profiling tools can be exploited, together with handcrafted benchmarks, to characterize the application behavior in terms of multicore timing interference.This work has been partially supported by a collaboration agreement between Thales Research and the Barcelona Supercomputing Center, and the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme (grant agreement No. 772773). MINECO partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC2013-14717).Peer ReviewedPostprint (published version

    Forecast-Based Interference : Modelling Multicore Interference from Observable Factors

    Get PDF
    While there is significant interest in the use of COTS multicore platforms for Real-time Systems, there has been very little in terms of practical methods to calculate the interference multiplier (i.e. the increase in execution time due to interference) between tasks on such systems. COTS multicore platforms present two distinct challenges: firstly, the variable interference between tasks competing for shared resources such as cache, and secondly the complexity of the hardware mechanisms and policies used, which may result in a system which is very difficult if not impossible to analyse; assuming that the exact details of the hardware are even disclosed! This paper proposes a new technique, Forecast-Based Interference analysis, which mitigates both of these issues by combining measurement-based techniques with statistical techniques and forecast modelling to enable the prediction of an interference multiplier for a given set of tasks, in an automated and reliable manner. The combination of execution times and interference multipliers can be used both in the design, e.g. for specifying timing watchdogs, and analysis, e.g. verifying schedulability

    Design and Implementation of a Time Predictable Processor: Evaluation With a Space Case Study

    Get PDF
    Embedded real-time systems like those found in automotive, rail and aerospace, steadily require higher levels of guaranteed computing performance (and hence time predictability) motivated by the increasing number of functionalities provided by software. However, high-performance processor design is driven by the average-performance needs of mainstream market. To make things worse, changing those designs is hard since the embedded real-time market is comparatively a small market. A path to address this mismatch is designing low-complexity hardware features that favor time predictability and can be enabled/disabled not to affect average performance when performance guarantees are not required. In this line, we present the lessons learned designing and implementing LEOPARD, a four-core processor facilitating measurement-based timing analysis (widely used in most domains). LEOPARD has been designed adding low-overhead hardware mechanisms to a LEON3 processor baseline that allow capturing the impact of jittery resources (i.e. with variable latency) in the measurements performed at analysis time. In particular, at core level we handle the jitter of caches, TLBs and variable-latency floating point units; and at the chip level, we deal with contention so that time-composable timing guarantees can be obtained. The result of our applied study with a Space application shows how per-resource jitter is controlled facilitating the computation of high-quality WCET estimates

    SMoTherSpectre: exploiting speculative execution through port contention

    Full text link
    Spectre, Meltdown, and related attacks have demonstrated that kernels, hypervisors, trusted execution environments, and browsers are prone to information disclosure through micro-architectural weaknesses. However, it remains unclear as to what extent other applications, in particular those that do not load attacker-provided code, may be impacted. It also remains unclear as to what extent these attacks are reliant on cache-based side channels. We introduce SMoTherSpectre, a speculative code-reuse attack that leverages port-contention in simultaneously multi-threaded processors (SMoTher) as a side channel to leak information from a victim process. SMoTher is a fine-grained side channel that detects contention based on a single victim instruction. To discover real-world gadgets, we describe a methodology and build a tool that locates SMoTher-gadgets in popular libraries. In an evaluation on glibc, we found hundreds of gadgets that can be used to leak information. Finally, we demonstrate proof-of-concept attacks against the OpenSSH server, creating oracles for determining four host key bits, and against an application performing encryption using the OpenSSL library, creating an oracle which can differentiate a bit of the plaintext through gadgets in libcrypto and glibc
    corecore