944 research outputs found
Mitigating Software-Instrumentation Cache Effects in Measurement-Based Timing Analysis
Measurement-based timing analysis (MBTA) is often used to determine the timing behaviour of software programs embedded in safety-aware real-time systems deployed in various industrial domains including automotive and railway. MBTA methods rely on some form of instrumentation, either at hardware or software level, of the target program or fragments thereof to collect execution-time measurement data. A known drawback of software-level instrumentation is that instrumentation itself does affect the timing and functional behaviour of a program, resulting in the so-called probe effect: leaving the instrumentation code in the final executable can negatively affect average performance and could not be even admissible under stringent industrial qualification and certification standards; removing it before operation jeopardizes the results of timing analysis as the WCET estimates on the instrumented version of the program cannot be valid any more due, for example, to the timing effects incurred by different cache alignments. In this paper, we present a novel approach to mitigate the impact of instrumentation code on cache behaviour by reducing the instrumentation overhead while at the same time preserving and consolidating the results of timing analysis
A Survey of Techniques for Improving Security of GPUs
Graphics processing unit (GPU), although a powerful performance-booster, also
has many security vulnerabilities. Due to these, the GPU can act as a
safe-haven for stealthy malware and the weakest `link' in the security `chain'.
In this paper, we present a survey of techniques for analyzing and improving
GPU security. We classify the works on key attributes to highlight their
similarities and differences. More than informing users and researchers about
GPU security techniques, this survey aims to increase their awareness about GPU
security vulnerabilities and potential countermeasures
EPC Enacted: Integration in an Industrial Toolbox and Use against a Railway Application
Measurement-based timing analysis approaches are increasingly making their way into several industrial domains on account of their good cost-benefit ratio. The trustworthiness of those methods, however, suffers from the limitation that their results are only valid for the particular paths and execution conditions that the user is able to explore with the available input vectors. It is generally not possible to guarantee that the collected measurements are fully representative of the worst-case timing behaviour. In the context of measurement-based probabilistic timing analysis, the Extended Path Coverage (EPC) approach has been recently proposed as a means to extend the representativeness of measurement observations, to obtain the same effect of full path coverage. At the time of its first publication, EPC had not reached an implementation maturity that could be trialled industrially. In this work we analyze the practical implications of using EPC with real-world applications, and discuss the challenges in integrating it in an industrial-quality toolchain. We show that we were able to meet EPC requirements and successfully evaluate the technique on a real Railway application, on top of a commercial toolchain and full execution stack.This work has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under
grant agreement 611085 (PROXIMA, www.proxima-project.eu). This work has also been partially supported by the Spanish
Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Jaume Abella has been partially supported by
the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. The authors are grateful to Antoine Colin from Rapita Ltd. for his precious support.Peer ReviewedPostprint (author's final draft
Shining Light On Shadow Stacks
Control-Flow Hijacking attacks are the dominant attack vector against C/C++
programs. Control-Flow Integrity (CFI) solutions mitigate these attacks on the
forward edge,i.e., indirect calls through function pointers and virtual calls.
Protecting the backward edge is left to stack canaries, which are easily
bypassed through information leaks. Shadow Stacks are a fully precise mechanism
for protecting backwards edges, and should be deployed with CFI mitigations. We
present a comprehensive analysis of all possible shadow stack mechanisms along
three axes: performance, compatibility, and security. For performance
comparisons we use SPEC CPU2006, while security and compatibility are
qualitatively analyzed. Based on our study, we renew calls for a shadow stack
design that leverages a dedicated register, resulting in low performance
overhead, and minimal memory overhead, but sacrifices compatibility. We present
case studies of our implementation of such a design, Shadesmar, on Phoronix and
Apache to demonstrate the feasibility of dedicating a general purpose register
to a security monitor on modern architectures, and the deployability of
Shadesmar. Our comprehensive analysis, including detailed case studies for our
novel design, allows compiler designers and practitioners to select the correct
shadow stack design for different usage scenarios.Comment: To Appear in IEEE Security and Privacy 201
Tracing Hardware Monitors in the GR712RC Multicore Platform: Challenges and Lessons Learnt from a Space Case Study
The demand for increased computing performance is driving industry in critical-embedded systems (CES) domains, e.g. space, towards the use of multicores processors. Multicores, however, pose several challenges that must be addressed before their safe adoption in critical embedded domains. One of the prominent challenges is software timing analysis, a fundamental step in the verification and validation process. Monitoring and profiling solutions, traditionally used for debugging and optimization, are increasingly exploited for software timing in multicores. In particular, hardware event monitors related to requests to shared hardware resources are building block to assess and restraining multicore interference. Modern timing analysis techniques build on event monitors to track and control the contention tasks can generate each other in a multicore platform. In this paper we look into the hardware profiling problem from an industrial perspective and address both methodological and practical problems when monitoring a multicore application. We assess pros and cons of several profiling and tracing solutions, showing that several aspects need to be taken into account while considering the appropriate mechanism to collect and extract the profiling information from a multicore COTS platform. We address the profiling problem on a representative COTS platform for the aerospace domain to find that the availability of directly-accessible hardware counters is not a given, and it may be necessary to the develop specific tools that capture the needs of both the user’s and the timing analysis technique requirements. We report challenges in developing an event monitor tracing tool that works for bare-metal and RTEMS configurations and
show the accuracy of the developed tool-set in profiling a real aerospace application. We also show how the profiling tools can be exploited, together with handcrafted benchmarks, to characterize the application behavior in terms of multicore timing interference.This work has been partially supported by a collaboration agreement between Thales Research and the Barcelona Supercomputing Center, and the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme (grant agreement No. 772773).
MINECO partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC2013-14717).Peer ReviewedPostprint (published version
Forecast-Based Interference : Modelling Multicore Interference from Observable Factors
While there is significant interest in the use of COTS multicore platforms for Real-time Systems, there has been very little in terms of practical methods to calculate the interference multiplier (i.e. the increase in execution time due to interference) between tasks on such systems. COTS multicore platforms present two distinct challenges: firstly, the variable interference between tasks competing for shared resources such as cache, and secondly the complexity of the hardware mechanisms and policies used, which may result in a system which is very difficult if not impossible to analyse; assuming that the exact details of the hardware are even disclosed! This paper proposes a new technique, Forecast-Based Interference analysis, which mitigates both of these issues by combining measurement-based techniques with statistical techniques and forecast modelling to enable the prediction of an interference multiplier for a given set of tasks, in an automated and reliable manner. The combination of execution times and interference multipliers can be used both in the design, e.g. for specifying timing watchdogs, and analysis, e.g. verifying schedulability
Design and Implementation of a Time Predictable Processor: Evaluation With a Space Case Study
Embedded real-time systems like those found in automotive, rail and aerospace, steadily require higher levels of guaranteed computing performance (and hence time predictability) motivated by the increasing number of functionalities provided by software. However, high-performance processor design is driven by the average-performance needs of mainstream market. To make things worse, changing those designs is hard since the embedded real-time market is comparatively a small market. A path to address this mismatch is designing low-complexity hardware features that favor time predictability and can be enabled/disabled not to affect average performance when performance guarantees are not required. In this line, we present the lessons learned designing and implementing LEOPARD, a four-core processor facilitating measurement-based timing analysis (widely used in most domains). LEOPARD has been designed adding low-overhead hardware mechanisms to a LEON3 processor baseline that allow capturing the impact of jittery resources (i.e. with variable latency) in the measurements performed at analysis time. In particular, at core level we handle the jitter of caches, TLBs and variable-latency floating point units; and at the chip level, we deal with contention so that time-composable timing guarantees can be obtained. The result of our applied study with a Space application shows how per-resource jitter is controlled facilitating the computation of high-quality WCET estimates
SMoTherSpectre: exploiting speculative execution through port contention
Spectre, Meltdown, and related attacks have demonstrated that kernels,
hypervisors, trusted execution environments, and browsers are prone to
information disclosure through micro-architectural weaknesses. However, it
remains unclear as to what extent other applications, in particular those that
do not load attacker-provided code, may be impacted. It also remains unclear as
to what extent these attacks are reliant on cache-based side channels.
We introduce SMoTherSpectre, a speculative code-reuse attack that leverages
port-contention in simultaneously multi-threaded processors (SMoTher) as a side
channel to leak information from a victim process. SMoTher is a fine-grained
side channel that detects contention based on a single victim instruction. To
discover real-world gadgets, we describe a methodology and build a tool that
locates SMoTher-gadgets in popular libraries. In an evaluation on glibc, we
found hundreds of gadgets that can be used to leak information. Finally, we
demonstrate proof-of-concept attacks against the OpenSSH server, creating
oracles for determining four host key bits, and against an application
performing encryption using the OpenSSL library, creating an oracle which can
differentiate a bit of the plaintext through gadgets in libcrypto and glibc
- …