158 research outputs found

    TimeWeaver: A Tool for Hybrid Worst-Case Execution Time Analysis

    Get PDF
    Many embedded control applications have real-time requirements. If the application is safety-relevant, worst-case execution time bounds have to be determined in order to demonstrate deadline adherence. For high-performance multi-core architectures with degraded timing predictability, WCET bounds can be computed by hybrid WCET analysis which combines static analysis with timing measurements. This article focuses on a novel tool for hybrid WCET analysis based on non-intrusive instruction-level real-time tracing

    Continuous Non-Intrusive Hybrid WCET Estimation Using Waypoint Graphs

    Get PDF
    Traditionally, the Worst-Case Execution Time (WCET) of Embedded Software has been estimated using analytical approaches. This is effective, if good models of the processor/System-on-Chip (SoC) architecture exist. Unfortunately, modern high performance SoCs often contain unpredictable and/or undocumented components that influence the timing behaviour. Thus, analytical results for such processors are unrealistically pessimistic. One possible alternative approach seems to be hybrid WCET analysis, where measurement data together with an analytical approach is used to estimate worst-case behaviour. Previously, we demonstrated how continuous evaluation of basic block trace data can be used to produce detailed statistics of basic blocks in embedded software. In the meantime it has become clear that the trace data provided by modern SoCs delivers a different type of information. In this contribution, we show that even under realistic conditions, a meaningful analysis can be conducted with the trace data

    Estimating the WCET of GPU-Accelerated Applications using Hybrid Analysis

    No full text

    Dynamic Memory Bandwidth Allocation for Real-Time GPU-Based SoC Platforms

    Get PDF
    Heterogeneous SoC platforms, comprising both general purpose CPUs and accelerators such as a GPU, are becoming increasingly attractive for real-time and mixed-criticality systems to cope with the computational demand of data parallel applications. However, contention for access to shared main memory can lead to significant performance degradation on both CPU and GPU. Existing work has shown that memory bandwidth throttling is effective in protecting real-time applications from memory-intensive, best-effort ones; however, due to the inherent pessimism involved in worst-case execution time estimation, such approaches can unduly restrict the bandwidth available to best-effort applications. In this work, we propose a novel memory bandwidth allocation scheme where we dynamically monitor the progress of a real-time application and increase the bandwidth share of best-effort ones whenever it is safe to do so. Specifically, we demonstrate our approach by protecting a real-time GPU kernel from best-effort CPU tasks. Based on profiling information, we first build a worst case execution time estimation model for the GPU kernel. Using such model, we then show how to dynamically recompute on-line the maximum memory budget that can be allocated to best-effort tasks without exceeding the kernel’s assigned execution budget. We implement our proposed technique on NVIDIA embedded SoC and demonstrate its effectiveness on a variety of GPU and CPU benchmarks

    Towards Multicore WCET Analysis

    Get PDF
    AbsInt is the leading provider of commercial tools for static code-level timing analysis. Its aiT Worst-Case Execution Time Analyzer computes tight bounds for the WCET of tasks in embedded real-time systems. However, the results only incorporate the core-local latencies, i.e. interference delays due to other cores in a multicore system are ignored. This paper presents some of the work we have done towards multicore WCET analysis. We look into both static and measurement-based timing analysis for COTS multicore systems

    Tracing Hardware Monitors in the GR712RC Multicore Platform: Challenges and Lessons Learnt from a Space Case Study

    Get PDF
    The demand for increased computing performance is driving industry in critical-embedded systems (CES) domains, e.g. space, towards the use of multicores processors. Multicores, however, pose several challenges that must be addressed before their safe adoption in critical embedded domains. One of the prominent challenges is software timing analysis, a fundamental step in the verification and validation process. Monitoring and profiling solutions, traditionally used for debugging and optimization, are increasingly exploited for software timing in multicores. In particular, hardware event monitors related to requests to shared hardware resources are building block to assess and restraining multicore interference. Modern timing analysis techniques build on event monitors to track and control the contention tasks can generate each other in a multicore platform. In this paper we look into the hardware profiling problem from an industrial perspective and address both methodological and practical problems when monitoring a multicore application. We assess pros and cons of several profiling and tracing solutions, showing that several aspects need to be taken into account while considering the appropriate mechanism to collect and extract the profiling information from a multicore COTS platform. We address the profiling problem on a representative COTS platform for the aerospace domain to find that the availability of directly-accessible hardware counters is not a given, and it may be necessary to the develop specific tools that capture the needs of both the user’s and the timing analysis technique requirements. We report challenges in developing an event monitor tracing tool that works for bare-metal and RTEMS configurations and show the accuracy of the developed tool-set in profiling a real aerospace application. We also show how the profiling tools can be exploited, together with handcrafted benchmarks, to characterize the application behavior in terms of multicore timing interference.This work has been partially supported by a collaboration agreement between Thales Research and the Barcelona Supercomputing Center, and the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme (grant agreement No. 772773). MINECO partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC2013-14717).Peer ReviewedPostprint (published version

    WE-HML: hybrid WCET estimation using machine learning for architectures with caches

    Get PDF
    International audienceModern processors raise a challenge for WCET estimation, since detailed knowledge of the processor microarchitecture is not available. This paper proposes a novel hybrid WCET estimation technique, WE-HML, in which the longest path is estimated using static techniques, whereas machine learning (ML) is used to determine the WCET of basic blocks. In contrast to existing literature using ML techniques for WCET estimation, WE-HML (i) operates on binary code for improved precision of learning, as compared to the related techniques operating at source code or intermediate code level; (ii) trains the ML algorithms on a large set of automatically generated programs for improved quality of learning; (iii) proposes a technique to take into account data caches. Experiments on an ARM Cortex-A53 processor show that for all benchmarks, WCET estimates obtained by WE-HML are larger than all possible execution times. Moreover, the cache modeling technique of WE-HML allows an improvement of 65 percent on average of WCET estimates compared to its cache-agnostic equivalent

    CAWET: Context-Aware Worst-Case Execution Time Estimation Using Transformers

    Get PDF
    This paper presents CAWET, a hybrid worst-case program timing estimation technique. CAWET identifies the longest execution path using static techniques, whereas the worst-case execution time (WCET) of basic blocks is predicted using an advanced language processing technique called Transformer-XL. By employing Transformers-XL in CAWET, the execution context formed by previously executed basic blocks is taken into account, allowing for consideration of the micro-architecture of the processor pipeline without explicit modeling. Through a series of experiments on the TacleBench benchmarks, using different target processors (Arm Cortex M4, M7, and A53), our method is demonstrated to never underestimate WCETs and is shown to be less pessimistic than its competitors
    • 

    corecore