
    Modelling and predicting extreme behavior in critical real-time systems with advanced statistics

    In the last decade, the market for Critical Real-Time Embedded Systems (CRTES) has grown significantly. According to Global Markets Insight [1], the embedded systems market will reach a total size of US $258 billion in 2023, at an average annual growth rate of 5.6%. The extensive use of CRTES in domains such as the automotive, aerospace and avionics industries demands ever-increasing performance [2]. To satisfy those requirements, the CRTES industry has adopted more complex processors, more memory modules, and accelerator units; these demanding performance requirements have thus led to a convergence of CRTES with high-performance systems. All of these industries work within the framework of CRTES, which places several restrictions on their design and implementation. Real-time systems are required to deliver a response to an event within a restricted time frame, or deadline. Real-time systems where missing a deadline provokes a total system failure (hard real-time systems) must satisfy guidelines and standards showing that they comply with tests of functional and timing behaviour. These standards depend on the industry: for instance, the automotive industry follows ISO 26262 [3] and the aerospace industry follows DO-178C [4]. Researchers have developed techniques to analyse the timing correctness of a CRTES. Here, we examine how these techniques perform on the estimation of the Worst-Case Execution Time (WCET), the maximum time that a particular piece of software takes to execute. Estimating its value is crucial from a timing analysis point of view. However, there is still no generalised, precise and safe method to produce WCET estimates [5]. In CRTES, WCET estimates cannot be lower than the true WCET, as they would be unsafe; but they cannot exceed it by a significant margin either, as they would be deemed pessimistic and impractical.
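    To make the safety/tightness trade-off concrete, below is a minimal sketch of the simplest measurement-based approach: taking the high-water mark of observed execution times and inflating it by an engineering margin. All values, including the 20% margin, are illustrative placeholders, not a method endorsed by any of the cited standards.

        # Minimal sketch of measurement-based WCET estimation: the high-water
        # mark of observed execution times is inflated by an engineering margin.
        # Values and the 20% margin are illustrative placeholders only.

        observed_times_us = [102.4, 98.7, 110.1, 105.3, 99.8]  # hypothetical measured runs

        high_water_mark = max(observed_times_us)   # largest observed execution time
        margin = 1.20                              # placeholder safety margin
        wcet_estimate = high_water_mark * margin

        print(f"high-water mark: {high_water_mark:.1f} us")
        print(f"WCET estimate:   {wcet_estimate:.1f} us")
        # An estimate below the true WCET is unsafe; one far above it is
        # pessimistic and impractical -- hence the need for methods that are
        # both provably safe and tight.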

    Software timing analysis for complex hardware with survivability and risk analysis

    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads to more complex and performance-demanding on-board software and the subsequent adoption of multicores and accelerators. This increases the dispersion of software execution times due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running on such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, the statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building on survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence of the tightness of the pWCET estimates obtained, which allows decreasing them reliably by 40% for a railway case study w.r.t. state-of-the-art exponential tails.

    This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2014-SGR-1051).
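    As a rough illustration of Weibull tail modelling (not the paper's exact methodology), the sketch below fits a Weibull distribution to execution-time exceedances over a threshold, peaks-over-threshold style, and reads off a pWCET estimate at a target exceedance probability. The synthetic data, the 95th-percentile threshold and the 1e-6 target are assumptions made for illustration.

        # Hedged sketch: Weibull tail fit over threshold exceedances, then a
        # pWCET estimate at a target exceedance probability. Illustrative only.
        import numpy as np
        from scipy.stats import weibull_min

        rng = np.random.default_rng(0)
        exec_times = rng.gamma(shape=9.0, scale=12.0, size=10_000)  # synthetic execution times

        threshold = np.quantile(exec_times, 0.95)            # placeholder tail threshold
        excesses = exec_times[exec_times > threshold] - threshold

        # Fit a two-parameter Weibull to the excesses (location fixed at 0).
        shape, _, scale = weibull_min.fit(excesses, floc=0)

        p_exceed = 1e-6                              # target exceedance probability
        p_tail = len(excesses) / len(exec_times)     # empirical P(X > threshold)
        # P(X > t) = p_tail * P(excess > t - threshold)  =>  solve for t:
        pwcet = threshold + weibull_min.isf(p_exceed / p_tail, shape, scale=scale)

        print(f"pWCET at {p_exceed:.0e} exceedance probability: {pwcet:.1f}")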

    Using Markov’s inequality with power-of-k function for probabilistic WCET estimation

    Deriving WCET estimates for software programs by probabilistic means (a.k.a. pWCET estimation) has received significant attention in recent years as a way to deal with the increased complexity of the processors used in real-time systems. Many works build on Extreme Value Theory (EVT), which is fed with a sample of the collected data (execution times). In its application, EVT carries two sources of uncertainty: the first is intrinsic to the EVT model and relates to determining the subset of the sample that belongs to the (upper) tail, and hence is actually used by EVT for prediction; the second is induced by the sampling process and is hence inherent to all sample-based methods. In this work, we show that Markov’s inequality can be used to obtain provably trustworthy probabilistic bounds on the tail of a distribution without incurring any model-intrinsic uncertainty. Yet, it produces pessimistic estimates, which we reduce substantially by using a power-of-k function instead of the default identity function in Markov’s inequality. Lastly, we propose a method to deal with sampling uncertainty for Markov’s inequality that consistently improves EVT estimates on synthetic and real data obtained from a railway application.

    This work has been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant PID2019-110854RB-I00 / AEI / 10.13039/501100011033 and by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 772773).
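    The core idea can be stated in one line: for a non-negative random variable X and any k > 0, applying Markov's inequality to X^k yields P(X >= t) = P(X^k >= t^k) <= E[X^k] / t^k, and larger k can tighten the bound considerably. The sketch below evaluates this bound on synthetic execution times, estimating E[X^k] empirically; it therefore ignores the sampling uncertainty that the paper additionally addresses.

        # Hedged sketch of the power-of-k Markov bound on synthetic data:
        # P(X >= t) <= E[X^k] / t^k, with E[X^k] estimated from the sample.
        import numpy as np

        rng = np.random.default_rng(1)
        exec_times = rng.gamma(shape=9.0, scale=12.0, size=10_000)  # synthetic sample

        t = 250.0  # candidate WCET threshold to bound (placeholder)

        def markov_bound(samples, t, k):
            """Empirical Markov bound on P(X >= t) using the power-of-k function."""
            return np.mean(samples ** k) / t ** k

        # k = 1 recovers plain Markov; larger k can sharpen the bound considerably.
        for k in (1, 2, 5, 10, 20):
            print(f"k={k:2d}: P(X >= {t:.0f}) <= {markov_bound(exec_times, t, k):.3e}")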

    HRM: merging hardware event monitors for improved timing analysis of complex MPSoCs

    The Performance Monitoring Unit (PMU) in MPSoCs is at the heart of the latest measurement-based timing analysis techniques for Critical Embedded Systems. In particular, hardware event monitors (HEMs) in the PMU are used as building blocks in the process of budgeting and verifying software timing by tracking and controlling access counts to shared resources. While the number of HEMs in current MPSoCs reaches hundreds, they are read via Performance Monitoring Counters (PMCs) whose number is usually limited to 4-8, thus requiring multiple runs of each experiment to collect all desired HEMs. Despite engineers' efforts to control the execution conditions of each experiment, the complexity of current MPSoCs makes it arguably impossible to completely remove the noise affecting each run. As a result, HEMs read in different runs are subject to different variability, and hence HEMs captured in different runs cannot be ‘blindly’ merged. In this work, we focus on the NXP T2080 platform, where we observed up to 59% variability across different runs of the same experiment for some relevant HEMs (e.g. processor cycles). We develop a HEM reading and merging (HRM) approach to reliably merge HEMs across different runs as a fundamental element of any measurement-based timing budgeting and verification technique. Our method builds on order statistics and on the selection of an anchor HEM, read in all runs, to derive the most plausible combination of HEM readings that keeps the distribution of each HEM and its relationship with the anchor HEM intact.

    This work has been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB, by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 772773), and by the HiPEAC Network of Excellence.
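    A minimal sketch of the order-statistics intuition, under the assumption (this sketch's, not necessarily the paper's exact procedure) that rank-matching on the anchor HEM is a reasonable alignment: readings from two run groups, each co-reading the anchor with a different HEM, are paired by the rank of their anchor reading, so each HEM's distribution and its relation to the anchor are preserved.

        # Hedged sketch of anchor-based rank-matching across run groups.
        # Data and grouping are illustrative, not from the paper.
        import numpy as np

        rng = np.random.default_rng(2)
        n_runs = 20

        # Run group A co-reads (anchor, hem_x); group B co-reads (anchor, hem_y).
        anchor_a = rng.normal(1_000_000, 50_000, n_runs)
        hem_x    = 0.30 * anchor_a + rng.normal(0, 5_000, n_runs)  # correlated with anchor
        anchor_b = rng.normal(1_000_000, 50_000, n_runs)
        hem_y    = 0.10 * anchor_b + rng.normal(0, 2_000, n_runs)

        # Rank-match: pair the i-th smallest anchor reading of group A with the
        # i-th smallest anchor reading of group B, carrying the co-read HEMs along.
        order_a = np.argsort(anchor_a)
        order_b = np.argsort(anchor_b)
        merged = np.column_stack([anchor_a[order_a], hem_x[order_a], hem_y[order_b]])

        print("rank-matched (anchor, hem_x, hem_y) rows:")
        print(merged[:5].round(0))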

    MUCH: exploiting pairwise hardware event monitor correlations for improved timing analysis of complex MPSoCs

    Measurement-based timing analysis techniques increasingly rely on the Performance Monitoring Units (PMUs) of MPSoCs, as these units implement specialised Hardware Event Monitors (HEMs) that convey detailed information about multicore interference in hardware shared resources. Unfortunately, there is an evident mismatch between the large number of HEMs (typically several hundred) and the comparatively small number (normally fewer than ten) of Performance Monitoring Counters (PMCs) that can be configured to track HEMs in the PMU. Timing analysis normally requires observing a non-negligible number of HEMs per task from the same execution. However, due to the small number of PMCs, HEMs are necessarily collected across multiple runs that, despite being intended to repeat the same experiment, exhibit significant variability (above 50% for some HEMs in relevant MPSoCs) caused by platform-intrinsic execution conditions. Therefore, blindly merging HEMs from different runs is not acceptable, since they may easily correspond to significantly different conditions. To tackle this issue, the HRM approach was recently proposed to merge HEMs from different runs while accurately preserving their correlation w.r.t. one anchor HEM (i.e. processor cycles), building on order statistics. However, HRM does not always preserve the correlation between other pairs of HEMs, which may be lost to a large extent. This paper copes with HRM's limitations by proposing the MUlti-Correlation HEM reading and merging approach (MUCH). MUCH builds on multivariate Gaussian distributions to merge HEMs from different runs while preserving the pairwise correlations across each individual pair of HEMs simultaneously. Our results on an NXP T2080 MPSoC used for avionics systems show that MUCH largely outperforms HRM for an identical number of input runs.

    This work has been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB, by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 772773), and by the HiPEAC Network of Excellence.
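    A hedged sketch of the multivariate-Gaussian idea: pairwise covariances are estimated from the run groups in which each pair of HEMs was actually co-read (the anchor appears in every group), assembled into a single covariance matrix, and merged readings are drawn from the joint Gaussian so that all pairwise correlations are preserved simultaneously. The anchor-mediated guess for the unobserved (hem_x, hem_y) covariance is an assumption of this sketch, not the paper's procedure.

        # Hedged sketch of covariance assembly and joint-Gaussian merging.
        # All numbers are synthetic; this is illustrative only.
        import numpy as np

        rng = np.random.default_rng(3)
        n_runs = 200

        # Ground-truth joint behaviour of (anchor, hem_x, hem_y) -- unknown in practice.
        true_mean = np.array([1_000_000.0, 300_000.0, 100_000.0])
        true_cov = np.array([[2.5e9, 1.2e9, 0.6e9],
                             [1.2e9, 1.0e9, 0.4e9],
                             [0.6e9, 0.4e9, 0.5e9]])

        # Group A co-reads (anchor, hem_x); group B co-reads (anchor, hem_y).
        group_a = rng.multivariate_normal(true_mean[[0, 1]],
                                          true_cov[np.ix_([0, 1], [0, 1])], n_runs)
        group_b = rng.multivariate_normal(true_mean[[0, 2]],
                                          true_cov[np.ix_([0, 2], [0, 2])], n_runs)

        # Estimate means and the pairwise covariances that were actually observed.
        mean_hat = np.array([group_a[:, 0].mean(), group_a[:, 1].mean(), group_b[:, 1].mean()])
        cov_hat = np.eye(3)
        cov_hat[np.ix_([0, 1], [0, 1])] = np.cov(group_a, rowvar=False)
        cov_hat[np.ix_([0, 2], [0, 2])] = np.cov(group_b, rowvar=False)
        # The (hem_x, hem_y) covariance was never co-read; as an assumption we use
        # the anchor-mediated guess cov(x,y) ~= cov(x,a) * cov(a,y) / var(a).
        cov_hat[1, 2] = cov_hat[2, 1] = cov_hat[0, 1] * cov_hat[0, 2] / cov_hat[0, 0]

        merged = rng.multivariate_normal(mean_hat, cov_hat, n_runs)
        print("correlation matrix of merged (anchor, hem_x, hem_y) readings:")
        print(np.corrcoef(merged, rowvar=False).round(2))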