
    Software timing analysis for complex hardware with survivability and risk analysis

    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads to more complex and performance-demanding on-board software and to the subsequent adoption of multicores and accelerators. This causes software's execution-time dispersion to increase due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running on such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, the statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building upon survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence of the tightness of the pWCET estimates obtained, which can be reliably reduced by 40% for a railway case study with respect to state-of-the-art exponential tails.

    This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by the Generalitat de Catalunya (contract 2014-SGR-1051).
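
    The Weibull-tail idea lends itself to a compact illustration. The following sketch, in Python, fits a Weibull distribution to the exceedances of measured execution times over a high threshold and reads off a pWCET estimate at a target exceedance probability; the synthetic measurements, the threshold choice and the per-run exceedance probability are illustrative assumptions and do not reproduce the paper's exact survivability-based procedure.

        # Fit a Weibull tail to measured execution times and read off a pWCET
        # estimate at a target per-run exceedance probability.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        exec_times = rng.gamma(shape=9.0, scale=110.0, size=5000)   # placeholder measurements (cycles)

        threshold = np.quantile(exec_times, 0.95)                   # keep only the upper tail
        exceedances = exec_times[exec_times > threshold] - threshold
        p_tail = len(exceedances) / len(exec_times)                 # P(X > threshold)

        # Weibull fit to the exceedances, with the location pinned at zero.
        shape_c, loc, scale = stats.weibull_min.fit(exceedances, floc=0.0)

        def pwcet(exceedance_prob):
            """Execution-time bound exceeded with probability exceedance_prob per run."""
            if exceedance_prob >= p_tail:                           # bound lies below the threshold
                return np.quantile(exec_times, 1.0 - exceedance_prob)
            # Conditional tail quantile from the Weibull fit, rescaled by P(X > threshold).
            q = stats.weibull_min.ppf(1.0 - exceedance_prob / p_tail, shape_c, loc, scale)
            return threshold + q

        print("pWCET at an exceedance probability of 1e-12 per run:", pwcet(1e-12))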

    A confidence assessment of WCET estimates for software time randomized caches

    Obtaining Worst-Case Execution Time (WCET) estimates is a required step during software verification of real-time embedded systems. Measurement-Based Probabilistic Timing Analysis (MBPTA) aims at obtaining WCET estimates for industrial-size software running on hardware platforms comprising high-performance features. MBPTA relies on randomizing the timing behavior (functional behavior is left unchanged) of hard-to-predict events, such as the location of objects in memory and hence their associated cache behavior, that significantly impact software's WCET estimates. Software time-randomized caches (sTRc) have recently been proposed to enable MBPTA on top of commercial off-the-shelf (COTS) caches (e.g. with modulo placement). However, some random events may challenge the reliability of MBPTA on top of sTRc. In this paper, for sTRc and programs with homogeneously accessed addresses, we determine whether the number of observations taken at analysis time, as part of the normal MBPTA application process, captures the cache events that significantly impact execution time and WCET. If this is not the case, our techniques provide the user with the number of extra runs to perform to guarantee that those cache events are captured and MBPTA can be applied reliably. Our techniques are evaluated with synthetic benchmarks and an avionics application.

    The research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] under the PROXIMA Project (www.proxima-project.eu), grant agreement no. 611085. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316, the HiPEAC Network of Excellence, and COST Action IC1202: Timing Analysis On Code-Level (TACLe). Jaume Abella has been partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.
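
    The core of the observation-coverage argument can be sketched with elementary probability: if a cache event of interest arises in a run with probability p, the chance of seeing it at least once in R runs is 1 - (1 - p)^R, from which the number of extra runs needed to reach a given confidence follows. The probability and confidence values below are illustrative assumptions, and the sketch does not reproduce the paper's derivation of p for sTRc.

        # How many runs are needed to observe, with confidence 1 - epsilon, a cache
        # event that arises in a single run with probability p_event?
        import math

        def runs_needed(p_event, epsilon):
            """Smallest R such that P(event seen at least once in R runs) >= 1 - epsilon."""
            return math.ceil(math.log(epsilon) / math.log(1.0 - p_event))

        def extra_runs(runs_done, p_event, epsilon=1e-9):
            """Extra runs to request from the user; 0 if the campaign already suffices."""
            return max(0, runs_needed(p_event, epsilon) - runs_done)

        print(extra_runs(runs_done=1000, p_event=0.005))   # a few thousand extra runs in this example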

    Modelling probabilistic cache representativeness in the presence of arbitrary access patterns

    Measurement-Based Probabilistic Timing Analysis (MBPTA) is a promising, powerful and industry-friendly method to derive the worst-case execution time (WCET) estimates needed for critical real-time embedded systems. MBPTA performs several (R) runs of the program on the target platform, collecting the execution time of each run. MBPTA builds a probabilistic representativeness argument on whether the events with high impact on execution time, such as cache misses, arise in the runs made at analysis time, so that their impact on execution time is captured. So far, only events occurring in cache memories have been shown to challenge such a representativeness argument. In this context, this paper introduces a representativeness validation method (RVS) to assess the probabilistic representativeness of MBPTA's execution-time observations in terms of cache behaviour. RVS resorts to cache simulation to predict worst-case miss scenarios that can appear during the deployment phase. RVS also constructs a probabilistic Worst-Case Miss Count curve based on the miss counts captured in the R runs. If that curve upper-bounds the impact of the predicted cache worst-case scenarios, R is deemed a sufficient number of runs from which pWCET estimates can be reliably derived. Otherwise, the user is requested to perform more runs until all cache scenarios of interest are captured.
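
    The coverage check at the heart of RVS can be illustrated as follows: miss counts observed across the R analysis runs yield a probabilistic worst-case miss-count curve, which is then compared against the worst-case miss scenario predicted by cache simulation. The way the curve is built below (a plain empirical quantile) and all numbers are illustrative assumptions, not the paper's RVS construction.

        # Coverage check: does the curve built from the R analysis runs upper-bound
        # the worst-case miss count predicted by cache simulation?
        import numpy as np

        def pwcmc(miss_counts, exceedance_prob):
            """Illustrative miss-count bound exceeded with probability exceedance_prob."""
            return np.quantile(miss_counts, 1.0 - exceedance_prob)

        def runs_are_representative(miss_counts, predicted_worst_case, exceedance_prob=1e-3):
            """True if the observed runs cover the simulated worst-case miss scenario."""
            return pwcmc(miss_counts, exceedance_prob) >= predicted_worst_case

        observed = np.random.default_rng(1).poisson(lam=120, size=1000)  # placeholder: R = 1000 runs
        print(runs_are_representative(observed, predicted_worst_case=150))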

    A software controlled voltage tuning system using multi-purpose ring oscillators

    This paper presents a novel software-driven voltage tuning method that utilises multi-purpose Ring Oscillators (ROs) to provide energy reductions that are sensitive to process variation and the operating environment. The proposed technique enables voltage tuning based on the observed frequency of the ROs, taken as a representation of the device speed and used to estimate a safe minimum operating voltage at a given core frequency. A conservative linear relationship between RO frequency and silicon speed is used to approximate the critical path of the processor. Using a multi-purpose RO not specifically implemented for critical-path characterisation is a unique approach to voltage tuning. The parameters governing the relationship between RO frequency and silicon speed are obtained by testing a sample of processors from different wafer regions; these parameters can then be used on all devices of that model. The tuning method and software control framework are demonstrated on a sample of XMOS XS1-U8A-64 embedded microprocessors, yielding a dynamic power saving of up to 25% with no performance reduction and no negative impact on the real-time constraints of the embedded software running on the processor.
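
    The voltage-selection step can be sketched as a conservative linear mapping from observed RO frequency (a proxy for silicon speed) to a safe minimum supply voltage at the target core frequency. The coefficients, guard band and voltage limits below are illustrative assumptions; in the method described above they would be obtained by characterising a sample of devices from different wafer regions.

        # Map the observed RO frequency to a safe minimum supply voltage at the
        # current core frequency: faster silicon tolerates a lower voltage.
        def safe_voltage_mv(ro_freq_mhz, slope_mv_per_mhz=-1.8, intercept_mv=1500.0,
                            guard_band_mv=30.0, v_min_mv=800.0, v_max_mv=1100.0):
            """Conservative linear model plus guard band, clamped to the regulator's range."""
            v = intercept_mv + slope_mv_per_mhz * ro_freq_mhz + guard_band_mv
            return min(max(v, v_min_mv), v_max_mv)

        # A device whose ROs run at 260 MHz at the current operating point.
        print(safe_voltage_mv(260.0))   # 1062.0 mV with these illustrative parameters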

    Cross-layer system reliability assessment framework for hardware faults

    System reliability estimation during early design phases facilitates informed decisions on the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored into such an estimation model, the delivered reliability reports are bound to be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools for accurate yet fast estimation of computing systems' reliability. The backbone of the methodology is a component-based Bayesian model, which effectively calculates system reliability based on the masking probabilities of individual hardware and software components while considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault-injection campaigns at the microarchitecture level.
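
    The cross-layer composition idea can be sketched as follows: a raw fault rate per hardware component is attenuated by the masking probability of each abstraction layer a fault must traverse, and the surviving rates are summed into a system-level FIT estimate. The sketch assumes independent layers for brevity, whereas the paper's Bayesian model also captures interactions between components; all component names, rates and masking probabilities are illustrative assumptions.

        # Compose per-layer masking probabilities into a system FIT estimate.
        RAW_FIT = {            # raw fault arrival rate per component (failures / 1e9 hours)
            "register_file": 120.0,
            "l1_cache": 450.0,
            "alu": 60.0,
        }
        MASKING = {            # probability a raw fault is masked at each layer
            "register_file": {"circuit": 0.20, "microarchitecture": 0.55, "software": 0.40},
            "l1_cache":      {"circuit": 0.10, "microarchitecture": 0.80, "software": 0.30},
            "alu":           {"circuit": 0.25, "microarchitecture": 0.35, "software": 0.50},
        }

        def system_fit(raw_fit, masking):
            """Sum, over components, of the raw rate times the probability no layer masks the fault."""
            total = 0.0
            for comp, fit in raw_fit.items():
                survive = 1.0
                for layer_masking in masking[comp].values():
                    survive *= (1.0 - layer_masking)   # a fault must escape every layer to cause a failure
                total += fit * survive
            return total

        print(f"Estimated system FIT: {system_fit(RAW_FIT, MASKING):.1f}")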

    High-Integrity Performance Monitoring Units in Automotive Chips for Reliable Timing V&V

    As software continues to control more system-critical functions in cars, its timing is becoming an integral element of functional safety. Timing validation and verification (V&V) assesses software's end-to-end timing measurements against given budgets. The advent of multicore processors with massive resource sharing reduces the significance of end-to-end execution times for timing V&V and requires reasoning on (worst-case) access delays on contention-prone hardware resources. While Performance Monitoring Units (PMUs) support this finer-grained reasoning, their design has never been a prime consideration in high-performance processors, from which automotive-chip PMU implementations descend, since the PMU does not directly affect performance or reliability. Given the PMU's instrumental importance for timing V&V, we advocate for PMUs in automotive chips that explicitly track activities related to worst-case (rather than average) software behavior, that are recognized as a mandatory high-integrity hardware service under ISO 26262, and that are accompanied by detailed documentation enabling their effective use to derive reliable timing estimates.

    This work has also been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. Enrico Mezzetti has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva-Incorporación postdoctoral fellowship number IJCI-2016-27396.
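
    The finer-grained reasoning such PMUs would enable can be sketched with a simple contention bound: per-resource access counts read from (hypothetical) PMU counters are multiplied by worst-case per-access interference delays and summed, giving a delay to add on top of the measured solo execution time. Counter names and latencies below are illustrative assumptions, not an actual automotive PMU interface.

        # Bound the contention delay a task can suffer on shared resources,
        # using per-resource access counts read from hypothetical PMU counters.
        PMU_COUNTS = {                 # accesses observed for the task under analysis
            "bus_requests": 18000,
            "l2_accesses": 5200,
            "dram_requests": 900,
        }
        WORST_CASE_DELAY_CYCLES = {    # maximum per-access delay caused by co-runners
            "bus_requests": 8,
            "l2_accesses": 15,
            "dram_requests": 60,
        }

        def contention_bound_cycles(counts, per_access_delay):
            """Upper bound: every access suffers its worst-case interference."""
            return sum(counts[res] * per_access_delay[res] for res in counts)

        bound = contention_bound_cycles(PMU_COUNTS, WORST_CASE_DELAY_CYCLES)
        print(f"Contention delay bound: {bound} cycles")   # added on top of the solo execution time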