
    WCET analysis of multi-level set-associative instruction caches

    With the advent of increasingly complex hardware in real-time embedded systems (processors with performance-enhancing features such as pipelines, cache hierarchy, multiple cores), many processors now have a set-associative L2 cache. Thus, there is a need for considering cache hierarchies when validating the temporal behavior of real-time systems, in particular when estimating tasks' worst-case execution times (WCETs). To the best of our knowledge, there is only one approach for WCET estimation for systems with cache hierarchies [Mueller, 1997], which turns out to be unsafe for set-associative caches. In this paper, we highlight the conditions under which the approach described in [Mueller, 1997] is unsafe. A safe static instruction cache analysis method is then presented. Contrary to [Mueller, 1997], our method supports set-associative and fully associative caches. The proposed method is evaluated on medium-size and large programs. We show that the method is tight in most cases. We further show that in all cases WCET estimates are much tighter when considering the cache hierarchy than when considering only the L1 cache. An evaluation of the analysis time is conducted, demonstrating that analysing the cache hierarchy has a reasonable computation time.
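
    As a rough illustration of why modelling the cache hierarchy pays off (a minimal sketch under assumed latencies and classification labels, not the paper's actual analysis): once a static analysis has classified each instruction fetch at every level as always-hit (AH), always-miss (AM), or not-classified (NC), the worst-case fetch latency can be bounded level by level.

        # Minimal sketch: bound the worst-case latency of one instruction fetch
        # from per-level classifications produced by a static cache analysis.
        # Latencies (in cycles) are illustrative assumptions, not measured values.
        L1_LATENCY, L2_LATENCY, MEM_LATENCY = 1, 10, 100

        def worst_case_fetch_latency(l1_class, l2_class):
            latency = L1_LATENCY                # the L1 cache is always looked up
            if l1_class == "AH":
                return latency                  # guaranteed L1 hit
            latency += L2_LATENCY               # AM or NC: assume the fetch reaches L2
            if l2_class == "AH":
                return latency                  # guaranteed L2 hit
            return latency + MEM_LATENCY        # otherwise assume main memory is reached

        # Considering only the L1 cache would charge MEM_LATENCY to every non-AH access;
        # an L2 "AH" classification tightens the bound from 111 to 11 cycles here.
        print(worst_case_fetch_latency("NC", "AH"))   # 11
        print(worst_case_fetch_latency("NC", "NC"))   # 111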

    Accurate analysis of memory latencies for WCET estimation

    In recent years, many researchers have proposed solutions to estimate the Worst-Case Execution Time (WCET) of a critical application when it is run on modern hardware. Several schemes commonly implemented to improve performance have been considered so far in the context of static WCET analysis: pipelines, instruction caches, dynamic branch predictors, execution cores supporting out-of-order execution, etc. Comparatively, components that are external to the processor have received less attention. In particular, the latency of memory accesses is generally considered as a fixed value. However, modern DRAM devices support the open-page policy, which reduces the memory latency when successive memory accesses address the same memory row. This scheme, also known as the row buffer, induces variable memory latencies, depending on whether the access hits or misses in the row buffer. In this paper, we propose an algorithm to take the open-page policy into account when estimating WCETs for a processor with an instruction cache. Experimental results show that WCET estimates are refined thanks to the consideration of tighter memory latencies instead of pessimistic values.
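
    To make the variable-latency effect concrete, the sketch below shows a toy open-page (row-buffer) latency model: an access to the currently open row pays only the column access, while an access to a different row additionally pays precharge and activation. Timing parameters and names are assumptions for illustration, not values used in the paper.

        # Toy open-page DRAM latency model (illustrative timings, in cycles).
        T_CAS, T_RCD, T_RP = 10, 10, 10   # column access, row activation, precharge

        class DramBank:
            def __init__(self):
                self.open_row = None          # row currently held in the row buffer

            def access(self, row):
                if row == self.open_row:      # row-buffer hit: column access only
                    return T_CAS
                # row-buffer miss: precharge the open row (if any), then activate
                latency = (T_RP if self.open_row is not None else 0) + T_RCD + T_CAS
                self.open_row = row
                return latency

        bank = DramBank()
        print(bank.access(row=42))   # 20: first activation of row 42
        print(bank.access(row=42))   # 10: row-buffer hit
        print(bank.access(row=7))    # 30: precharge + activation + column access

    A static WCET analysis in the spirit of the paper must prove, per access, whether a row-buffer hit is guaranteed before the cheaper latency can be used in the bound.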

    Automatic Safe Data Reuse Detection for the WCET Analysis of Systems With Data Caches

    Worst-case execution time (WCET) analysis of systems with data caches is one of the key challenges in real-time systems. Caches exploit the inherent reuse properties of programs, temporarily storing certain memory contents near the processor so that further accesses to such contents do not require costly memory transfers. Current worst-case data cache analysis methods focus on specific cache organizations (LRU, locked, ACDC, etc.). In this article, we analyze data reuse (in the worst case) as a property of the program, and thus independent of the data cache. Our analysis method uses Abstract Interpretation on the compiled program to extract, for each static load/store instruction, a linear expression for the address pattern of its data accesses, according to the Loop Nest Data Reuse Theory. Each data access expression is compared to that of prior (dominant) memory instructions to verify whether it presents a guaranteed reuse. Our proposal handles references to scalars, arrays, and non-linear accesses, provides both temporal and spatial reuse information, and does not require the exploration of explicit data access sequences. As a proof of concept we analyze the TACLeBench benchmark suite, showing that most loads/stores present data reuse, and how compiler optimizations affect it. Using a simple hit/miss estimation on our reuse results, the time devoted to data accesses in the worst case is reduced to 27% compared to an always-miss system, equivalent to a data hit ratio of 81%. With compiler optimization, such time is reduced to 6.5%.
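
    As a rough sketch of the kind of symbolic comparison described above (not the paper's actual algorithm), two affine address expressions over the same loop indices can be compared directly: identical expressions imply temporal reuse, while a constant distance smaller than the line size implies spatial reuse. The representation and the 64-byte line size are assumptions.

        # Sketch: detect guaranteed reuse between two affine address expressions of the
        # form  base + sum(coeff_i * loop_index_i), encoded as (base, {index: coeff}).
        LINE_SIZE = 64   # assumed cache line size in bytes

        def reuse_kind(expr_a, expr_b):
            base_a, coeffs_a = expr_a
            base_b, coeffs_b = expr_b
            if coeffs_a != coeffs_b:
                return "unknown"          # different strides: no guaranteed reuse
            delta = abs(base_a - base_b)  # constant distance between the two accesses
            if delta == 0:
                return "temporal"         # same address on every matching iteration
            if delta < LINE_SIZE:
                return "spatial"          # same cache line (alignment permitting)
            return "unknown"

        # a[i] loaded twice, vs. a[i] and a[i+1] with 8-byte elements:
        print(reuse_kind((0, {"i": 8}), (0, {"i": 8})))   # temporal
        print(reuse_kind((0, {"i": 8}), (8, {"i": 8})))   # spatial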

    Software timing analysis for complex hardware with survivability and risk analysis

    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads to more complex and performance-demanding on-board software and the subsequent adoption of multicores and accelerators. This causes software's execution time dispersion to increase due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running on such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, the statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building upon survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence about the tightness of the pWCET estimates obtained, which allows decreasing them reliably by 40% for a railway case study w.r.t. state-of-the-art exponential tails. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2014-SGR-1051).
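
    For readers unfamiliar with the mechanics, the sketch below fits a Weibull tail to synthetic execution-time measurements with a plain peaks-over-threshold approach and queries it at a low exceedance probability. It uses SciPy's generic Weibull fit rather than the survivability-based method of the paper, and all data, thresholds and probabilities are assumptions, so it only illustrates the shape of the computation.

        # Illustrative peaks-over-threshold fit of a Weibull tail to execution times.
        # NOT the paper's survivability-based method; data and thresholds are made up.
        import numpy as np
        from scipy.stats import weibull_min

        rng = np.random.default_rng(0)
        exec_times = 1000 + rng.gamma(shape=2.0, scale=5.0, size=10_000)  # fake measurements

        threshold = np.quantile(exec_times, 0.95)              # keep the top 5% as the tail
        excesses = exec_times[exec_times > threshold] - threshold

        shape, loc, scale = weibull_min.fit(excesses, floc=0)  # Weibull fit on the excesses

        p_exceed = 1e-12                                       # target exceedance per run
        p_cond = p_exceed / (excesses.size / exec_times.size)  # condition on passing the threshold
        pwcet = threshold + weibull_min.isf(p_cond, shape, loc=loc, scale=scale)
        print(f"pWCET estimate at 1e-12 per run: {pwcet:.1f} time units")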

    Impact of DM-LRU on WCET: A Static Analysis Approach

    Cache memories in modern embedded processors are known to improve average memory access performance. Unfortunately, they are also known to represent a major source of unpredictability for hard real-time workloads. One of the main limitations of typical caches is that content selection and replacement is entirely performed in hardware. As such, it is hard to control the cache behavior in software to favor caching of blocks that are known to have an impact on an application's worst-case execution time (WCET). In this paper, we consider a cache replacement policy, namely DM-LRU, that allows system designers to prioritize caching of memory blocks that are known to have an important impact on an application's WCET. Considering a single-core, single-level cache hierarchy, we describe an abstract interpretation-based timing analysis for DM-LRU. We implement the proposed analysis in a self-contained toolkit and study its qualitative properties on a set of representative benchmarks. Apart from being useful to compute the WCET when DM-LRU or similar policies are used, the proposed analysis can allow designers to perform WCET-impact-aware selection of the content to be retained in cache.
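
    To give a feel for the effect at the level of a single cache set, the toy simulation below implements an LRU-like policy in which designer-marked blocks are evicted only when no unmarked block is available. It is a simplified stand-in for what the abstract describes, not the formal definition of DM-LRU.

        # Toy single-set cache whose replacement prefers to evict non-prioritized
        # blocks first (a simplified stand-in for a DM-LRU-like policy).
        class PrioritizedLruSet:
            def __init__(self, ways, prioritized):
                self.ways = ways
                self.prioritized = set(prioritized)  # blocks the designer wants retained
                self.lines = []                      # most recently used block is last

            def access(self, block):
                hit = block in self.lines
                if hit:
                    self.lines.remove(block)
                elif len(self.lines) == self.ways:
                    # evict the oldest non-prioritized block if one exists, else plain LRU
                    victims = [b for b in self.lines if b not in self.prioritized] or self.lines
                    self.lines.remove(victims[0])
                self.lines.append(block)
                return hit

        s = PrioritizedLruSet(ways=2, prioritized={"hot"})
        for b in ["hot", "a", "b", "hot"]:
            print(b, "hit" if s.access(b) else "miss")
        # "hot" survives the accesses to "a" and "b", so its second access hits;
        # under plain 2-way LRU it would have been evicted and missed.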

    On assessing the viability of probabilistic scheduling with dependent tasks

    Despite the significant interest in probabilistic scheduling and probabilistic timing analysis in recent years, the interrelation between them has been scarcely addressed. Probabilistic scheduling approaches typically build on a series of assumptions on the probabilistic behavior of each task - or of single job activations - that have not been shown to be entirely fulfilled by the distributions computed with probabilistic timing analysis. This paper aims at providing a clear understanding of probabilistic Worst-Case Execution Time (pWCET) distributions as a common concept of probabilistic timing and schedulability analysis. We focus on the independence of pWCET estimates as the main concern in the application of probabilistic scheduling, with particular emphasis on measurement-based probabilistic timing analyses, for which independence across pWCET estimates may not be guaranteed. We relate pWCET (in)dependence to the platform-induced timing dependencies that occur among tasks, and even among jobs of the same task. We conclude that independent pWCET distributions can be obtained, even if dependencies exist, by either controlling the measurement protocol or by deriving distinct pWCET estimates for particular instances of a task. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P, the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 772773) and the HiPEAC Network of Excellence. Jaume Abella and Enrico Mezzetti have been partially supported by MINECO under Ramon y Cajal and Juan de la Cierva-Incorporación postdoctoral fellowships number RYC-2013-14717 and IJCI-2016-27396, respectively.
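
    The reason independence matters in practice is that probabilistic schedulability analyses typically combine per-job pWCET distributions by convolution, an operation that is only valid when the distributions are probabilistically independent. A minimal sketch of that combination step, on made-up discrete distributions, is shown below.

        # Convolution of two discrete pWCET distributions, valid only under the
        # independence assumption discussed above. Numbers are made up for illustration.
        from collections import defaultdict

        def convolve(pwcet_a, pwcet_b):
            """Combine {execution_time: probability} maps of two independent jobs."""
            combined = defaultdict(float)
            for ta, pa in pwcet_a.items():
                for tb, pb in pwcet_b.items():
                    combined[ta + tb] += pa * pb
            return dict(combined)

        job1 = {10: 0.90, 12: 0.09, 20: 0.01}
        job2 = {5: 0.95, 9: 0.05}
        total = convolve(job1, job2)
        # Probability that the two jobs together exceed 25 time units:
        print(sum(p for t, p in total.items() if t > 25))   # 0.0005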

    Improving time predictability of shared hardware resources in real-time multicore systems : emphasis on the space domain

    Critical Real-Time Embedded Systems (CRTES) follow a verification and validation process on timing and functional correctness. This process includes timing analysis, which produces Worst-Case Execution Time (WCET) estimates to provide evidence that the execution time of the system, or parts of it, remains within the deadlines. A key design principle for CRTES is incremental qualification, whereby each software component can be subject to verification and validation independently of any other component, with obvious benefits for cost. At the timing level, this requires time composability, such that the timing behavior of a function is not affected by other functions. CRTES are experiencing an unprecedented growth with rising performance demands that have motivated the use of multicore architectures. Multicores can provide the performance required and bring the potential of integrating several software functions onto the same hardware. However, multicore contention in the access to shared hardware resources creates a dependence of the execution time of a task on the rest of the tasks running simultaneously. This dependence threatens time predictability and jeopardizes time composability. In this thesis we analyze and propose hardware solutions to be applied to current multicore designs for CRTES to improve time predictability and time composability, focusing on the on-chip bus and the memory controller. At the hardware level, we propose new bus and memory controller designs that control and mitigate contention between different cores and allow time composability by design, also in the context of mixed-criticality systems. At the analysis level, we propose contention prediction models that factor in the impact of contenders and do not need modifications to the hardware. We also propose a set of Performance Monitoring Counters (PMCs) that provide evidence about the contention. We place special emphasis on the space domain, focusing on the Cobham Gaisler NGMP multicore processor, which is currently being assessed by the European Space Agency for its future missions.
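
    A common shape for the contention prediction models mentioned above is to inflate a task's execution time bound in isolation with a per-request delay bound for each shared resource, using access counts read from performance monitoring counters. The formula and numbers below are a generic illustration of that idea, not the specific models proposed in the thesis.

        # Generic contention-inflation model:
        #   WCET_bound = WCET_isolation + sum_r (requests_r * per_request_delay_r)
        # where requests_r are PMC readings taken in isolation and per_request_delay_r
        # depends on the arbitration policy (e.g., round-robin bus: contenders * slot).
        # All numbers below are illustrative.
        def wcet_with_contention(wcet_isolation, pmc_requests, per_request_delay):
            inflation = sum(pmc_requests[r] * per_request_delay[r] for r in pmc_requests)
            return wcet_isolation + inflation

        pmc_requests = {"bus": 12_000, "mem_ctrl": 3_500}       # PMC readings (request counts)
        per_request_delay = {"bus": 3 * 8, "mem_ctrl": 3 * 45}  # 3 contenders, cycles each
        print(wcet_with_contention(1_000_000, pmc_requests, per_request_delay))
        # 1_000_000 + 12_000*24 + 3_500*135 = 1_760_500 cycles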

    Probabilistic Worst-Case Timing Analysis: Taxonomy and Comprehensive Survey

    "© ACM, 2019. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, {VOL 52, ISS 1, (February 2019)} https://dl.acm.org/doi/10.1145/3301283"[EN] The unabated increase in the complexity of the hardware and software components of modern embedded real-time systems has given momentum to a host of research in the use of probabilistic and statistical techniques for timing analysis. In the last few years, that front of investigation has yielded a body of scientific literature vast enough to warrant some comprehensive taxonomy of motivations, strategies of application, and directions of research. This survey addresses this very need, singling out the principal techniques in the state of the art of timing analysis that employ probabilistic reasoning at some level, building a taxonomy of them, discussing their relative merit and limitations, and the relations among them. In addition to offering a comprehensive foundation to savvy probabilistic timing analysis, this article also identifies the key challenges to be addressed to consolidate the scientific soundness and industrial viability of this emerging field.This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P, the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 772773), and the HiPEAC Network of Excellence. Jaume Abella was partially supported by the Ministry of Economy and Competitiveness under a Ramon y Cajal postdoctoral fellowship (RYC-2013-14717). Enrico Mezzetti has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva-Incorporación postdoctoral fellowship No. IJCI-2016-27396.Cazorla, FJ.; Kosmidis, L.; Mezzetti, E.; Hernández Luz, C.; Abella, J.; Vardanega, T. (2019). Probabilistic Worst-Case Timing Analysis: Taxonomy and Comprehensive Survey. ACM Computing Surveys. 52(1):1-35. https://doi.org/10.1145/3301283S13552

    211009

    Tasks running on microprocessors with cache memories are often subjected to cache related preemption delays (CRPDs). CRPDs may significantly increase task execution times, thereby affecting their schedulability. Schedulability analysis accounting for the impact of CRPD has been extensively studied over the past two decades for systems with a single level of cache. Yet, the literature on CRPD for multilevel non-inclusive caches is relatively scarce. Two main challenges exist when analyzing multilevel caches: (1) characterization of the indirect effect of preemption, i.e., capturing the increase in cache interference at lower cache levels (e.g., L2 cache) due to the evictions of cache content from a higher cache level (e.g., L1 cache), and (2) upper bounding the maximum CRPD suffered by tasks at lower cache levels (e.g., L2 cache), i.e., determining the cache content of tasks that can be evicted from lower cache levels in case of preemptions. Existing analyses that focus on bounding CRPD for multilevel non-inclusive caches overestimate the values of (1) and (2), leading to pessimistic worst-case response time (WCRT) estimations. In this work, we reduce the excessive pessimism of the state-of-the-art CRPD analysis for multilevel non-inclusive caches by (i) introducing the notion of multi-level useful cache blocks, i.e., cache blocks that can cause CRPD at different cache levels, and using it to compute a tighter bound on the indirect effect of preemption of tasks; and (ii) deriving a new analysis to compute tighter bounds on the CRPD of tasks at lower cache levels (e.g., L2 cache). We performed a thorough experimental evaluation using benchmarks to compare the performance of our proposed CRPD analysis against the state-of-the-art CRPD analysis. Experimental results show that our proposed CRPD analysis dominates the existing analysis and improves task set schedulability by up to 20 percentage points. This work was partially supported by National Funds through FCT/MCTES (Portuguese Foundation for Science and Technology), within the CISTER Research Unit (UIDP/UIDB/04234/2020); also by the Operational Competitiveness Programme and Internationalization (COMPETE 2020) under the PT2020 Partnership Agreement, through the European Regional Development Fund (ERDF), and by national funds through the FCT, within project PREFECT (POCI-01-0145-FEDER-029119); also by the European Union's Horizon 2020 - The EU Framework Programme for Research and Innovation 2014-2020, under grant agreement No. 732505; and by project "TEC4Growth - Pervasive Intelligence, Enhancers and Proofs of Concept with Industrial Impact/NORTE-01-0145-FEDER000020", financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement.
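
    For context, single-level CRPD analyses typically bound the preemption cost by intersecting the preempted task's useful cache blocks (UCBs) with the preempting task's evicting cache blocks (ECBs) and multiplying by the block reload time; a multilevel extension adds one such term per level. The sketch below shows only that basic per-level UCB/ECB bound with assumed block sets and reload times; it deliberately omits the indirect effect of preemption that the work above addresses.

        # Simplified per-level CRPD bound: reload_time * |UCB ∩ ECB|, summed over levels.
        # Ignores the indirect effect of preemption, so it is an illustration of the
        # basic UCB/ECB idea rather than the analysis proposed in the work above.
        def crpd_bound(ucbs_per_level, ecbs_per_level, reload_time_per_level):
            total = 0
            for level, reload_time in reload_time_per_level.items():
                evicted = ucbs_per_level[level] & ecbs_per_level[level]
                total += reload_time * len(evicted)
            return total

        ucbs = {"L1": {1, 2, 3, 4}, "L2": {10, 11, 12}}  # useful blocks of the preempted task
        ecbs = {"L1": {2, 4, 9},    "L2": {11, 40}}      # blocks touched by the preempting task
        reload = {"L1": 10, "L2": 80}                    # cycles to refill one block per level
        print(crpd_bound(ucbs, ecbs, reload))            # 2*10 + 1*80 = 100 cycles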