16 research outputs found

    Embedded Program Annotations for WCET Analysis

    Get PDF
    We present __builtin_ais_annot(), a user-friendly, versatile way to transfer annotations (also known as flow facts) written on the source code level to the machine code level. To do so, we couple two tools often used during the development of safety-critical hard real-time systems, the formally verified C compiler CompCert and the static WCET analyzer aiT. CompCert stores the AIS annotations given via __builtin_ais_annot() in a special section of the ELF binary, which can later be extracted automatically by aiT

    Software timing analysis for complex hardware with survivability and risk analysis

    Get PDF
    The increasing automation of safety-critical real-time systems, such as those in cars and planes, leads, to more complex and performance-demanding on-board software and the subsequent adoption of multicores and accelerators. This causes software's execution time dispersion to increase due to variable-latency resources such as caches, NoCs, advanced memory controllers and the like. Statistical analysis has been proposed to model the Worst-Case Execution Time (WCET) of software running such complex systems by providing reliable probabilistic WCET (pWCET) estimates. However, statistical models used so far, which are based on risk analysis, are overly pessimistic by construction. In this paper we prove that statistical survivability and risk analyses are equivalent in terms of tail analysis and, building upon survivability analysis theory, we show that Weibull tail models can be used to estimate pWCET distributions reliably and tightly. In particular, our methodology proves the correctness-by-construction of the approach, and our evaluation provides evidence about the tightness of the pWCET estimates obtained, which allow decreasing them reliably by 40% for a railway case study w.r.t. state-of-the-art exponential tails.This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02- 06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya (contract 2014-SGR-1051).Peer ReviewedPostprint (author's final draft

    A generic framework to integrate data caches in the WCET analysis of real-time systems

    Get PDF
    Worst-case execution time (WCET) analysis of systems with data caches is one of the key challenges in real-time systems. Caches exploit the inherent reuse properties of programs by temporarily storing certain memory contents near the processor, in order that further accesses to such contents do not require costly memory transfers. Current worst-case data cache analysis methods focus on specific cache organizations (set-associative LRU, locked, ACDC, etc.), most of the times adapting techniques designed to analyze instruction caches. On the other hand, there are methodologies to analyze the data reuse of a program, independently of the data cache. In this paper we propose a generic WCET analysis framework to analyze data caches taking profit of such reuse information. It includes the categorization of data references and their integration in an IPET model. We apply it to a conventional LRU cache, an ACDC, and other baseline systems, and compare them using the TACLeBench benchmark suite. Our results show that persistence-based LRU analyses dismiss essential information on data, and a reuse-based analysis improves the WCET bound around 17% in average. In general, the best WCET estimations are obtained with optimization level 2, where the ACDC cache performs 39% better than a set-associative LRU

    Programming languages and static analysis techniques for software energy certification

    Get PDF
    Mobile devices, such as smartphones, have become an extension of our body for various reasons, mainly because of mobility, communication convenience, and an extensive range of provided apps. However, being portable, they have several limitations, and the most concerning for users is the limited battery life. Smartphone manufacturers are trying to address this problem by optimizing the hardware and the software, but the concern remains. Although it is well known that apps are one of the components of smartphones that consume the most energy, most app developers do not use or even consider using strategies to minimize their apps’ energy consumption. Creating a labeling system that classifies apps based on their energy consumption is a possible solution to drive developers to be more conscious of their apps’ energy consumption. This work aims to develop a technique to compute a metric for app energy certification. The metric we propose to calculate is WCEC, which represents the energy consumed in the most extreme case of program execution. Typically, this analysis is used in embedded systems, where the energy consumed by the apps must be rigorously determined to avoid inconveniences. Here we have reused the fundamentals of the WCEC analysis to use for Android apps. We address our solution to the Android platform since it has the largest worldwide mobile operating system market share. To perform the WCEC analysis, we take advantage of static analysis and the IPET, the techniques commonly used in this context. We have also created a test scenario to illustrate a case where our tool can be used. This document aims to explain the fundamentals present in our tool and describe its implementation details.Os dispositivos móveis, particularmente os smartphones, tornaram-se uma extensão dos nossos corpos por uma variedade de razões, principalmente mobilidade, conveniência de comunicação e uma vasta gama de aplicações disponíveis. No entanto, várias limitações sugerem pela portabilidade destes dispositivos. A limitação comummente apontada pelos utilizadores é a duração limitada da bateria. Os fabricantes de smartphones tentam resolver este problema através da otimização do hardware e software, mas o problema persiste. É bem-sabido que as aplicações são um dos elementos que consomem mais energia nos smartphones, no entanto, a maioria dos desenvolvedores de aplicações não utiliza ou sequer considera a utilização de estratégias para minimizar o consumo de energia das suas aplicações. Uma forma prática para resolver este problema é forçar os desenvolvedores a se preocuparem mais sobre o consumo de energia das suas aplicações, criando um sistema de catálogo que classifica as aplicações com base no seu consumo de energia. Neste trabalho o nosso objetivo é desenvolver uma ferramenta capaz de fornecer uma métrica que poderá ser usado para fins de comparação entre as aplicações. A métrica que nos propomos em calcular é o Worst-Case Energy Consumption (WCEC), que representa a energia consumida no caso mais extremo da execução de um programa. Esta abordagem é normalmente empregue em alguns sistemas embutidos, nos quais a duração da bateria deve ser rigorosamente calculada para evitar vários constrangimentos. Também ilustramos um cenário para demonstrar como a nossa ferramenta pode ser utilizada. O objetivo deste documento é explicar os princípios usados na nossa abordagem e descrever em pormenor os passos usados na implementação da nossa ferramenta

    ePAPI: Performance Application Programming Interface for Embedded Platforms

    Get PDF
    Performance Monitoring Counters (PMCs) have been traditionally used in the mainstream computing domain to perform debugging and optimization of software performance. PMCs are increasingly considered in embedded time-critical domains to collect in-depth information, e.g. cache misses and memory accesses, of software execution time on complex multicore platforms. In main-stream platforms, standardized specifications and applications like the Performance Application Programming Interface (PAPI) and perf have been proposed to deal with variable PMC support across platforms, by providing a shared interface for configuring and collecting traceable events. However, no equivalent solution exists for embedded critical processors for which the user is required to deal with low-level, platform-specific, and error-prone manipulation of PMC registers. In this paper, we address the need for a standardized PMC interface in the embedded domain, especially in view to support timing characterization of embedded platforms. We assess the compatibility of the PAPI interface with the PMC support available on the AURIX TC297, a reference automotive platform, and we implement and validate ePAPI, the first functionally-equivalent and low-overhead implementation of PAPI for the considered embedded platform

    Impact of DM-LRU on WCET: A Static Analysis Approach

    Get PDF
    Cache memories in modern embedded processors are known to improve average memory access performance. Unfortunately, they are also known to represent a major source of unpredictability for hard real-time workload. One of the main limitations of typical caches is that content selection and replacement is entirely performed in hardware. As such, it is hard to control the cache behavior in software to favor caching of blocks that are known to have an impact on an application\u27s worst-case execution time (WCET). In this paper, we consider a cache replacement policy, namely DM-LRU, that allows system designers to prioritize caching of memory blocks that are known to have an important impact on an application\u27s WCET. Considering a single-core, single-level cache hierarchy, we describe an abstract interpretation-based timing analysis for DM-LRU. We implement the proposed analysis in a self-contained toolkit and study its qualitative properties on a set of representative benchmarks. Apart from being useful to compute the WCET when DM-LRU or similar policies are used, the proposed analysis can allow designers to perform WCET impact-aware selection of content to be retained in cache

    Timing in Technischen Sicherheitsanforderungen für Systementwürfe mit heterogenen Kritikalitätsanforderungen

    Get PDF
    Traditionally, timing requirements as (technical) safety requirements have been avoided through clever functional designs. New vehicle automation concepts and other applications, however, make this harder or even impossible and challenge design automation for cyber-physical systems to provide a solution. This thesis takes upon this challenge by introducing cross-layer dependency analysis to relate timing dependencies in the bounded execution time (BET) model to the functional model of the artifact. In doing so, the analysis is able to reveal where timing dependencies may violate freedom from interference requirements on the functional layer and other intermediate model layers. For design automation this leaves the challenge how such dependencies are avoided or at least be bounded such that the design is feasible: The results are synthesis strategies for implementation requirements and a system-level placement strategy for run-time measures to avoid potentially catastrophic consequences of timing dependencies which are not eliminated from the design. Their applicability is shown in experiments and case studies. However, all the proposed run-time measures as well as very strict implementation requirements become ever more expensive in terms of design effort for contemporary embedded systems, due to the system's complexity. Hence, the second part of this thesis reflects on the design aspect rather than the analysis aspect of embedded systems and proposes a timing predictable design paradigm based on System-Level Logical Execution Time (SL-LET). Leveraging a timing-design model in SL-LET the proposed methods from the first part can now be applied to improve the quality of a design -- timing error handling can now be separated from the run-time methods and from the implementation requirements intended to guarantee them. The thesis therefore introduces timing diversity as a timing-predictable execution theme that handles timing errors without having to deal with them in the implemented application. An automotive 3D-perception case study demonstrates the applicability of timing diversity to ensure predictable end-to-end timing while masking certain types of timing errors.Traditionell wurden Timing-Anforderungen als (technische) Sicherheitsanforderungen durch geschickte funktionale Entwürfe vermieden. Neue Fahrzeugautomatisierungskonzepte und Anwendungen machen dies jedoch schwieriger oder gar unmöglich; Aufgrund der Problemkomplexität erfordert dies eine Entwurfsautomatisierung für cyber-physische Systeme heraus. Diese Arbeit nimmt sich dieser Herausforderung an, indem sie eine schichtenübergreifende Abhängigkeitsanalyse einführt, um zeitliche Abhängigkeiten im Modell der beschränkten Ausführungszeit (BET) mit dem funktionalen Modell des Artefakts in Beziehung zu setzen. Auf diese Weise ist die Analyse in der Lage, aufzuzeigen, wo Timing-Abhängigkeiten die Anforderungen an die Störungsfreiheit auf der funktionalen Schicht und anderen dazwischenliegenden Modellschichten verletzen können. Für die Entwurfsautomatisierung ergibt sich daraus die Herausforderung, wie solche Abhängigkeiten vermieden oder zumindest so eingegrenzt werden können, dass der Entwurf machbar ist: Das Ergebnis sind Synthesestrategien für Implementierungsanforderungen und eine Platzierungsstrategie auf Systemebene für Laufzeitmaßnahmen zur Vermeidung potentiell katastrophaler Folgen von Timing-Abhängigkeiten, die nicht aus dem Entwurf eliminiert werden. Ihre Anwendbarkeit wird in Experimenten und Fallstudien gezeigt. Allerdings werden alle vorgeschlagenen Laufzeitmaßnahmen sowie sehr strenge Implementierungsanforderungen für moderne eingebettete Systeme aufgrund der Komplexität des Systems immer teurer im Entwurfsaufwand. Daher befasst sich der zweite Teil dieser Arbeit eher mit dem Entwurfsaspekt als mit dem Analyseaspekt von eingebetteten Systemen und schlägt ein Entwurfsparadigma für vorhersagbares Timing vor, das auf der System-Level Logical Execution Time (SL-LET) basiert. Basierend auf einem Timing-Entwurfsmodell in SL-LET können die vorgeschlagenen Methoden aus dem ersten Teil nun angewandt werden, um die Qualität eines Entwurfs zu verbessern -- die Behandlung von Timing-Fehlern kann nun von den Laufzeitmethoden und von den Implementierungsanforderungen, die diese garantieren sollen, getrennt werden. In dieser Arbeit wird daher Timing Diversity als ein Thema der Timing-Vorhersage in der Ausführung eingeführt, das Timing-Fehler behandelt, ohne dass sie in der implementierten Anwendung behandelt werden müssen. Anhand einer Fallstudie aus dem Automobilbereich (3D-Umfeldwahrnehmung) wird die Anwendbarkeit von Timing-Diversität demonstriert, um ein vorhersagbares Ende-zu-Ende-Timing zu gewährleisten und gleichzeitig in der Lage zu sein, bestimmte Arten von Timing-Fehlern zu maskieren

    Efficient Adaptive Hard Real-time Multi-processor Systems

    Get PDF
    Modern computing systems are based on multi-processor systems, i.e. multiple cores on the same chip. Hard real-time systems are required to perform particular tasks within certain amount of time; failure to do so characterises an unaccepted behavior. Hard real-time systems are found in safety-critical applications, e.g. airbag control software, flight control software, etc. In safety-critical applications, failure to meet the real-time constraints can have catastrophic effects. The safe and, at the same time, efficient deployment of applications, with hard real-time constraints, on multi-processors is a challenging task. Scheduling methods and Models of Computation, that provide safe deployments, require a realistic estimation of the Worst-Case Execution Time (WCET) of tasks. The simultaneous access of shared resources by parallel tasks, causes interference delays due to hardware arbitration. Interference delays can be accounted for, with the pessimistic assumption that all possible interference can happen. The resulting schedules would be exceedingly conservative, thus the benefits of multi-processor would be significantly negated. Producing less pessimistic schedules is challenging due to the inter-dependency between WCET estimation and deployment optimisation. Accurate estimation of interference delays -and thus estimation of task WCET- depends on the way an application is deployed; deployment is an optimisation problem that depends on the estimation of task WCET. Another efficiency gap, which is of consequence in several systems (e.g. airbag control), stems from the fact that rarely tasks execute with their WCET. Safe runtime adaptation based on the Actual Execution Times, can yield additional improvements in terms of latency (more responsive systems). To achieve efficiency and retain adaptability, we propose that interference analysis should be coupled with the deployment process. The proposed interference analysis method estimates the possible amount of interference, based on an architecture and an application model. As more information is provided, such as scheduling, memory mapping, etc, the per-task interference estimation becomes more accurate. Thus, the method computes interference-sensitive WCET estimations (isWCET). Based on the isWCET method, we propose a method to break the inter-dependency between WCET estimation and deployment optimisation. Initially, the isWCETs are over-approximated, by assuming worst-case interference, and a safe deployment is derived. Subsequently, the proposed method computes accurate isWCETs by spatio-temporal exclusion, i.e. excluding interferences from non-overlapping tasks that share resources (space). Based on accurate isWCETs, the deployment solution is improved to provide better latency guarantees. We also propose a distributed runtime adaptation technique, that aims to improve run-time latency. Using isWCET estimations restricts the possible adaptations, as an adaptation might increase the interference and violate the safety guarantees. The proposed technique introduces statically scheduling dependencies between tasks that prevent additional interference. At runtime, a self-timed scheduling policy that respects these dependencies, is applied, proven to be safe, and with minimal overhead. Experimental evaluation on Kalray MPPA-256 shows that our methods improve isWCET up to 36%, guaranteed latency up to 46%, runtime performance up to 42%, with a consolidated performance gain of 50%

    Design and implementation of WCET analyses : including a case study on multi-core processors with shared buses

    Get PDF
    For safety-critical real-time embedded systems, the worst-case execution time (WCET) analysis — determining an upper bound on the possible execution times of a program — is an important part of the system verification. Multi-core processors share resources (e.g. buses and caches) between multiple processor cores and, thus, complicate the WCET analysis as the execution times of a program executed on one processor core significantly depend on the programs executed in parallel on the concurrent cores. We refer to this phenomenon as shared-resource interference. This thesis proposes a novel way of modeling shared-resource interference during WCET analysis. It enables an efficient analysis — as it only considers one processor core at a time — and it is sound for hardware platforms exhibiting timing anomalies. Moreover, this thesis demonstrates how to realize a timing-compositional verification on top of the proposed modeling scheme. In this way, this thesis closes the gap between modern hardware platforms, which exhibit timing anomalies, and existing schedulability analyses, which rely on timing compositionality. In addition, this thesis proposes a novel method for calculating an upper bound on the amount of interference that a given processor core can generate in any time interval of at most a given length. Our experiments demonstrate that the novel method is more precise than existing methods.Die Analyse der maximalen Ausführungszeit (Worst-Case-Execution-Time-Analyse, WCET-Analyse) ist für eingebettete Echtzeit-Computer-Systeme in sicherheitskritischen Anwendungsbereichen unerlässlich. Mehrkernprozessoren erschweren die WCET-Analyse, da einige ihrer Hardware-Komponenten von mehreren Prozessorkernen gemeinsam genutzt werden und die Ausführungszeit eines Programmes somit vom Verhalten mehrerer Kerne abhängt. Wir bezeichnen dies als Interferenz durch gemeinsam genutzte Komponenten. Die vorliegende Arbeit schlägt eine neuartige Modellierung dieser Interferenz während der WCET-Analyse vor. Der vorgestellte Ansatz ist effizient und führt auch für Computer-Systeme mit Zeitanomalien zu korrekten Ergebnissen. Darüber hinaus zeigt diese Arbeit, wie ein zeitkompositionales Verfahren auf Basis der vorgestellten Modellierung umgesetzt werden kann. Auf diese Weise schließt diese Arbeit die Lücke zwischen modernen Mikroarchitekturen, die Zeitanomalien aufweisen, und den existierenden Planbarkeitsanalysen, die sich alle auf die Kompositionalität des Zeitverhaltens verlassen. Außerdem stellt die vorliegende Arbeit ein neues Verfahren zur Berechnung einer oberen Schranke der Menge an Interferenz vor, die ein bestimmter Prozessorkern in einem beliebigen Zeitintervall einer gegebenen Länge höchstens erzeugen kann. Unsere Experimente zeigen, dass das vorgestellte Berechnungsverfahren präziser ist als die existierenden Verfahren.Deutsche Forschungsgemeinschaft (DFG) as part of the Transregional Collaborative Research Centre SFB/TR 14 (AVACS

    Automated Compilation Framework for Scratchpad-based Real-Time Systems

    Get PDF
    ScratchPad Memory (SPM) is highly adopted in real-time systems as it exhibits a predictable behaviour. SPM is software-managed by explicitly inserting instructions to move code and data transfers between the SPM and the main memory. However, it is a tedious job to decide how to manage the SPM and to manually modify the code to insert memory transfers. Hence, an automated compilation tool is essential to efficiently utilize the SPM. Another key problem with SPM is the latency suffered by the system due to memory transfers. Hiding this latency is important for high-performance systems. In this thesis, we address the problems of managing SPM and reducing the impact of memory latency. To realize the automation of our work, we develop a compilation framework based on the LLVM compiler to analyze and transform the program code. We exploit our framework to improve the performance of the execution of single and multi-tasks in real-time systems. For the single task execution, Worst-Case Execution Time (WCET) is of great importance to assure correct and safe behaviour of the system. So, we propose a WCET-driven allocation technique for data SPM that employs software prefetching to efficiently manage the SPM and to overlap the memory transfer and the task execution in a predictable way. On the other hand, multi-tasking requires the system to be schedulable such that all the tasks can meet their timing requirements. However, executing multiple tasks on a multi-processor platform suffers from the contention of the accesses to the shared main memory. To avoid the contention, several scheduling techniques adopted the 3-phase execution model which executes the task as a sequence of memory and computation phases. This provides the means to avoid the contention as well as to hide the memory latency by using a Direct Memory Access (DMA) engine. Executing memory transfers using the DMA allows overlapping the memory transfers with the computations on the processor. Using the 3-phase model in systems with limited sizes of local SPM may necessitate a segmentation of the task. Automating the segmentation process is necessary especially for systems with large task sets. Hence, we propose a set of efficient segmentation algorithms that follow the 3-phase execution model. The application of these algorithms shows a significant improvement in the system schedulability. For our segmentation algorithms to be more applicable, we extend the 3-phase model to allow programs with multiple paths represented as conditional Directed Acyclic Graphs (DAGs), unlike the previous works that targeted sequential programs. We also introduce a multi-steaming model to exploit the benefits of prefetching by overlapping the memory and computation phases of the same task, which was not allowed in the previous approaches. By combining the automated compilation with the proposed algorithms, we are able to achieve our goal to efficiently manage data SPM in real-time systems
    corecore