9 research outputs found

    Adapting TDMA arbitration for measurement-based probabilistic timing analysis

    Get PDF
    Critical Real-Time Embedded Systems require functional and timing validation to prove that they will perform their functionalities correctly and in time. For timing validation, a bound to the Worst-Case Execution Time (WCET) for each task is derived and passed as an input to the scheduling algorithm to ensure that tasks execute timely. Bounds to WCET can be derived with deterministic timing analysis (DTA) and probabilistic timing analysis (PTA), each of which relies upon certain predictability properties coming from the hardware/software platform beneath. In particular, specific hardware designs are needed for both DTA and PTA, which challenges their adoption by hardware vendors. This paper makes a step towards reconciling the hardware needs of DTA and PTA timing analyses to increase the likelihood of those hardware designs to be adopted by hardware vendors. In particular, we show how Time Division Multiple Access (TDMA), which has been regarded as one of the main DTA-compliant arbitration policies, can be used in the context of PTA and, in particular, of the industrially-friendly Measurement-Based PTA (MBPTA). We show how the execution time measurements taken as input for MBPTA need to be padded to obtain reliable and tight WCET estimates on top of TDMA-arbitrated hardware resources with no further hardware support. Our results show that TDMA delivers tighter WCET estimates than MBPTA-friendly arbitration policies, whereas MBPTA-friendly policies provide higher average performance. Thus, the best policy to choose depends on the particular needs of the end user.The research leading to these results has been funded by the EU FP7 under grant agreement no. 611085 (PROXIMA) and 287519 (parMERASA). This work has also been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Miloˇs Pani´c is funded by the Spanish Ministry of Education under the FPU grant FPU12/05966. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Using Trace Scratchpads to Reduce Execution Times in Predictable Real-Time Architectures

    Full text link

    Análise de valor para determinação do tempo de execução no pior caso (WCET) de tarefas em sistemas de tempo real

    Get PDF
    Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, 2015A utilização de sistemas computacionais na sociedade tem se expandido e as aplicações com requisitos de tempo real são mais comuns, variando em relação à complexidade e às necessidades de garantia no atendimento de restrições temporais (deadlines). Uma propriedade importante na definição do comportamento temporal de uma tarefa é o tempo de computação, que é o tempo necessário para a execução completada tarefa. Um dos grandes problemas de obtê-lo está ligado à análise da microarquitetura do processador. Considerando um processador que possui memória de dados com latência variada, é necessário a análise de valor para identificar a região de memória que a instrução acessa (memória principal ou ScrathPad Memory), para que o pior tempo de execução dos programas não seja consideravelmente superestimado. O objetivo deste trabalho é usar a análise de valor para determinar o tempo correto de acesso à memória, através da identificação da região de memória que cada instrução acessa, com a finalidade de obter um limite superior do WCET menos pessimista.Abstract: The use of computer systems in our society has expanded and applications with real-time requirements are getting more usual, varying in relation to the complexity and the necessity of guaranting deadlines. An important restriction in defining the temporal behavior of a task is the computation time, i.e., the time necessary to complete the task. Amajor problem in obtaining WCET is the processor microarchitecture analysis. Considering a processor with a data memory that has varying latency, value analysis is necessary to identify the memory region tha teach instruction accesses (main memory or ScrathPad Memory), so the worst execution time of programs are not considerably overestimated.The objective of this work is to use value analysis to obtain the correct memory access time by identifying the region of memory each instruction accesses, obtaining WCET upper bounds that are less pessimistic

    Principles of timing anomalies in superscalar processors

    Get PDF
    “This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder." “Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.”The counter-intuitive timing behavior of certain features in superscalar processors that cause severe problems for existing worst-case execution time analysis (WCET) methods is called timing anomalies. In this paper, we identify structural sources potentially causing timing anomalies in superscalar pipelines. We provide examples for cases where timing anomalies can arise in much simpler hardware architectures than commonly supposed (i.e., even in hardware containing only in-orderfunctional units). We elaborate the general principle behind timing anomalies and propose a general criterion (resource allocation criterion) that provides a necessary (but not sufficient) condition for the occurrence of timing anomalies in a processor. This principle allows to state the absence of timing anomalies for a specific combination of hardware and software and thus forms a solid theoretic foundation for the time-predictable execution of real-time software on complex processor hardware

    A time-predictable many-core processor design for critical real-time embedded systems

    Get PDF
    Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded system, e.g. energy harvesting solar panels in satellites, steering and breaking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response of its inputs; the latter ensures that its operations are performed within a predefined time budget. CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of \smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and integrate multiple systems into a single platform. The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing a better performance per watt as cores are maintained simpler with respect to complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions into the same processor, maximizing the hardware utilization. However, the use of multi- and many-cores in CRTES also brings a number of challenges related to provide evidence about the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence about the correct operation of the system. This thesis investigates the use of many-core processors in CRTES as a means to satisfy performance demands of future complex applications while providing the necessary timing guarantees. To do so, this thesis contributes to advance the state-of-the-art towards the exploitation of parallel capabilities of many-cores in CRTES contributing in two different computing domains. From the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. From the software domain, we present efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.Los sistemas críticos empotrados de tiempo real (en ingles Critical Real-Time Embedded Systems, CRTES) se encargan de controlar partes fundamentales de los sistemas integrados, e.g. obtención de la energía de los paneles solares en satélites, la dirección y frenado en automóviles, o el control de vuelo en aviones. Para hacerlo, CRTES requieren fuerte evidencias del correcto comportamiento funcional y temporal. El primero garantiza que el sistema funciona correctamente en respuesta de sus entradas; el último asegura que sus operaciones se realizan dentro de unos limites temporales establecidos previamente. El objetivo de los CRTES es aumentar el número y la complejidad de las funciones. Algunos ejemplos incluyen los sistemas inteligentes de asistencia a la conducción en automóviles modernos o los sistemas avanzados de prevención de colisiones en vehiculos aereos no tripulados. Todas estas nuevas características, implementadas en software,conducen a un crecimiento exponencial tanto en los requerimientos de rendimiento como en la complejidad de desarrollo de software. Además, existe una gran necesidad de integrar múltiples funciones en una sóla plataforma para así reducir el número de unidades de procesamiento, cumplir con requisitos de peso y espacio, etc. En general, hay una clara necesidad de aumentar la potencia de cómputo de los actuales CRTES para soportar nueva funcionalidades sofisticadas y complejas e integrar múltiples sistemas en una sola plataforma. El uso de arquitecturas multi- y many-core se ve cada vez más en la industria CRTES como la solución para hacer frente a la demanda de mayor rendimiento y las limitaciones de costes de los futuros CRTES. Las arquitecturas many-core proporcionan un mayor rendimiento explotando el paralelismo de aplicaciones al tiempo que proporciona un mejor rendimiento por vatio ya que los cores se mantienen más simples con respecto a complejos procesadores de un solo core. Además, las capacidades de paralelización permiten programar múltiples funciones en el mismo procesador, maximizando la utilización del hardware. Sin embargo, el uso de multi- y many-core en CRTES también acarrea ciertos desafíos relacionados con la aportación de evidencias sobre el correcto funcionamiento del sistema, especialmente en el ámbito temporal. Por eso, a pesar de las ventajas de los procesadores many-core y del hecho de que éstos son una realidad en los sitemas integrados (por ejemplo Kalray MPPA, Freescale NXP P4080, TI Keystone II), su uso en CRTES aún precisa de la búsqueda de métodos eficientes para proveer evidencias fiables sobre el correcto funcionamiento del sistema. Esta tesis ahonda en el uso de procesadores many-core en CRTES como un medio para satisfacer los requisitos de rendimiento de aplicaciones complejas mientras proveen las garantías de tiempo necesarias. Para ello, esta tesis contribuye en el avance del estado del arte hacia la explotación de many-cores en CRTES en dos ámbitos de la computación. En el ámbito del hardware, esta tesis propone nuevos diseños many-core que posibilitan garantías de tiempo fiables y precisas. En el ámbito del software, la tesis presenta técnicas eficientes para la planificación de tareas y el análisis de tiempo para aprovechar las capacidades de paralelización en arquitecturas many-core, y también para derivar estimaciones de peor tiempo de ejecución (Worst-Case Execution Time, WCET) fiables y precisas

    On static execution-time analysis

    Get PDF
    Proving timeliness is an integral part of the verification of safety-critical real-time systems. To this end, timing analysis computes upper bounds on the execution times of programs that execute on a given hardware platform. Modern hardware platforms commonly exhibit counter-intuitive timing behaviour: a locally slower execution can lead to a faster overall execution. Such behaviour challenges efficient timing analysis. In this work, we present and discuss a hardware design, the strictly in-order pipeline, that behaves monotonically w.r.t. the progress of a program's execution. Based on monotonicity, we prove the absence of the aforementioned counter-intuitive behaviour. At least since multi-core processors have emerged, timing analysis separates concerns by analysing different aspects of the system's timing behaviour individually. In this work, we validate the underlying assumption that a timing bound can be soundly composed from individual contributions. We show that even simple processors exhibit counter-intuitive behaviour - a locally slow execution can lead to an even slower overall execution - that impedes the soundness of the composition. We present the compositional base bound analysis that accounts for any such amplifying effects within its timing contribution. This enables a sound compositional analysis even for complex processors. Furthermore, we discuss hardware modifications that enable efficient compositional analyses.Echtzeitsysteme müssen unter allen Umständen beweisbar pünktlich arbeiten. Zum Beweis errechnet die Zeitanalyse obere Schranken der für die Ausführung von Programmen auf einer Hardware-Plattform benötigten Zeit. Moderne Hardware-Plattformen sind bekannt für unerwartetes Zeitverhalten bei dem eine lokale Verzögerung in einer global schnelleren Ausführung resultiert. Solches Zeitverhalten erschwert eine effiziente Analyse. Im Rahmen dieser Arbeit diskutieren wir das Design eines Prozessors mit eingeschränkter Fließbandverarbeitung (strictly in-order pipeline), der sich bzgl. des Fortschritts einer Programmausführung monoton verhält. Wir beweisen, dass Monotonie das oben genannte unerwartete Zeitverhalten verhindert. Spätestens seit dem Einsatz von Mehrkernprozessoren besteht die Zeitanalyse aus einzelnen Teilanalysen welche nur bestimmte Aspekte des Zeitverhaltens betrachten. Eine zentrale Annahme ist hierbei, dass sich die Teilergebnisse zu einer korrekten Zeitschranke zusammensetzen lassen. Im Rahmen dieser Arbeit zeigen wir, dass diese Annahme selbst für einfache Prozessoren ungültig ist, da eine lokale Verzögerung zu einer noch größeren globalen Verzögerung führen kann. Für bestehende Prozessoren entwickeln wir eine neuartige Teilanalyse, die solche verstärkenden Effekte berücksichtigt und somit eine korrekte Komposition von Teilergebnissen erlaubt. Für zukünftige Prozessoren beschreiben wir Modifikationen, die eine deutlich effizientere Zeitanalyse ermöglichen

    Program Analysis as Model Checking

    Get PDF

    Enabling caches in probabilistic timing analysis

    Get PDF
    Hardware and software complexity of future critical real-time systems challenges the scalability of traditional timing analysis methods. Measurement-Based Probabilistic Timing Analysis (MBPTA) has recently emerged as an industrially-viable alternative technique to deal with complex hardware/software. Yet, MBPTA requires certain timing properties in the system under analysis that are not satisfied in conventional systems. In this thesis, we introduce, for the first time, hardware and software solutions to satisfy those requirements as well as to improve MBPTA applicability. We focus on one of the hardware resources with highest impact on both average performance and Worst-Case Execution Time (WCET) in current real-time platforms, the cache. In this line, the contributions of this thesis follow three different axes: hardware solutions and software solutions to enable MBPTA, and MBPTA analysis enhancements in systems featuring caches. At hardware level, we set the foundations of MBPTA-compliant processor designs, and define efficient time-randomised cache designs for single- and multi-level hierarchies of arbitrary complexity, including unified caches, which can be time-analysed for the first time. We propose three new software randomisation approaches (one dynamic and two static variants) to control, in an MBPTA-compliant manner, the cache jitter in Commercial off-the-shelf (COTS) processors in real-time systems. To that end, all variants randomly vary the location of programs' code and data in memory across runs, to achieve probabilistic timing properties similar to those achieved with customised hardware cache designs. We propose a novel method to estimate the WCET of a program using MBPTA, without requiring the end-user to identify worst-case paths and inputs, improving its applicability in industry. We also introduce Probabilistic Timing Composability, which allows Integrated Systems to reduce their WCET in the presence of time-randomised caches. With the above contributions, this thesis pushes the limits in the use of complex real-time embedded processor designs equipped with caches and paves the way towards the industrialisation of MBPTA technology.La complejidad de hardware y software de los sistemas críticos del futuro desafía la escalabilidad de los métodos tradicionales de análisis temporal. El análisis temporal probabilístico basado en medidas (MBPTA) ha aparecido últimamente como una solución viable alternativa para la industria, para manejar hardware/software complejo. Sin embargo, MBPTA requiere ciertas propiedades de tiempo en el sistema bajo análisis que no satisfacen los sistemas convencionales. En esta tesis introducimos, por primera vez, soluciones hardware y software para satisfacer estos requisitos como también mejorar la aplicabilidad de MBPTA. Nos centramos en uno de los recursos hardware con el máximo impacto en el rendimiento medio y el peor caso del tiempo de ejecución (WCET) en plataformas actuales de tiempo real, la cache. En esta línea, las contribuciones de esta tesis siguen 3 ejes distintos: soluciones hardware y soluciones software para habilitar MBPTA, y mejoras de el análisis MBPTA en sistemas usado caches. A nivel de hardware, creamos las bases del diseño de un procesador compatible con MBPTA, y definimos diseños de cache con tiempo aleatorio para jerarquías de memoria con uno y múltiples niveles de cualquier complejidad, incluso caches unificadas, las cuales pueden ser analizadas temporalmente por primera vez. Proponemos tres nuevos enfoques de aleatorización de software (uno dinámico y dos variedades estáticas) para manejar, en una manera compatible con MBPTA, la variabilidad del tiempo (jitter) de la cache en procesadores comerciales comunes en el mercado (COTS) en sistemas de tiempo real. Por eso, todas nuestras propuestas varían aleatoriamente la posición del código y de los datos del programa en la memoria entre ejecuciones del mismo, para conseguir propiedades de tiempo aleatorias, similares a las logradas con diseños hardware personalizados. Proponemos un nuevo método para estimar el WCET de un programa usando MBPTA, sin requerir que el usuario dentifique los caminos y las entradas de programa del peor caso, mejorando así la aplicabilidad de MBPTA en la industria. Además, introducimos la composabilidad de tiempo probabilística, que permite a los sistemas integrados reducir su WCET cuando usan caches de tiempo aleatorio. Con estas contribuciones, esta tesis empuja los limites en el uso de diseños complejos de procesadores empotrados en sistemas de tiempo real equipados con caches y prepara el terreno para la industrialización de la tecnología MBPTA
    corecore