325 research outputs found

    Toward Contention Analysis for Parallel Executing Real-Time Tasks

    Get PDF
    In measurement-based probabilistic timing analysis, the execution conditions imposed to tasks as measurement scenarios, have a strong impact to the worst-case execution time estimates. The scenarios and their effects on the task execution behavior have to be deeply investigated. The aim has to be to identify and to guarantee the scenarios that lead to the maximum measurements, i.e. the worst-case scenarios, and use them to assure the worst-case execution time estimates. We propose a contention analysis in order to identify the worst contentions that a task can suffer from concurrent executions. The work focuses on the interferences on shared resources (cache memories and memory buses) from parallel executions in multi-core real-time systems. Our approach consists of searching for possible task contenders for parallel executions, modeling their contentiousness, and classifying the measurement scenarios accordingly. We identify the most contentious ones and their worst-case effects on task execution times. The measurement-based probabilistic timing analysis is then used to verify the analysis proposed, qualify the scenarios with contentiousness, and compare them. A parallel execution simulator for multi-core real-time system is developed and used for validating our framework. The framework applies heuristics and assumptions that simplify the system behavior. It represents a first step for developing a complete approach which would be able to guarantee the worst-case behavior

    A confidence assessment of WCET estimates for software time randomized caches

    Get PDF
    Obtaining Worst-Case Execution Time (WCET) estimates is a required step in real-time embedded systems during software verification. Measurement-Based Probabilistic Timing Analysis (MBPTA) aims at obtaining WCET estimates for industrial-size software running upon hardware platforms comprising high-performance features. MBPTA relies on the randomization of timing behavior (functional behavior is left unchanged) of hard-to-predict events like the location of objects in memory — and hence their associated cache behavior — that significantly impact software's WCET estimates. Software time-randomized caches (sTRc) have been recently proposed to enable MBPTA on top of Commercial off-the-shelf (COTS) caches (e.g. modulo placement). However, some random events may challenge MBPTA reliability on top of sTRc. In this paper, for sTRc and programs with homogeneously accessed addresses, we determine whether the number of observations taken at analysis, as part of the normal MBPTA application process, captures the cache events significantly impacting execution time and WCET. If this is not the case, our techniques provide the user with the number of extra runs to perform to guarantee that cache events are captured for a reliable application of MBPTA. Our techniques are evaluated with synthetic benchmarks and an avionics application.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under the PROXIMA Project (www.proxima-project.eu), grant agreement no 611085. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316, the HiPEAC Network of Excellence, and COST Action IC1202: Timing Analysis On Code-Level (TACLe). Jaume Abella has been partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Fast, predictable and low energy memory references through architecture-aware compilation

    Get PDF
    The design of future high-performance embedded systems is hampered by two problems: First, the required hardware needs more energy than is available from batteries. Second, current cache-based approaches for bridging the increasing speed gap between processors and memories cannot guarantee predictable real-time behavior. A contribution to solving both problems is made in this paper which describes a comprehensive set of algorithms that can be applied at design time in order to maximally exploit scratch pad memories (SPMs). We show that both the energy consumption as well as the computed worst case execution time (WCET) can be reduced by up to to 80% and 48%, respectively, by establishing a strong link between the memory architecture and the compiler

    WCET-Directed Dynamic Scratchpad Memory Allocation of Data

    Full text link
    Many embedded systems feature processors coupled with a small and fast scratchpad memory. To the difference with caches, allocation of data to scratchpad memory must be handled by software. The major gain is to enhance the pre-dictability of memory accesses latencies. A compile-time dy-namic allocation approach enables eviction and placement of data to the scratchpad memory at runtime. Previous dynamic scratchpad memory allocation ap-proaches aimed to reduce average-case program execution time or the energy consumption due to memory accesses. For real-time systems, worst-case execution time is the main metric to optimize. In this paper, we propose a WCET-directed algorithm to dynamically allocate static data and stack data of a pro-gram to scratchpad memory. The granularity of placement of memory transfers (e.g. on function, basic block bound-aries) is discussed from the perspective of its computation complexity and the quality of allocation. 1

    Evaluation of STT-MRAM main memory for HPC and real-time systems

    Get PDF
    It is questionable whether DRAM will continue to scale and will meet the needs of next-generation systems. Therefore, significant effort is invested in research and development of novel memory technologies. One of the candidates for nextgeneration memory is Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM is an emerging non-volatile memory with a lot of potential that could be exploited for various requirements of different computing systems. Being a novel technology, STT-MRAM devices are already approaching DRAM in terms of capacity, frequency and device size. Special STT-MRAM features such as intrinsic radiation hardness, non-volatility, zero stand-by power and capability to function in extreme temperatures also make it particularly suitable for aerospace, avionics and automotive applications. Despite of being a conceivable alternative for main memory technology, to this day, academic research of STT-MRAM main memory remains marginal. This is mainly due to the unavailability of publicly available detailed timing parameters of this novel technology, which are required to perform a cycle accurate main memory simulation. Some researchers adopt simplistic memory models to simulate main memory, but such models can introduce significant errors in the analysis of the overall system performance. Therefore, detailed timing parameters are a must-have for any evaluation or architecture exploration study of STT-MRAM main memory. These detailed parameters are not publicly available because STT-MRAM manufacturers are reluctant to release any delicate information on the technology. This thesis demonstrates an approach to perform a cycle accurate simulation of STT-MRAM main memory, being the first to release detailed timing parameters of this technology from academia, essentially enabling researchers to conduct reliable system level simulation of STT-MRAM using widely accepted existing simulation infrastructure. Our results show that, in HPC domain STT-MRAM provide performance comparable to DRAM. Results from the power estimation indicates that STT-MRAM power consumption increases significantly for Activation/Precharge power while Burst power increases moderately and Background power does not deviate much from DRAM. The thesis includes detailed STT-MRAM main memory timing parameters to the main repositories of DramSim2 and Ramulator, two of the most widely used and accepted state-of-the-art main memory simulators. The STT-MRAM timing parameters that has been originated as a part of this thesis, are till date the only reliable and publicly available timing information on this memory technology published from academia. Finally, the thesis analyzes the feasibility of using STT-MRAM in real-time embedded systems by investigating STT-MRAM main memory impact on average system performance and WCET. STT-MRAM's suitability for the real-time embedded systems is validated on benchmarks provided by the European Space Agency (ESA), EEMBC Autobench and MediaBench suite by analyzing performance and WCET impact. In quantitative terms, our results show that STT-MRAM main memory in real-time embedded systems provides performance and WCET comparable to conventional DRAM, while opening up opportunities to exploit various advantages.Es cuestionable si DRAM continuará escalando y cumplirá con las necesidades de los sistemas de la próxima generación. Por lo tanto, se invierte un esfuerzo significativo en la investigación y el desarrollo de nuevas tecnologías de memoria. Uno de los candidatos para la memoria de próxima generación es la Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM es una memoria no volátil emergente con un gran potencial que podría ser explotada para diversos requisitos de diferentes sistemas informáticos. Al ser una tecnología novedosa, los dispositivos STT-MRAM ya se están acercando a la DRAM en términos de capacidad, frecuencia y tamaño del dispositivo. Las características especiales de STTMRAM, como la dureza intrínseca a la radiación, la no volatilidad, la potencia de reserva cero y la capacidad de funcionar en temperaturas extremas, también lo hacen especialmente adecuado para aplicaciones aeroespaciales, de aviónica y automotriz. A pesar de ser una alternativa concebible para la tecnología de memoria principal, hasta la fecha, la investigación académica de la memoria principal de STT-MRAM sigue siendo marginal. Esto se debe principalmente a la falta de disponibilidad de los parámetros de tiempo detallados públicamente disponibles de esta nueva tecnología, que se requieren para realizar un ciclo de simulación de memoria principal precisa. Algunos investigadores adoptan modelos de memoria simplistas para simular la memoria principal, pero tales modelos pueden introducir errores significativos en el análisis del rendimiento general del sistema. Por lo tanto, los parámetros de tiempo detallados son indispensables para cualquier evaluación o estudio de exploración de la arquitectura de la memoria principal de STT-MRAM. Estos parámetros detallados no están disponibles públicamente porque los fabricantes de STT-MRAM son reacios a divulgar información delicada sobre la tecnología. Esta tesis demuestra un enfoque para realizar un ciclo de simulación precisa de la memoria principal de STT-MRAM, siendo el primero en lanzar parámetros de tiempo detallados de esta tecnología desde la academia, lo que esencialmente permite a los investigadores realizar una simulación confiable a nivel de sistema de STT-MRAM utilizando una simulación existente ampliamente aceptada infraestructura. Nuestros resultados muestran que, en el dominio HPC, STT-MRAM proporciona un rendimiento comparable al de la DRAM. Los resultados de la estimación de potencia indican que el consumo de potencia de STT-MRAM aumenta significativamente para la activation/Precharge power, mientras que la Burst power aumenta moderadamente y la Background power no se desvía mucho de la DRAM. La tesis incluye parámetros detallados de temporización memoria principal de STT-MRAM a los repositorios principales de DramSim2 y Ramulator, dos de los simuladores de memoria principal más avanzados y más utilizados y aceptados. Los parámetros de tiempo de STT-MRAM que se han originado como parte de esta tesis, son hasta la fecha la única información de tiempo confiable y disponible al público sobre esta tecnología de memoria publicada desde la academia. Finalmente, la tesis analiza la viabilidad de usar STT-MRAM en real-time embedded systems mediante la investigación del impacto de la memoria principal de STT-MRAM en el rendimiento promedio del sistema y WCET. La idoneidad de STTMRAM para los real-time embedded systems se valida en los applicaciones proporcionados por la European Space Agency (ESA), EEMBC Autobench y MediaBench, al analizar el rendimiento y el impacto de WCET. En términos cuantitativos, nuestros resultados muestran que la memoria principal de STT-MRAM en real-time embedded systems proporciona un desempeño WCET comparable al de una memoria DRAM convencional, al tiempo que abre oportunidades para explotar varias ventajas

    Contention-aware performance monitoring counter support for real-time MPSoCs

    Get PDF
    Tasks running in MPSoCs experience contention delays when accessing MPSoC’s shared resources, complicating task timing analysis and deriving execution time bounds. Understanding the Actual Contention Delay (ACD) each task suffers due to other corunning tasks, and the particular hardware shared resources in which contention occurs, is of prominent importance to increase confidence on derived execution time bounds of tasks. And, whenever those bounds are violated, ACD provides information on the reasons for overruns. Unfortunately, existing MPSoC designs considered in real-time domains offer limited hardware support to measure tasks’ ACD losing all these potential benefits. In this paper we propose the Contention Cycle Stack (CCS), a mechanism that extends performance monitoring counters to track specific events that allow estimating the ACD that each task suffers from every contending task on every hardware shared resource. We build the CCS using a set of specialized low-overhead Performance Monitoring Counters for the Cobham Gaisler GR740 (NGMP) MPSoC – used in the space domain – for which we show CCS’s benefits.The research leading to these results has received funding from the European Space Agency under contracts 4000109680, 4000110157 and NPI 4000102880, and the Ministry of Science and Technology of Spain under contract TIN-2015-65316-P. Jaume Abella has been partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Static analysis of WCET in a satellite software subsystem

    Get PDF
    This paper describes the authors? experience with static analysis of both WCET and stack usage of a satellite on-board software subsystem. The work is a continuation of a previous case study that used a dynamic WCET analysis tool on an earlier version of the same software system. In particular, the AbsInt aiT tool has been evaluated by analysing both C and Ada code generated by Simulink within the UPMSat-2 project. Some aspects of the aiT tool, specifically those dealing with SPARC register windows, are compared to another static analysis tool, Bound-T. The results of the analysis are discussed, and some conclusions on the use of static WCET analysis tools on the SPARC architecture are commented in the paper

    Tracing Hardware Monitors in the GR712RC Multicore Platform: Challenges and Lessons Learnt from a Space Case Study

    Get PDF
    The demand for increased computing performance is driving industry in critical-embedded systems (CES) domains, e.g. space, towards the use of multicores processors. Multicores, however, pose several challenges that must be addressed before their safe adoption in critical embedded domains. One of the prominent challenges is software timing analysis, a fundamental step in the verification and validation process. Monitoring and profiling solutions, traditionally used for debugging and optimization, are increasingly exploited for software timing in multicores. In particular, hardware event monitors related to requests to shared hardware resources are building block to assess and restraining multicore interference. Modern timing analysis techniques build on event monitors to track and control the contention tasks can generate each other in a multicore platform. In this paper we look into the hardware profiling problem from an industrial perspective and address both methodological and practical problems when monitoring a multicore application. We assess pros and cons of several profiling and tracing solutions, showing that several aspects need to be taken into account while considering the appropriate mechanism to collect and extract the profiling information from a multicore COTS platform. We address the profiling problem on a representative COTS platform for the aerospace domain to find that the availability of directly-accessible hardware counters is not a given, and it may be necessary to the develop specific tools that capture the needs of both the user’s and the timing analysis technique requirements. We report challenges in developing an event monitor tracing tool that works for bare-metal and RTEMS configurations and show the accuracy of the developed tool-set in profiling a real aerospace application. We also show how the profiling tools can be exploited, together with handcrafted benchmarks, to characterize the application behavior in terms of multicore timing interference.This work has been partially supported by a collaboration agreement between Thales Research and the Barcelona Supercomputing Center, and the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme (grant agreement No. 772773). MINECO partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC2013-14717).Peer ReviewedPostprint (published version

    Increasing the reliability and applicability of measurement-based probabilistic timing analysis

    Get PDF
    Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2019Conforme a complexidade das arquiteturas computacionais aumenta para melhorar desempenho ou reduzir custos, o uso de processadores modernos em Sistemas de Tempo Real (STRs) é prejudicado cada vez mais pelo surgimento de efeitos temporais que dificultam a obtenção de limites confiáveis e precisos para os Worst-Case Execution Times (WCETs) de tarefas. A Análise Temporal Probabilística Baseada em Medições (ATPBM) visa determinar limites probabilísticos de WCET (i.e. pWCETs) aplicando a Teoria de Valores Extremos (TVE) sobre medições de tempos de execução, e é portanto promissora no tratamento da complexidade de hardware no projeto de STRs. Processadores temporalmente aleatorizados foram recentemente propostos para tornar o comportamento temporal de sistemas computacionais mais facilmente analisável através de ferramental probabilístico, e são projetados substituindo informações internas determinísticas ou especulativas por números (pseudo-)aleatórios. A pesquisa cujos resultados são apresentados nesta tese produziu contribuições em duas frentes distintas. Em primeiro lugar, foram propostos e aplicados métodos para avaliar a confiabilidade dos pWCETs produzidos pela ATPBM, baseados na coleta de grandes amostras de tempos de execução e na comparação (1) dos pWCETs com os maiores tempos de execução observados, e (2) das densidades de excedência dos pWCETs com seus valores esperados. Essas avaliações indicaram que modelos probabilísticos da TVE projetados para gerar margens mais precisas podem muitas vezes levar a subestimativas de pWCETs, e recomendou-se então que modelos sobrestimadores devem ser utilizados para obter-se pWCETs mais confiáveis. Em segundo lugar, avaliou-se a hipótese de que técnicas de escalonamento aleatorizado podem beneficiar a análise temporal de tarefas executadas em pipelines multithread através da ATPBM, por levarem os tempos de execução produzidos a atenderem às premissas básicas de aplicabilidade da técnica. Para tal, foram considerados tanto (A) um escalonador puramente aleatório, quanto (B) um escalonador aleatorizado capaz de limitar os efeitos temporais da interferência entre threads, sem comprometer sua analisabilidade pela ATPBM, através de um mecanismo de regulação de elegibilidade baseado em créditos.Abstract: As the complexity of computer architectures grows in order to improve performance and/or to reduce costs, the use of modern processors in the design of Real-Time Systems (RTSs) is increasingly hampered by the emergence of timing effects that challenge determining reliable and tight bounds for tasks' Worst-Case Execution Times (WCETs). The Measurement-Based Probabilistic Timing Analysis (MBPTA) technique aims determining probabilistic WCET bounds (i.e. pWCETs) by applying Extreme Value Theory (EVT) on tasks' execution time measurements, and is hence promising in handling hardware complexity issues within RTSs' design. Hardware-level time-randomized processors were recently proposed as a means to cause computing systems' timing behaviour to become more easily analysable through probabilistic tools, and are designed replacing deterministic or speculative internal information with (pseudo-)random numbers. The scientific research whose outcomes are presented in this thesis produced contributions on two distinct fronts. In first place, we proposed and applied methods for evaluating the reliability of pWCET estimates produced using MBPTA, based on collecting large execution time samples and then comparing (1) the pWCETs against the largest observed execution times, and (2) pWCETs' exceedance densities against their expected values. These evaluations led us to conclude that EVT probabilistic models intended to yield more precise bounds may often lead to pWCET underestimations, and we hence recommended that upper-bounding models should instead be used for deriving pWCETs with increased reliability. In second place, we evaluated the hypothesis that randomized scheduling techniques can benefit the timing analysis of tasks executed on multithread pipelines through MBPTA, by causing the yielded execution times to meet the technique's basic application requirements. For that, we considered both (A) a scheduler that employs a purely random policy, and (B) a randomized scheduler capable of limiting the timing effects of inter-thread interference, without compromising analysability, by using a credit-based eligibility regulation mechanism
    • …
    corecore