2,872 research outputs found

    Integrated Worst-Case Execution Time Estimation of Multicore Applications

    Get PDF
    Worst-case execution time (WCET) analysis has reached a high level of precision in the analysis of sequential programs executing on single-cores. In this paper we extend a state-of-the-art WCET analysis technique to compute tight WCETs estimates of parallel applications running on multicores. The proposed technique is termed integrated because it considers jointly the sequential code regions running on the cores and the communications between them. This allows to capture the hardware effects across code regions assigned to the same core, which significantly improves analysis precision. We demonstrate that our analysis produces tighter execution time bounds than classical techniques which first determine the WCET of sequential code regions and then compute the global response time by integrating communication costs. Comparison is done on two embedded control applications, where the gain is of 21% on average

    pTNoC: Probabilistically time-analyzable tree-based NoC for mixed-criticality systems

    Get PDF
    The use of networks-on-chip (NoC) in real-time safety-critical multicore systems challenges deriving tight worst-case execution time (WCET) estimates. This is due to the complexities in tightly upper-bounding the contention in the access to the NoC among running tasks. Probabilistic Timing Analysis (PTA) is a powerful approach to derive WCET estimates on relatively complex processors. However, so far it has only been tested on small multicores comprising an on-chip bus as communication means, which intrinsically does not scale to high core counts. In this paper we propose pTNoC, a new tree-based NoC design compatible with PTA requirements and delivering scalability towards medium/large core counts. pTNoC provides tight WCET estimates by means of asymmetric bandwidth guarantees for mixed-criticality systems with negligible impact on average performance. Finally, our implementation results show the reduced area and power costs of the pTNoC.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under the PROXIMA Project (www.proxima-project.eu), grant agreement no 611085. This work has also been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Mladen Slijepcevic is funded by the Obra Social Fundación la Caixa under grant Doctorado “la Caixa” - Severo Ochoa. Carles Hern´andez is jointly funded by the Spanish Ministry of Economy and Competitiveness (MINECO) and FEDER funds through grant TIN2014-60404-JIN. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Improving early design stage timing modeling in multicore based real-time systems

    Get PDF
    This paper presents a modelling approach for the timing behavior of real-time embedded systems (RTES) in early design phases. The model focuses on multicore processors - accepted as the next computing platform for RTES - and in particular it predicts the contention tasks suffer in the access to multicore on-chip shared resources. The model presents the key properties of not requiring the application's source code or binary and having high-accuracy and low overhead. The former is of paramount importance in those common scenarios in which several software suppliers work in parallel implementing different applications for a system integrator, subject to different intellectual property (IP) constraints. Our model helps reducing the risk of exceeding the assigned budgets for each application in late design stages and its associated costs.This work has received funding from the European Space Agency under Project Reference AO=17722=13=NL=LvH, and has also been supported by the Spanish Ministry of Science and Innovation grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.Peer ReviewedPostprint (author's final draft

    Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems

    Full text link
    In modern Commercial Off-The-Shelf (COTS) multicore systems, each core can generate many parallel memory requests at a time. The processing of these parallel requests in the DRAM controller greatly affects the memory interference delay experienced by running tasks on the platform. In this paper, we model a modern COTS multicore system which has a nonblocking last-level cache (LLC) and a DRAM controller that prioritizes reads over writes. To minimize interference, we focus on LLC and DRAM bank partitioned systems. Based on the model, we propose an analysis that computes a safe upper bound for the worst-case memory interference delay. We validated our analysis on a real COTS multicore platform with a set of carefully designed synthetic benchmarks as well as SPEC2006 benchmarks. Evaluation results show that our analysis is more accurately capture the worst-case memory interference delay and provides safer upper bounds compared to a recently proposed analysis which significantly under-estimate the delay.Comment: Technical Repor

    Modelling bus contention during system early design stages

    Get PDF
    Reliably upperbounding contention in multicore shared resources is of prominent importance in the early design phases of critical real-time systems to properly allocate time budgets to applications. However, during early stages applications are not yet consolidated and IP constraints may prevent sharing them across providers, challenging the estimation of contention bounds. In this paper, we propose a model to estimate the increase in applications' execution time due to on-chip bus sharing when they simultaneously execute in a multicore. The model works with information derived from the execution of each application in isolation, hence, without the need to actually run applications simultaneously. The model improves inaccuracy with respect to the existing model, and tends to over-estimate. The latter, is very important to prevent that, during late design stages, applications miss their deadline when consolidated into the same multicore, causing costly system redesign.This work has been supported by the Spanish Ministry of Science and Innovation grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. Carles Hernández is jointly funded by the Spanish Ministry of Economy and Competitiveness and FEDER funds through grant TIN2014-60404-JIN.Peer ReviewedPostprint (author's final draft

    Analysis of Dynamic Memory Bandwidth Regulation in Multi-core Real-Time Systems

    Full text link
    One of the primary sources of unpredictability in modern multi-core embedded systems is contention over shared memory resources, such as caches, interconnects, and DRAM. Despite significant achievements in the design and analysis of multi-core systems, there is a need for a theoretical framework that can be used to reason on the worst-case behavior of real-time workload when both processors and memory resources are subject to scheduling decisions. In this paper, we focus our attention on dynamic allocation of main memory bandwidth. In particular, we study how to determine the worst-case response time of tasks spanning through a sequence of time intervals, each with a different bandwidth-to-core assignment. We show that the response time computation can be reduced to a maximization problem over assignment of memory requests to different time intervals, and we provide an efficient way to solve such problem. As a case study, we then demonstrate how our proposed analysis can be used to improve the schedulability of Integrated Modular Avionics systems in the presence of memory-intensive workload.Comment: Accepted for publication in the IEEE Real-Time Systems Symposium (RTSS) 2018 conferenc

    DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car

    Full text link
    We present DeepPicar, a low-cost deep neural network based autonomous car platform. DeepPicar is a small scale replication of a real self-driving car called DAVE-2 by NVIDIA. DAVE-2 uses a deep convolutional neural network (CNN), which takes images from a front-facing camera as input and produces car steering angles as output. DeepPicar uses the same network architecture---9 layers, 27 million connections and 250K parameters---and can drive itself in real-time using a web camera and a Raspberry Pi 3 quad-core platform. Using DeepPicar, we analyze the Pi 3's computing capabilities to support end-to-end deep learning based real-time control of autonomous vehicles. We also systematically compare other contemporary embedded computing platforms using the DeepPicar's CNN-based real-time control workload. We find that all tested platforms, including the Pi 3, are capable of supporting the CNN-based real-time control, from 20 Hz up to 100 Hz, depending on hardware platform. However, we find that shared resource contention remains an important issue that must be considered in applying CNN models on shared memory based embedded computing platforms; we observe up to 11.6X execution time increase in the CNN based control loop due to shared resource contention. To protect the CNN workload, we also evaluate state-of-the-art cache partitioning and memory bandwidth throttling techniques on the Pi 3. We find that cache partitioning is ineffective, while memory bandwidth throttling is an effective solution.Comment: To be published as a conference paper at RTCSA 201

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Full text link
    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow

    Development and certification of mixed-criticality embedded systems based on probabilistic timing analysis

    Get PDF
    An increasing variety of emerging systems relentlessly replaces or augments the functionality of mechanical subsystems with embedded electronics. For quantity, complexity, and use, the safety of such subsystems is an increasingly important matter. Accordingly, those systems are subject to safety certification to demonstrate system's safety by rigorous development processes and hardware/software constraints. The massive augment in embedded processors' complexity renders the arduous certification task significantly harder to achieve. The focus of this thesis is to address the certification challenges in multicore architectures: despite their potential to integrate several applications on a single platform, their inherent complexity imperils their timing predictability and certification. Recently, the Measurement-Based Probabilistic Timing Analysis (MBPTA) technique emerged as an alternative to deal with hardware/software complexity. The innovation that MBPTA brings about is, however, a major step from current certification procedures and standards. The particular contributions of this Thesis include: (i) the definition of certification arguments for mixed-criticality integration upon multicore processors. In particular we propose a set of safety mechanisms and procedures as required to comply with functional safety standards. For timing predictability, (ii) we present a quantitative approach to assess the likelihood of execution-time exceedance events with respect to the risk reduction requirements on safety standards. To this end, we build upon the MBPTA approach and we present the design of a safety-related source of randomization (SoR), that plays a key role in the platform-level randomization needed by MBPTA. And (iii) we evaluate current certification guidance with respect to emerging high performance design trends like caches. Overall, this Thesis pushes the certification limits in the use of multicore and MBPTA technology in Critical Real-Time Embedded Systems (CRTES) and paves the way towards their adoption in industry.Una creciente variedad de sistemas emergentes reemplazan o aumentan la funcionalidad de subsistemas mecánicos con componentes electrónicos embebidos. El aumento en la cantidad y complejidad de dichos subsistemas electrónicos así como su cometido, hacen de su seguridad una cuestión de creciente importancia. Tanto es así que la comercialización de estos sistemas críticos está sujeta a rigurosos procesos de certificación donde se garantiza la seguridad del sistema mediante estrictas restricciones en el proceso de desarrollo y diseño de su hardware y software. Esta tesis trata de abordar los nuevos retos y dificultades dadas por la introducción de procesadores multi-núcleo en dichos sistemas críticos: aunque su mayor rendimiento despierta el interés de la industria para integrar múltiples aplicaciones en una sola plataforma, suponen una mayor complejidad. Su arquitectura desafía su análisis temporal mediante los métodos tradicionales y, asimismo, su certificación es cada vez más compleja y costosa. Con el fin de lidiar con estas limitaciones, recientemente se ha desarrollado una novedosa técnica de análisis temporal probabilístico basado en medidas (MBPTA). La innovación de esta técnica, sin embargo, supone un gran cambio cultural respecto a los estándares y procedimientos tradicionales de certificación. En esta línea, las contribuciones de esta tesis están agrupadas en tres ejes principales: (i) definición de argumentos de seguridad para la certificación de aplicaciones de criticidad-mixta sobre plataformas multi-núcleo. Se definen, en particular, mecanismos de seguridad, técnicas de diagnóstico y reacción de faltas acorde con el estándar IEC 61508 sobre una arquitectura multi-núcleo de referencia. Respecto al análisis temporal, (ii) presentamos la cuantificación de la probabilidad de exceder un límite temporal y su relación con los requisitos de reducción de riesgos derivados de los estándares de seguridad funcional. Con este fin, nos basamos en la técnica MBPTA y presentamos el diseño de una fuente de números aleatorios segura; un componente clave para conseguir las propiedades aleatorias requeridas por MBPTA a nivel de plataforma. Por último, (iii) extrapolamos las guías actuales para la certificación de arquitecturas multi-núcleo a una solución comercial de 8 núcleos y las evaluamos con respecto a las tendencias emergentes de diseño de alto rendimiento (caches). Con estas contribuciones, esta tesis trata de abordar los retos que el uso de procesadores multi-núcleo y MBPTA implican en el proceso de certificación de sistemas críticos de tiempo real y facilita, de esta forma, su adopción por la industria.Postprint (published version
    corecore