40 research outputs found

    Mixed Critical Automotive Embedded Applications on Multicores: A Safe Scheduling Approach for Dependability

    Get PDF
    International audienceMemory access durations on multicore architectures are highly variable, since concurrent accesses to memory by different cores induce time interferences. Consequently, critical software tasks may be delayed by noncritical ones, leading to deadline misses and possible catastrophic failures. We present an approach to tackle the implementation of mixed criticality workloads on multicore chips, focusing on task chains, i.e., sequences of tasks with end-to-end deadlines. Our main contribution is a Monitoring & Control System able to stop noncritical software execution in order to prevent memory interference and guarantee that critical tasks deadlines are met. This paper describes our approach, and the associated experimental framework to conduct experiments to analyze attainable real-time guarantees on a multicore platform

    Contention-Aware Dynamic Memory Bandwidth Isolation with Predictability in COTS Multicores: An Avionics Case Study

    Get PDF
    Airbus is investigating COTS multicore platforms for safety-critical avionics applications, pursuing helicopter-style autonomous and electric aircraft. These aircraft need to be ultra-lightweight for future mobility in the urban city landscape. As a step towards certification, Airbus identified the need for new methods that preserve the ARINC 653 single core schedule of a Helicopter Terrain Awareness and Warning System (HTAWS) application while scheduling additional safety-critical partitions on the other cores. As some partitions in the HTAWS application are memory-intensive, static memory bandwidth throttling may lead to slow down of such partitions or provide only little remaining bandwidth to the other cores. Thus, there is a need for dynamic memory bandwidth isolation. This poses new challenges for scheduling, as execution times and scheduling become interdependent: scheduling requires execution times as input, which depends on memory latencies and contention from memory accesses of other cores - which are determined by scheduling. Furthermore, execution times depend on memory access patterns. In this paper, we propose a method to solve this problem for slot-based time-triggered systems without requiring application source-code modifications using a number of dynamic memory bandwidth levels. It is NoC and DRAM controller contention-aware and based on the existing interference-sensitive WCET computation and the memory bandwidth throttling mechanism. It constructs schedule tables by assigning partitions and dynamic memory bandwidth to each slot on each core, considering worst case memory access patterns. Then at runtime, two servers - for processing time and memory bandwidth - run on each core, jointly controlling the contention between the cores and the amount of memory accesses per slot. As a proof-of-concept, we use a constraint solver to construct tables. Experiments on the P4080 COTS multicore platform, using a research OS from Airbus and EEMBC benchmarks, demonstrate that our proposed method enables preserving existing schedules on a core while scheduling additional safety-critical partitions on other cores, and meets dynamic memory bandwidth isolation requirements

    Development and certification of mixed-criticality embedded systems based on probabilistic timing analysis

    Get PDF
    An increasing variety of emerging systems relentlessly replaces or augments the functionality of mechanical subsystems with embedded electronics. For quantity, complexity, and use, the safety of such subsystems is an increasingly important matter. Accordingly, those systems are subject to safety certification to demonstrate system's safety by rigorous development processes and hardware/software constraints. The massive augment in embedded processors' complexity renders the arduous certification task significantly harder to achieve. The focus of this thesis is to address the certification challenges in multicore architectures: despite their potential to integrate several applications on a single platform, their inherent complexity imperils their timing predictability and certification. Recently, the Measurement-Based Probabilistic Timing Analysis (MBPTA) technique emerged as an alternative to deal with hardware/software complexity. The innovation that MBPTA brings about is, however, a major step from current certification procedures and standards. The particular contributions of this Thesis include: (i) the definition of certification arguments for mixed-criticality integration upon multicore processors. In particular we propose a set of safety mechanisms and procedures as required to comply with functional safety standards. For timing predictability, (ii) we present a quantitative approach to assess the likelihood of execution-time exceedance events with respect to the risk reduction requirements on safety standards. To this end, we build upon the MBPTA approach and we present the design of a safety-related source of randomization (SoR), that plays a key role in the platform-level randomization needed by MBPTA. And (iii) we evaluate current certification guidance with respect to emerging high performance design trends like caches. Overall, this Thesis pushes the certification limits in the use of multicore and MBPTA technology in Critical Real-Time Embedded Systems (CRTES) and paves the way towards their adoption in industry.Una creciente variedad de sistemas emergentes reemplazan o aumentan la funcionalidad de subsistemas mecánicos con componentes electrónicos embebidos. El aumento en la cantidad y complejidad de dichos subsistemas electrónicos así como su cometido, hacen de su seguridad una cuestión de creciente importancia. Tanto es así que la comercialización de estos sistemas críticos está sujeta a rigurosos procesos de certificación donde se garantiza la seguridad del sistema mediante estrictas restricciones en el proceso de desarrollo y diseño de su hardware y software. Esta tesis trata de abordar los nuevos retos y dificultades dadas por la introducción de procesadores multi-núcleo en dichos sistemas críticos: aunque su mayor rendimiento despierta el interés de la industria para integrar múltiples aplicaciones en una sola plataforma, suponen una mayor complejidad. Su arquitectura desafía su análisis temporal mediante los métodos tradicionales y, asimismo, su certificación es cada vez más compleja y costosa. Con el fin de lidiar con estas limitaciones, recientemente se ha desarrollado una novedosa técnica de análisis temporal probabilístico basado en medidas (MBPTA). La innovación de esta técnica, sin embargo, supone un gran cambio cultural respecto a los estándares y procedimientos tradicionales de certificación. En esta línea, las contribuciones de esta tesis están agrupadas en tres ejes principales: (i) definición de argumentos de seguridad para la certificación de aplicaciones de criticidad-mixta sobre plataformas multi-núcleo. Se definen, en particular, mecanismos de seguridad, técnicas de diagnóstico y reacción de faltas acorde con el estándar IEC 61508 sobre una arquitectura multi-núcleo de referencia. Respecto al análisis temporal, (ii) presentamos la cuantificación de la probabilidad de exceder un límite temporal y su relación con los requisitos de reducción de riesgos derivados de los estándares de seguridad funcional. Con este fin, nos basamos en la técnica MBPTA y presentamos el diseño de una fuente de números aleatorios segura; un componente clave para conseguir las propiedades aleatorias requeridas por MBPTA a nivel de plataforma. Por último, (iii) extrapolamos las guías actuales para la certificación de arquitecturas multi-núcleo a una solución comercial de 8 núcleos y las evaluamos con respecto a las tendencias emergentes de diseño de alto rendimiento (caches). Con estas contribuciones, esta tesis trata de abordar los retos que el uso de procesadores multi-núcleo y MBPTA implican en el proceso de certificación de sistemas críticos de tiempo real y facilita, de esta forma, su adopción por la industria.Postprint (published version

    WCET and Mixed-Criticality: What does Confidence in WCET Estimations Depend Upon?

    Get PDF

    Parallel Deferred Update Replication

    Full text link
    Deferred update replication (DUR) is an established approach to implementing highly efficient and available storage. While the throughput of read-only transactions scales linearly with the number of deployed replicas in DUR, the throughput of update transactions experiences limited improvements as replicas are added. This paper presents Parallel Deferred Update Replication (P-DUR), a variation of classical DUR that scales both read-only and update transactions with the number of cores available in a replica. In addition to introducing the new approach, we describe its full implementation and compare its performance to classical DUR and to Berkeley DB, a well-known standalone database

    Improving the Robustness of Redundant Execution with Register File Randomization

    Full text link
    [EN] Staggered Redundant execution (SRE) is a fault-tolerance mechanism that has been widely deployed in the context of safety-critical applications. SRE not only protects the system in the presence of faults but also helps relaxing safety requirements of individual elements. However, in this paper, we show that SRE does not effectively protect the system against a wide range of faults and thus, new mechanisms to increase the diversity of homogeneous cores are needed. In this paper, we propose Register File Randomization (RFR), a low-cost diversity mechanism that significantly increases the robustness of homogeneous multicores in front of common-cause faults (CCFs) and register file wearout. Our results show that RFR completely removes the failure rate for register file CCFs for certain workloads and reduces by a factor of 5X the impact of stress related register file aging for the workloads analysed. Our implementation requires less than 50 RTL lines of code and the area (FPGA logic) overhead of RFR is less than 0.2% of a 64-bit RISC-V core FPGA implementation.This work has received funding from the ECSEL Joint Undertaking (JU) under grant agreement No 877056 and the Agencia Estatal de Investigacion from Spain under grant agreement no. PCI2020-112092, and from the the European Unions Horizon 2020 research and innovation programme under grant agreement no. 871467.Tuzov, I.; Andreu, P.; Medina, L.; Picornell-Sanjuan, T.; Robles Martínez, A.; López Rodríguez, PJ.; Flich Cardo, J.... (2021). Improving the Robustness of Redundant Execution with Register File Randomization. IEEE. 1-9. https://doi.org/10.1109/ICCAD51958.2021.96434661

    JIT-Based cost analysis for dynamic program transformations

    Get PDF
    Tracing JIT compilation generates units of compilation that are easy to analyse and are known to execute frequently. The AJITPar project investigates whether the information in JIT traces can be used to dynamically transform programs for a specific parallel architecture. Hence a lightweight cost model is required for JIT traces. This paper presents the design and implementation of a system for extracting JIT trace information from the Pycket JIT compiler. We define three increasingly parametric cost models for Pycket traces. We determine the best weights for the cost model parameters using linear regression. We evaluate the effectiveness of the cost models for predicting the relative costs of transformed programs

    Multi-core devices for safety-critical systems: a survey

    Get PDF
    Multi-core devices are envisioned to support the development of next-generation safety-critical systems, enabling the on-chip integration of functions of different criticality. This integration provides multiple system-level potential benefits such as cost, size, power, and weight reduction. However, safety certification becomes a challenge and several fundamental safety technical requirements must be addressed, such as temporal and spatial independence, reliability, and diagnostic coverage. This survey provides a categorization and overview at different device abstraction levels (nanoscale, component, and device) of selected key research contributions that support the compliance with these fundamental safety requirements.This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P, Basque Government under grant KK-2019-00035 and the HiPEAC Network of Excellence. The Spanish Ministry of Economy and Competitiveness has also partially supported Jaume Abella under Ramon y Cajal postdoctoral fellowship (RYC-2013-14717).Peer ReviewedPostprint (author's final draft

    Modeling high-performance wormhole NoCs for critical real-time embedded systems

    Get PDF
    Manycore chips are a promising computing platform to cope with the increasing performance needs of critical real-time embedded systems (CRTES). However, manycores adoption by CRTES industry requires understanding task's timing behavior when their requests use manycore's network-on-chip (NoC) to access hardware shared resources. This paper analyzes the contention in wormhole-based NoC (wNoC) designs - widely implemented in the high-performance domain - for which we introduce a new metric: worst-contention delay (WCD) that captures wNoC impact on worst-case execution time (WCET) in a tighter manner than the existing metric, worst-case traversal time (WCTT). Moreover, we provide an analytical model of the WCD that requests can suffer in a wNoC and we validate it against wNoC designs resembling those in the Tilera-Gx36 and the Intel-SCC 48-core processors. Building on top of our WCD analytical model, we analyze the impact on WCD that different design parameters such as the number of virtual channels, and we make a set of recommendations on what wNoC setups to use in the context of CRTES.Peer ReviewedPostprint (author's final draft
    corecore