251 research outputs found

    Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

    Full text link
    Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable utilization of components such as real-time operating systems (RTOS), without requiring their correctness to guarantee safety, is necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and RTOS and recover from them while ensuring that timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring the application timing and fault-recovery is achieved via full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models

    Real-time operating system support for multicore applications

    Get PDF
    Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2014Plataformas multiprocessadas atuais possuem diversos níveis da memória cache entre o processador e a memória principal para esconder a latência da hierarquia de memória. O principal objetivo da hierarquia de memória é melhorar o tempo médio de execução, ao custo da previsibilidade. O uso não controlado da hierarquia da cache pelas tarefas de tempo real impacta a estimativa dos seus piores tempos de execução, especialmente quando as tarefas de tempo real acessam os níveis da cache compartilhados. Tal acesso causa uma disputa pelas linhas da cache compartilhadas e aumenta o tempo de execução das aplicações. Além disso, essa disputa na cache compartilhada pode causar a perda de prazos, o que é intolerável em sistemas de tempo real críticos. O particionamento da memória cache compartilhada é uma técnica bastante utilizada em sistemas de tempo real multiprocessados para isolar as tarefas e melhorar a previsibilidade do sistema. Atualmente, os estudos que avaliam o particionamento da memória cache em multiprocessadores carecem de dois pontos fundamentais. Primeiro, o mecanismo de particionamento da cache é tipicamente implementado em um ambiente simulado ou em um sistema operacional de propósito geral. Consequentemente, o impacto das atividades realizados pelo núcleo do sistema operacional, tais como o tratamento de interrupções e troca de contexto, no particionamento das tarefas tende a ser negligenciado. Segundo, a avaliação é restrita a um escalonador global ou particionado, e assim não comparando o desempenho do particionamento da cache em diferentes estratégias de escalonamento. Ademais, trabalhos recentes confirmaram que aspectos da implementação do SO, tal como a estrutura de dados usada no escalonamento e os mecanismos de tratamento de interrupções, impactam a escalonabilidade das tarefas de tempo real tanto quanto os aspectos teóricos. Entretanto, tais estudos também usaram sistemas operacionais de propósito geral com extensões de tempo real, que afetamos sobre custos de tempo de execução observados e a escalonabilidade das tarefas de tempo real. Adicionalmente, os algoritmos de escalonamento tempo real para multiprocessadores atuais não consideram cenários onde tarefas de tempo real acessam as mesmas linhas da cache, o que dificulta a estimativa do pior tempo de execução. Esta pesquisa aborda os problemas supracitados com as estratégias de particionamento da cache e com os algoritmos de escalonamento tempo real multiprocessados da seguinte forma. Primeiro, uma infraestrutura de tempo real para multiprocessadores é projetada e implementada em um sistema operacional embarcado. A infraestrutura consiste em diversos algoritmos de escalonamento tempo real, tais como o EDF global e particionado, e um mecanismo de particionamento da cache usando a técnica de coloração de páginas. Segundo, é apresentada uma comparação em termos da taxa de escalonabilidade considerando o sobre custo de tempo de execução da infraestrutura criada e de um sistema operacional de propósito geral com extensões de tempo real. Em alguns casos, o EDF global considerando o sobre custo do sistema operacional embarcado possui uma melhor taxa de escalonabilidade do que o EDF particionado com o sobre custo do sistema operacional de propósito geral, mostrando claramente como diferentes sistemas operacionais influenciam os escalonadores de tempo real críticos em multiprocessadores. Terceiro, é realizada uma avaliação do impacto do particionamento da memória cache em diversos escalonadores de tempo real multiprocessados. Os resultados desta avaliação indicam que um sistema operacional "leve" não compromete as garantias de tempo real e que o particionamento da cache tem diferentes comportamentos dependendo do escalonador e do tamanho do conjunto de trabalho das tarefas. Quarto, é proposto um algoritmo de particionamento de tarefas que atribui as tarefas que compartilham partições ao mesmo processador. Os resultados mostram que essa técnica de particionamento de tarefas reduz a disputa pelas linhas da cache compartilhadas e provê garantias de tempo real para sistemas críticos. Finalmente, é proposto um escalonador de tempo real de duas fases para multiprocessadores. O escalonador usa informações coletadas durante o tempo de execução das tarefas através dos contadores de desempenho em hardware. Com base nos valores dos contadores, o escalonador detecta quando tarefas de melhor esforço o interferem com tarefas de tempo real na cache. Assim é possível impedir que tarefas de melhor esforço acessem as mesmas linhas da cache que tarefas de tempo real. O resultado desta estratégia de escalonamento é o atendimento dos prazos críticos e não críticos das tarefas de tempo real.Abstracts: Modern multicore platforms feature multiple levels of cache memory placed between the processor and main memory to hide the latency of ordinary memory systems. The primary goal of this cache hierarchy is to improve average execution time (at the cost of predictability). The uncontrolled use of the cache hierarchy by realtime tasks may impact the estimation of their worst-case execution times (WCET), specially when real-time tasks access a shared cache level, causing a contention for shared cache lines and increasing the application execution time. This contention in the shared cache may leadto deadline losses, which is intolerable particularly for hard real-time (HRT) systems. Shared cache partitioning is a well-known technique used in multicore real-time systems to isolate task workloads and to improve system predictability. Presently, the state-of-the-art studies that evaluate shared cache partitioning on multicore processors lack two key issues. First, the cache partitioning mechanism is typically implemented either in a simulated environment or in a general-purpose OS (GPOS), and so the impact of kernel activities, such as interrupt handlers and context switching, on the task partitions tend to be overlooked. Second, the evaluation is typically restricted to either a global or partitioned scheduler, thereby by falling to compare the performance of cache partitioning when tasks are scheduled by different schedulers. Furthermore, recent works have confirmed that OS implementation aspects, such as the choice of scheduling data structures and interrupt handling mechanisms, impact real-time schedulability as much as scheduling theoretic aspects. However, these studies also used real-time patches applied into GPOSes, which affects the run-time overhead observed in these works and consequently the schedulability of real-time tasks. Additionally, current multicore scheduling algorithms do not consider scenarios where real-time tasks access the same cache lines due to true or false sharing, which also impacts the WCET. This thesis addresses these aforementioned problems with cache partitioning techniques and multicore real-time scheduling algorithms as following. First, a real-time multicore support is designed and implemented on top of an embedded operating system designed from scratch. This support consists of several multicore real-time scheduling algorithms, such as global and partitioned EDF, and a cache partitioning mechanism based on page coloring. Second, it is presented a comparison in terms of schedulability ratio considering the run-time overhead of the implemented RTOS and a GPOS patched with real-time extensions. In some cases, Global-EDF considering the overhead of the RTOS is superior to Partitioned-EDF considering the overhead of the patched GPOS, which clearly shows how different OSs impact hard realtime schedulers. Third, an evaluation of the cache partitioning impacton partitioned, clustered, and global real-time schedulers is performed.The results indicate that a lightweight RTOS does not impact real-time tasks, and shared cache partitioning has different behavior depending on the scheduler and the task's working set size. Fourth, a task partitioning algorithm that assigns tasks to cores respecting their usage of cache partitions is proposed. The results show that by simply assigning tasks that shared cache partitions to the same processor, it is possible to reduce the contention for shared cache lines and to provideHRT guarantees. Finally, a two-phase multicore scheduler that provides HRT and soft real-time (SRT) guarantees is proposed. It is shown that by using information from hardware performance counters at run-time, the RTOS can detect when best-effort tasks interfere with real-time tasks in the shared cache. Then, the RTOS can prevent best effort tasks from interfering with real-time tasks. The results also show that the assignment of exclusive partitions to HRT tasks together with the two-phase multicore scheduler provides HRT and SRT guarantees, even when best-effort tasks share partitions with real-time tasks

    Cache-Aware Real-Time Virtualization

    Get PDF
    Virtualization has been adopted in diverse computing environments, ranging from cloud computing to embedded systems. It enables the consolidation of multi-tenant legacy systems onto a multicore processor for Size, Weight, and Power (SWaP) benefits. In order to be adopted in timing-critical systems, virtualization must provide real-time guarantee for tasks and virtual machines (VMs). However, existing virtualization technologies cannot offer such timing guarantee. Tasks in VMs can interfere with each other through shared hardware components. CPU cache, in particular, is a major source of interference that is hard to analyze or manage. In this work, we focus on challenges of the impact of cache-related interferences on the real-time guarantee of virtualization systems. We propose the cache-aware real-time virtualization that provides both system techniques and theoretical analysis for tackling the challenges. We start with the challenge of the private cache overhead and propose the private cache-aware compositional analysis. To tackle the challenge of the shared cache interference, we start with non-virtualization systems and propose a shared cache-aware scheduler for operating systems to co-allocate both CPU and cache resources to tasks and develop the analysis. We then investigate virtualization systems and propose a dynamic cache management framework that hierarchically allocates shared cache to tasks. After that, we further investigate the resource allocation and analysis technique that considers not only cache resource but also CPU and memory bandwidth resources. Our solutions are applicable to commodity hardware and are essential steps to advance virtualization technology into timing-critical systems

    A survey of techniques for reducing interference in real-time applications on multicore platforms

    Get PDF
    This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.This work was supported in part by the Comunidad de Madrid Government "Nuevas TĂ©cnicas de Desarrollo de Software de Tiempo Real Embarcado Para Plataformas. MPSoC de PrĂłxima GeneraciĂłn" under Grant IND2019/TIC-17261

    Ordonnancement des systèmes avec différents niveaux de criticité

    Get PDF
    Real-time safety-critical systems must complete their tasks within a given time limit. Failure to successfully perform their operations, or missing a deadline, can have severe consequences such as destruction of property and/or loss of life. Examples of such systems include automotive systems, drones and avionics among others. Safety guarantees must be provided before these systems can be deemed usable. This is usually done through certification performed by a certification authority.Safety evaluation and certification are complicated and costly even for smaller systems.One answer to these difficulties is the isolation of the critical functionality. Executing tasks of different criticalities on separate platforms prevents non-critical tasks from interfering with critical ones, provides a higher guaranty of safety and simplifies the certification process limiting it to only the critical functions. But this separation, in turn, introduces undesirable results portrayed by an inefficient resource utilization, an increase in the cost, weight, size and energy consumption which can put a system in a competitive disadvantage.To overcome the drawbacks of isolation, Mixed Criticality (MC) systems can be used. These systems allow functionalities with different criticalities to execute on the same platform. In 2007, Vestal proposed a model to represent MC-systems where tasks have multiple Worst Case Execution Times (WCETs), one for each criticality level. In addition, correctness conditions for scheduling policies were formally defined, allowing lower criticality jobs to miss deadlines or be even dropped in cases of failure or emergency situations.The introduction of multiple WCETs and different conditions for correctness increased the difficulty of the scheduling problem for MC-systems. Conventional scheduling policies and schedulability tests proved inadequate and the need for new algorithms arose. Since then, a lot of work has been done in this field.In this thesis, we contribute to the study of schedulability in MC-systems. The workload of a system is represented as a set of jobs that can describe the execution over the hyper-period of tasks or over a duration in time. This model allows us to study the viability of simulation-based correctness tests in MC-systems. We show that simulation tests can still be used in mixed-criticality systems, but in this case, the schedulability of the worst case scenario is no longer sufficient to guarantee the schedulability of the system even for the fixed priority scheduling case. We show that scheduling policies are not predictable in general, and define the concept of weak-predictability for MC-systems. We prove that a specific class of fixed priority policies are weakly predictable and propose two simulation-based correctness tests that work for weakly-predictable policies.We also demonstrate that contrary to what was believed, testing for correctness can not be done only through a linear number of preemptions.The majority of the related work focuses on systems of two criticality levels due to the difficulty of the problem. But for automotive and airborne systems, industrial standards define four or five criticality levels, which motivated us to propose a scheduling algorithm that schedules mixed-criticality systems with theoretically any number of criticality levels. We show experimentally that it has higher success rates compared to the state of the art.We illustrate how our scheduling algorithm, or any algorithm that generates a single time-triggered table for each criticality mode, can be used as a recovery strategy to ensure the safety of the system in case of certain failures.Finally, we propose a high level concurrency language and a model for designing an MC-system with coarse grained multi-core interference.Les systèmes temps-réel critiques doivent exécuter leurs tâches dans les délais impartis. En cas de défaillance, des événements peuvent avoir des catastrophes économiques. Des classifications des défaillances par rapport aux niveaux des risques encourus ont été établies, en particulier dans les domaines des transports aéronautique et automobile. Des niveaux de criticité sont attribués aux différentes fonctions des systèmes suivant les risques encourus lors d'une défaillance et des probabilités d'apparition de celles-ci. Ces différents niveaux de criticité influencent les choix d'architecture logicielle et matérielle ainsi que le type de composants utilisés pour sa réalisation. Les systèmes temps-réels modernes ont tendance à intégrer sur une même plateforme de calcul plusieurs applications avec différents niveaux de criticité. Cette intégration est nécessaire pour des systèmes modernes comme par exemple les drones (UAV) afin de réduire le coût, le poids et la consommation d'énergie. Malheureusement, elle conduit à des difficultés importantes lors de leurs conceptions. En plus, ces systèmes doivent être certifiés en prenant en compte ces différents niveaux de criticités.Il est bien connu que le problème d'ordonnancement des systèmes avec différents niveaux de criticités représente un des plus grand défi dans le domaine de systèmes temps-réel. Les techniques traditionnelles proposent comme solution l’isolation complète entre les niveaux de criticité ou bien une certification globale au plus haut niveau. Malheureusement, une telle solution conduit à une mauvaise des ressources et à la perte de l’avantage de cette intégration. En 2007, Vestal a proposé un modèle pour représenter les systèmes avec différents niveaux de criticité dont les tâches ont plusieurs temps d’exécution, un pour chaque niveau de criticité. En outre, les conditions de validité des stratégies d’ordonnancement ont été définies de manière formelle, permettant ainsi aux tâches les moins critiques d’échapper aux délais, voire d’être abandonnées en cas de défaillance ou de situation d’urgence.Les politiques de planification conventionnelles et les tests d’ordonnoncement se sont révélés inadéquats.Dans cette thèse, nous contribuons à l’étude de l’ordonnancement dans les systèmes avec différents niveaux de criticité. La surcharge d'un système est représentée sous la forme d'un ensemble de tâches pouvant décrire l'exécution sur l'hyper-période de tâches ou sur une durée donnée. Ce modèle nous permet d’étudier la viabilité des tests de correction basés sur la simulation pour les systèmes avec différents niveaux de criticité. Nous montrons que les tests de simulation peuvent toujours être utilisés pour ces systèmes, et la possibilité de l’ordonnancement du pire des scénarios ne suffit plus, même pour le cas de l’ordonnancement avec priorité fixe. Nous montrons que les politiques d'ordonnancement ne sont généralement pas prévisibles. Nous définissons le concept de faible prévisibilité pour les systèmes avec différents niveaux de criticité et nous montrons ensuite qu'une classe spécifique de stratégies à priorité fixe sont faiblement prévisibles. Nous proposons deux tests de correction basés sur la simulation qui fonctionnent pour des stratégies faiblement prévisibles.Nous montrons également que, contrairement à ce que l’on croyait, le contrôle de l’exactitude ne peut se faire que par l’intermédiaire d’un nombre linéaire de préemptions.La majorité des travaux reliés à notre domaine portent sur des systèmes à deux niveaux de criticité en raison de la difficulté du problème. Mais pour les systèmes automobiles et aériens, les normes industrielles définissent quatre ou cinq niveaux de criticité, ce qui nous a motivés à proposer un algorithme de planification qui planifie les systèmes à criticité mixte avec théoriquement un nombre quelconque de niveaux de criticité. Nous montrons expérimentalement que le taux de réussite est supérieur à celui de l’état de la technique

    Cache-aware Interfaces for Compositional Real-Time Systems

    Get PDF
    Interface-based compositional analysis is by now a fairly established area of research in real-time systems. However, current research has not yet fully considered practical aspects, such as the effects of cache interferences on multicore platforms. This position paper discusses the analysis challenges and motivates the need for cache scheduling in this setting, and it highlights several research questions towards cache-aware interfaces for compositional systems on multicore platforms

    Escalonar sistemas de tempo-real de alta crĂ­ticalidade

    Get PDF
    Cyclic executives are used to schedule safety-critical real-time systems because of their determinism, simplicity, and efficiency. One major challenge of the cyclic executive model is to produce the cyclic scheduling timetable. This problem is related to the bin-packing problem [34] and is NP-Hard in the strong sense. Unnecessary context switches within the scheduling table can introduce significant overhead; in IMA (Integrated Modular Avionics), cache-related overheads can increase task execution times up to 33% [18]. Developed in the context of the Software Engineering Master’s Degree at ISEP, the Polytechnic Institute of Engineering in Porto Portugal, this thesis contains two contributions to the scheduling literature. The first is a precise and exact approach to computing the slack of a job set that is schedule policy independent. The method introduces several operations to update and maintain the slack at runtime, ensuring the slack of all jobs is valid and coherent. The second contribution is the definition of a state-of-the-art preemptive scheduling algorithm focused on minimizing the number of system preemptions for real-time safety-critical applications within a reasonable amount of time. Both contributions have been implemented and extensively tested in scala. Experimental results suggest our scheduling algorithm has similar non-preemptive schedulability ratio than Chain Window RM [69], yet lower ratio in high utilizations than Chain Window EDF [69] and BB-Moore [68]. For ask sets that failed to be scheduled non-preemptively, 98-99% of all jobs are scheduled without preemptions. Considering the fact that our scheduler is preemptive, being able to compete with non-preemptive schedulers is an excellent result indeed. In terms of execution time, our proposal is multiple orders of magnitude faster than the aforementioned algorithms. Both contributions of this work are planned to be presented at future conferences such as RTSS@Work and RTAS
    • …
    corecore