161 research outputs found

    Zuverlässige und Energieeffiziente gemischt-kritische Echtzeit On-Chip Systeme

    Get PDF
    Multi- and many-core embedded systems are increasingly becoming the target for many applications that require high performance under varying conditions. A resulting challenge is the control, and reliable operation of such complex multiprocessing architectures under changes, e.g., high temperature and degradation. In mixed-criticality systems where many applications with varying criticalities are consolidated on the same execution platform, fundamental isolation requirements to guarantee non-interference of critical functions are crucially important. While Networks-on-Chip (NoCs) are the prevalent solution to provide scalable and efficient interconnects for the multiprocessing architectures, their associated energy consumption has immensely increased. Specifically, hard real-time NoCs must manifest limited energy consumption as thermal runaway in such a core shared resource jeopardizes the whole system guarantees. Thus, dynamic energy management of NoCs, as opposed to the related work static solutions, is highly necessary to save energy and decrease temperature, while preserving essential temporal requirements. In this thesis, we introduce a centralized management to provide energy-aware NoCs for hard real-time systems. The design relies on an energy control network, developed on top of an existing switch arbitration network to allow isolation between energy optimization and data transmission. The energy control layer includes local units called Power-Aware NoC controllers that dynamically optimize NoC energy depending on the global state and applications’ temporal requirements. Furthermore, to adapt to abnormal situations that might occur in the system due to degradation, we extend the concept of NoC energy control to include the entire system scope. That is, online resource management employing hierarchical control layers to treat system degradation (imminent core failures) is supported. The mechanism applies system reconfiguration that involves workload migration. For mixed-criticality systems, it allows flexible boundaries between safety-critical and non-critical subsystems to safely apply the reconfiguration, preserving fundamental safety requirements and temporal predictability. Simulation and formal analysis-based experiments on various realistic usecases and benchmarks are conducted showing significant improvements in NoC energy-savings and in treatment of system degradation for mixed-criticality systems improving dependability over the status quo.Eingebettete Many- und Multi-core-Systeme werden zunehmend das Ziel für Anwendungen, die hohe Anfordungen unter unterschiedlichen Bedinungen haben. Für solche hochkomplexed Multi-Prozessor-Systeme ist es eine grosse Herausforderung zuverlässigen Betrieb sicherzustellen, insbesondere wenn sich die Umgebungseinflüsse verändern. In Systeme mit gemischter Kritikalität, in denen viele Anwendungen mit unterschiedlicher Kritikalität auf derselben Ausführungsplattform bedient werden müssen, sind grundlegende Isolationsanforderungen zur Gewährleistung der Nichteinmischung kritischer Funktionen von entscheidender Bedeutung. Während On-Chip Netzwerke (NoCs) häufig als skalierbare Verbindung für die Multiprozessor-Architekturen eingesetzt werden, ist der damit verbundene Energieverbrauch immens gestiegen. Daher sind dynamische Plattformverwaltungen, im Gegensatz zu den statischen, zwingend notwendig, um ein System an die oben genannten Veränderungen anzupassen und gleichzeitig Timing zu gewährleisten. In dieser Arbeit entwickeln wir energieeffiziente NoCs für harte Echtzeitsysteme. Das Design basiert auf einem Energiekontrollnetzwerk, das auf einem bestehenden Switch-Arbitration-Netzwerk entwickelt wurde, um eine Isolierung zwischen Energieoptimierung und Datenübertragung zu ermöglichen. Die Energiesteuerungsschicht umfasst lokale Einheiten, die als Power-Aware NoC-Controllers bezeichnet werden und die die NoC-Energie in Abhängigkeit vom globalen Zustand und den zeitlichen Anforderungen der Anwendungen optimieren. Darüber hinaus wird das Konzept der NoC-Energiekontrolle zur Anpassung an Anomalien, die aufgrund von Abnutzung auftreten können, auf den gesamten Systemumfang ausgedehnt. Online- Ressourcenverwaltungen, die hierarchische Kontrollschichten zur Behandlung Abnutzung (drohender Kernausfälle) einsetzen, werden bereitgestellt. Bei Systemen mit gemischter Kritikalität erlaubt es flexible Grenzen zwischen sicherheitskritischen und unkritischen Subsystemen, um die Rekonfiguration sicher anzuwenden, wobei grundlegende Sicherheitsanforderungen erhalten bleiben und Timing Vorhersehbarkeit. Experimente werden auf der Basis von Simulationen und formalen Analysen zu verschiedenen realistischen Anwendungsfallen und Benchmarks durchgeführt, die signifikanten Verbesserungen bei On-Chip Netzwerke-Energieeinsparungen und bei der Behandlung von Abnutzung für Systeme mit gemischter Kritikalität zur Verbesserung die Systemstabilität gegenüber dem bisherigen Status quo zeigen

    Fault management via dynamic reconfiguration for integrated modular avionics

    Get PDF
    The purpose of this research is to investigate fault management methodologies within Integrated Modular Avionics (IMA) systems, and develop techniques by which the use of dynamic reconfiguration can be implemented to restore higher levels of systems redundancy in the event of a systems fault. A proposed concept of dynamic configuration has been implemented on a test facility that allows controlled injection of common faults to a representative IMA system. This facility allows not only the observation of the response of the system management activities to manage the fault, but also analysis of real time data across the network to ensure distributed control activities are maintained. IMS technologies have evolved as a feasible direction for the next generation of avionic systems. Although federated systems are logical to design, certify and implement, they have some inherent limitations that are not cost beneficial to the customer over long life-cycles of complex systems, and hence the fundamental modular design, i.e. common processors running modular software functions, provides a flexibility in terms of configuration, implementation and upgradability that cannot be matched by well-established federated avionic system architectures. For example, rapid advances of computing technology means that dedicated hardware can become outmoded by component obsolescence which almost inevitably makes replacements unavailable during normal life-cycles of most avionic systems. To replace the obsolete part with a newer design involves a costly re-design and re-certification of any relevant or interacting functions with this unit. As such, aircraft are often known to go through expensive mid-life updates to upgrade all avionics systems. In contrast, a higher frequency of small capability upgrades would maximise the product performance, including cost of development and procurement, in constantly changing platform deployment environments. IMA is by no means a new concept and work has been carried out globally in order to mature the capability. There are even examples where this technology has been implemented as subsystems on service aircraft. However, IMA flexible configuration properties are yet to be exploited to their full extent; it is feasible that identification of faults or failures within the system would lead to the exploitation of these properties in order to dynamically reconfigure and maintain high levels of redundancy in the event of component failure. It is also conceivable to install redundant components such that an IMS can go through a process of graceful degradation, whereby the system accommodates a number of active failures, but can still maintain appropriate levels of reliability and service. This property extends the average maintenance-free operating period, ensuring that the platform has considerably less unscheduled down time and therefore increased availability. The content of this research work involved a number of key activities in order to investigate the feasibility of the issues outlined above. The first was the creation of a representative IMA system and the development of a systems management capability that performs the required configuration controls. The second aspect was the development of hardware test rig in order to facilitate a tangible demonstration of the IMA capability. A representative IMA was created using LabVIEW Embedded Tool Suit (ETS) real time operating system for minimal PC systems. Although this required further code written to perform IMS middleware functions and does not match up to the stringent air safety requirements, it provided a suitable test bed to demonstrate systems management capabilities. The overall IMA was demonstrated with a 100kg scale Maglev vehicle as a test subject. This platform provides a challenging real-time control problem, analogous to an aircraft flight control system, requiring the calculation of parallel control loops at a high sampling rate in order to maintain magnetic suspension. Although the dynamic properties of the test rig are not as complex as a modern aircraft, it has much less stringent operating requirements and therefore substantially less risk associated with failure to provide service. The main research contributions for the PhD are: 1.A solution for the dynamic reconfiguration problem for assigning required systems functions (namely a distributed, real-time control function with redundant processing channels) to available computing resources whilst protecting the functional concurrency and time critical needs of the control actions. 2.A systems management strategy that utilises the dynamic reconfiguration properties of an IMA System to restore high levels of redundancy in the presence of failures. The conclusion summarises the level of success of the implemented system in terms of an appropriate dynamic reconfiguration to the response of a fault signal. In addition, it highlights the issues with using an IMA to as a solution to operational goals of the target hardware, in terms of design and build complexity, overhead and resources

    Timing Predictability in Future Multi-Core Avionics Systems

    Full text link

    Network-on-Chip -based Multi-Processor System-on-Chip: Towards Mixed-Criticality System Certification

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Safety-Critical Communication in Avionics

    Get PDF
    The aircraft of today use electrical fly-by-wire systems for manoeuvring. These safety-critical distributed systems are called flight control systems and put high requirements on the communication networks that interconnect the parts of the systems. Reliability, predictability, flexibility, low weight and cost are important factors that all need to be taken in to consideration when designing a safety-critical communication system. In this thesis certification issues, requirements in avionics, fault management, protocols and topologies for safety-critical communication systems in avionics are discussed and investigated. The protocols that are investigated in this thesis are: TTP/C, FlexRay and AFDX, as a reference protocol MIL-STD-1553 is used. As reference architecture analogue point-to-point is used. The protocols are described and evaluated regarding features such as services, maturity, supported physical layers and topologies.Pros and cons with each protocol are then illustrated by a theoretical implementation of a flight control system that uses each protocol for the highly critical communication between sensors, actuators and flight computers.The results show that from a theoretical point of view TTP/C could be used as a replacement for a point-to-point flight control system. However, there are a number of issues regarding the physical layer that needs to be examined. Finally a TTP/C cluster has been implemented and basic functionality tests have been conducted. The plan was to perform tests on delays, start-up time and reintegration time but the time to acquire the proper hardware for these tests exceeded the time for the thesis work. More advanced testing will be continued here at Saab beyond the time frame of this thesis

    A Primer on Architectural Level Fault Tolerance

    Get PDF
    This paper introduces the fundamental concepts of fault tolerant computing. Key topics covered are voting, fault detection, clock synchronization, Byzantine Agreement, diagnosis, and reliability analysis. Low level mechanisms such as Hamming codes or low level communications protocols are not covered. The paper is tutorial in nature and does not cover any topic in detail. The focus is on rationale and approach rather than detailed exposition

    Advanced flight control system study

    Get PDF
    A fly by wire flight control system architecture designed for high reliability includes spare sensor and computer elements to permit safe dispatch with failed elements, thereby reducing unscheduled maintenance. A methodology capable of demonstrating that the architecture does achieve the predicted performance characteristics consists of a hierarchy of activities ranging from analytical calculations of system reliability and formal methods of software verification to iron bird testing followed by flight evaluation. Interfacing this architecture to the Lockheed S-3A aircraft for flight test is discussed. This testbed vehicle can be expanded to support flight experiments in advanced aerodynamics, electromechanical actuators, secondary power systems, flight management, new displays, and air traffic control concepts

    Erreichen von Performance in Netzwerken-On-Chip fĂĽr Echtzeitsysteme

    Get PDF
    In many new applications, such as in automatic driving, high performance requirements have reached safety critical real-time systems. Consequently, Networks-on-Chip (NoCs) must efficiently host new sets of highly dynamic workloads e.g., high resolution sensor fusion and data processing, autonomous decision’s making combined with machine learning. The static platform management, as used in current safety critical systems, is no more sufficient to provide the needed level of service. A dynamic platform management could meet the challenge, but it usually suffers from a lack of predictability and the simplicity necessary for certification of safety and real-time properties. In this work, we propose a novel, global and dynamic arbitration for NoCs with real-time QoS requirements. The mechanism decouples the admission control from arbitration in routers thereby simplifying a dynamic adaptation and real-time analysis. Consequently, the proposed solution allows the deployment of a sophisticated contract-based QoS provisioning without introducing complicated and hard to maintain schemes, known from the frequently applied static arbiters. The presented work introduces an overlay network to synchronize transmissions using arbitration units called Resource Managers (RMs), which allows global and work-conserving scheduling. The description of resource allocation strategies is supplemented by protocol design and verification methodology bringing adaptive control to NoC communication in setups with different QoS requirements and traffic classes. For doing that, a formal worst-case timing analysis for the mechanism has been proposed which demonstrates that this solution not only exposes higher performance in simulation but, even more importantly, consistently reaches smaller formally guaranteed worst-case latencies than other strategies for realistic levels of system's utilization. The approach is not limited to a specific network architecture or topology as the mechanism does not require modifications of routers and therefore can be used together with the majority of existing manycore systems. Indeed, the evaluation followed using the generic performance optimized router designs, as well as two systems-on-chip focused on real-time deployments. The results confirmed that the proposed approach proves to exhibit significantly higher average performance in simulation and execution.In vielen neuen sicherheitskritische Anwendungen, wie z.B. dem automatisierten Fahren, werden große Anforderungen an die Leistung von Echtzeitsysteme gestellt. Daher müssen Networks-on-Chip (NoCs) neue, hochdynamische Workloads wie z.B. hochauflösende Sensorfusion und Datenverarbeitung oder autonome Entscheidungsfindung kombiniert mit maschineller Lernen, effizient auf einem System unterbringen. Die Steuerung der zugrunde liegenden NoC-Architektur, muss die Systemsicherheit vor Fehlern, resultierend aus dem dynamischen Verhalten des Systems schützen und gleichzeitig die geforderte Performance bereitstellen. In dieser Arbeit schlagen wir eine neuartige, globale und dynamische Steuerung für NoCs mit Echtzeit QoS Anforderungen vor. Das Schema entkoppelt die Zutrittskontrolle von der Arbitrierung in Routern. Hierdurch wird eine dynamische Anpassung ermöglicht und die Echtzeitanalyse vereinfacht. Der Einsatz einer ausgefeilten vertragsbasierten Ressourcen-Zuweisung wird so ermöglicht, ohne komplexe und schwer wartbare Mechanismen, welche bereits aus dem statischen Plattformmanagement bekannt sind einzuführen. Diese Arbeit stellt ein übergelagertes Netzwerk vor, welches Übertragungen mit Hilfe von Arbitrierungseinheiten, den so genannten Resource Managern (RMs), synchronisiert. Dieses überlagerte Netzwerk ermöglicht eine globale und lasterhaltende Steuerung. Die Beschreibung verschiedener Ressourcenzuweisungstrategien wird ergänzt durch ein Protokolldesign und Methoden zur Verifikation der adaptiven NoC Steuerung mit unterschiedlichen QoS Anforderungen und Verkehrsklassen. Hierfür wird eine formale Worst Case Timing Analyse präsentiert, welche das vorgestellte Verfahren abbildet. Die Resultate bestätitgen, dass die präsentierte Lösung nicht nur eine höhere Performance in der Simulation bietet, sondern auch formal kleinere Worst-Case Latenzen für realistische Systemauslastungen als andere Strategien garantiert. Der vorgestellte Ansatz ist nicht auf eine bestimmte Netzwerkarchitektur oder Topologie beschränkt, da der Mechanismus keine Änderungen an den unterliegenden Routern erfordert und kann daher zusammen mit bestehenden Manycore-Systemen eingesetzt werden. Die Evaluierung erfolgte auf Basis eines leistungsoptimierten Router-Designs sowie zwei auf Echtzeit-Anwendungen fokusierten Platformen. Die Ergebnisse bestätigten, dass der vorgeschlagene Ansatz im Durchschnitt eine deutlich höhere Leistung in der Simulation und Ausführung liefert

    Cost and benefits design optimization model for fault tolerant flight control systems

    Get PDF
    Requirements and specifications for a method of optimizing the design of fault-tolerant flight control systems are provided. Algorithms that could be used for developing new and modifying existing computer programs are also provided, with recommendations for follow-on work

    Determinism Enhancement and Reliability Assessment in Safety Critical AFDX Networks

    Get PDF
    RÉSUMÉ AFDX est une technologie basée sur Ethernet, qui a été développée pour répondre aux défis qui découlent du nombre croissant d’applications qui transmettent des données de criticité variable dans les systèmes modernes d’avionique modulaire intégrée (Integrated Modular Avionics). Cette technologie de sécurité critique a été notamment normalisée dans la partie 7 de la norme ARINC 664, dont le but est de définir un réseau déterministe fournissant des garanties de performance prévisibles. En particulier, AFDX est composé de deux réseaux redondants, qui fournissent la haute fiabilité requise pour assurer son déterminisme. Le déterminisme de AFDX est principalement réalisé par le concept de liens virtuels (Virtual Links), qui définit une connexion unidirectionnelle logique entre les points terminaux (End Systems). Pour les liens virtuels, les limites supérieures des délais de bout en bout peuvent être obtenues en utilisant des approches comme calcul réseau, mieux connu sous l’appellation Network Calculus. Cependant, il a été prouvé que ces limites supérieures sont pessimistes dans de nombreux cas, ce qui peut conduire à une utilisation inefficace des ressources et augmenter la complexité de la conception du réseau. En outre, en raison de l’asynchronisme de leur fonctionnement, il existe plusieurs sources de non-déterminisme dans les réseaux AFDX. Ceci introduit un problème en lien avec la détection des défauts en temps réel. En outre, même si un mécanisme de gestion de la redondance est utilisé pour améliorer la fiabilité des réseaux AFDX, il y a un risque potentiel souligné dans la partie 7 de la norme ARINC 664. La situation citée peut causer une panne en dépit des transmissions redondantes dans certains cas particuliers. Par conséquent, l’objectif de cette thèse est d’améliorer la performance et la fiabilité des réseaux AFDX. Tout d’abord, un mécanisme fondé sur l’insertion de trames est proposé pour renforcer le déterminisme de l’arrivée des trames au sein des réseaux AFDX. Parce que la charge du réseau et la bande passante moyenne utilisée augmente due à l’insertion de trames, une stratégie d’agrégation des Sub-Virtual Links est introduite et formulée comme un problème d’optimisation multi-objectif. En outre, trois algorithmes ont été développés pour résoudre le problème d’optimisation multi-objectif correspondant. Ensuite, une approche est introduite pour incorporer l’analyse de la performance dans l’évaluation de la fiabilité en considérant les violations des délais comme des pannes.----------ABSTRACT AFDX is an Ethernet-based technology that has been developed to meet the challenges due to the growing number of data-intensive applications in modern Integrated Modular Avionics systems. This safety critical technology has been standardized in ARINC 664 Part 7, whose purpose is to define a deterministic network by providing predictable performance guarantees. In particular, AFDX is composed of two redundant networks, which provide the determinism required to obtain the desired high reliability. The determinism of AFDX is mainly achieved by the concept of Virtual Link, which defines a logical unidirectional connection from one source End System to one or more destination End Systems. For Virtual Links, the end-to-end delay upper bounds can be obtained by using the Network Calculus. However, it has been proved that such upper bounds are pessimistic in many cases, which may lead to an inefficient use of resources and aggravate network design complexity. Besides, due to asynchronism, there exists a source of non-determinism in AFDX networks, namely frame arrival uncertainty in a destination End System. This issue introduces a problem in terms of real-time fault detection. Furthermore, although a redundancy management mechanism is employed to enhance the reliability of AFDX networks, there still exist potential risks as pointed out in ARINC 664 Part 7, which may fail redundant transmissions in some special cases. Therefore, the purpose of this thesis is to improve the performance and the reliability of AFDX networks. First, a mechanism based on frame insertion is proposed to enhance the determinism of frame arrival within AFDX networks. As the network load and the average bandwidth used by a Virtual Link increase due to frame insertion, a Sub-Virtual Link aggregation strategy, formulated as a multi-objective optimization problem, is introduced. In addition, three algorithms have been developed to solve the corresponding multi-objective optimization problem. Next, an approach is introduced to incorporate performance analysis into reliability assessment by considering delay violations as failures. This allowed deriving tighter probabilistic upper bounds for Virtual Links that could be applied in AFDX network certification. In order to conduct the necessary reliability analysis, the well-known Fault-Tree Analysis technique is employed and Stochastic Network Calculus is applied to compute the upper bounds with various probability limits
    • …
    corecore