10 research outputs found

    Integration of generic operating systems in partitioned architectures

    Get PDF
    Tese de mestrado, Engenharia Informática (Arquitectura, Sistemas e Redes de Computadores), Universidade de Lisboa, Faculdade de Ciências, 2009The Integrated Modular Avionics (IMA) specification defines a partitioned environment hosting multiple avionics functions of different criticalities on a shared computing platform. ARINC 653, one of the specifications related to the IMA concept, defines a standard interface between the software applications and the underlying operating system. Both these specifications come from the world of civil aviation, but they are getting interest from space industry partners, who have identified common requirements to those of aeronautic applications. Within the scope of this interest, the AIR architecture was defined, under a contract from the European Space Agency (ESA). AIR provides temporal and spatial segregation, and foresees the use of different operating systems in each partition. Temporal segregation is achieved through the fixed cyclic scheduling of computing resources to partitions. The present work extends the foreseen partition operating system (POS) heterogeneity to generic non-real-time operating systems. This was motivated by documented difficulties in porting applications to RTOSs, and by the notion that proper integration of a non-real-time POS will not compromise the timeliness of critical real-time functions. For this purpose, Linux is used as a case study. An embedded variant of Linux is built and evaluated regarding its adequacy as a POS in the AIR architecture. To guarantee safe integration, a solution based on the Linux paravirtualization interface, paravirt-ops, is proposed. In the course of these activities, the AIR architecture definition was also subject to improvements. The most significant one, motivated by the intended increased POS heterogeneity, was the introduction of a new component, the AIR Partition OS Adaptation Layer (PAL). The AIR PAL provides greater POS-independence to the major components of the AIR architecture, easing their independent certification efforts. Other improvements provide enhanced timeliness mechanisms, such as mode-based schedules and process deadline violation monitoring.A especificação Integrated Modular Avionics (IMA) define um ambiente compartimentado com funções de aviónica de diferentes criticalidades a coexistir numa plataforma computacional. A especificação relacionada ARINC 653 define uma interface padrão entre as aplicações e o sistema operativo subjacente. Ambas as especificações provêm do mundo da aviónica, mas estão a ganhar o interesse de parceiros da indústria espacial, que identificaram requisitos em comum entre as aplicações aeronáuticas e espaciais. No âmbito deste interesse, foi definida a arquitectura AIR, sob contrato da Agência Espacial Europeia (ESA). Esta arquitectura fornece segregação temporale espacial, e prevê o uso de diferentes sistemas operativos em cada partição. A segregação temporal é obtida através do escalonamento fixo e cíclico dos recursos às partições. Este trabalho estende a heterogeneidade prevista entre os sistemas operativos das partições (POS). Tal foi motivado pelas dificuldades documentadas em portar aplicações para sistemas operativos de tempo-real, e pela noção de que a integração apropriada de um POS não-tempo-real não comprometerá a pontualidade das funções críticas de tempo-real. Para este efeito, o Linux foi utilizado como caso de estudo. Uma variante embedida de Linux é construída e avaliada quanto à sua adequação como POS na arquitectura AIR. Para garantir uma integração segura, é proposta uma solução baseada na interface de paravirtualização do Linux, paravirt-ops. No decurso destas actividades, foram também feitas melhorias à definição da arquitectura AIR. O mais significante, motivado pelo pretendido aumento da heterogeneidade entre POSs, foi a introdução de um novo componente, AIR Partition OS Adaptation Layer (PAL). Este componente proporciona aos principais componentes da arquitectura AIR maior independência face ao POS, facilitando os esforços para a sua certificação independente. Outros melhoramentos fornecem mecanismos avançados de pontualidade, como mode-based schedules e monitorização de incumprimento de metas temporais de processos.ESA/ITI - European Space Agency Innovation Triangular Initiative (through ESTEC Contract 21217/07/NL/CB-Project AIR-II) and FCT - Fundação para a Ciência e Tecnologia (through the Multiannual Funding Programme

    Real-time scheduling in multicore : time- and space-partitioned architectures

    Get PDF
    Tese de doutoramento, Informática (Engenharia Informática), Universidade de Lisboa, Faculdade de Ciências, 2014The evolution of computing systems to address size, weight and power consumption (SWaP) has led to the trend of integrating functions (otherwise provided by separate systems) as subsystems of a single system. To cope with the added complexity of developing and validating such a system, these functions are maintained and analyzed as components with clear boundaries and interfaces. In the case of real-time systems, the adopted component-based approach should maintain the timeliness properties of the function inside each individual component, regardless of the remaining components. One approach to this issue is time and space partitioning (TSP)—enforcing strict separation between components in the time and space domains. This allows heterogeneous components (different real-time requirements, criticality, developed by different teams and/or with different technologies) to safely coexist. The concepts of TSP have been adopted in the civil aviation, aerospace, and (to some extent) automotive industries. These industries are also embracing multiprocessor (or multicore) platforms, either with identical or nonidentical processors, but are not taking full advantage thereof because of a lack of support in terms of verification and certification. Furthermore, due to the use of the TSP in those domains, compatibility between TSP and multiprocessor is highly desired. This is not the present case, as the reference TSP-related specifications in the aforementioned industries show limited support to multiprocessor. In this dissertation, we defend that the active exploitation of multiple (possibly non-identical) processor cores can augment the processing capacity of the time- and space-partitioned (TSP) systems, while maintaining a compromise with size, weight and power consumption (SWaP), and open room for supporting self-adaptive behavior. To allow applying our results to a more general class of systems, we analyze TSP systems as a special case of hierarchical scheduling and adopt a compositional analysis methodology.Fundação para a Ciência e a Tecnologia (FCT, SFRH/BD/60193/2009, programa PESSOA, projeto SAPIENT); the European Space Agency Innovation (ESA) Triangle Initiative program through ESTEC Contract 21217/07/NL/CB, Project AIR-II; the European Commission Seventh Framework Programme (FP7) through project KARYON (IST-FP7-STREP-288195)

    Replication of non-deterministic objects

    Get PDF
    This thesis discusses replication of non-deterministic objects in distributed systems to achieve fault tolerance against crash failures. The objects replicated are the virtual nodes of a distributed application. Replication is viewed as an issue that is to be dealt with only during the configuration of a distributed application and that should not affect the development of the application. Hence, replication of virtual nodes should be transparent to the application. Like all measures to achieve fault tolerance, replication introduces redundancy in the system. Not surprisingly, the main difficulty is guaranteeing the consistency of all replicas such that they behave in the same way as if the object was not replicated (replication transparency). This is further complicated if active objects (like virtual nodes) are replicated, and these objects themselves can be clients of still further objects in the distributed application. The problems of replication of active non-deterministic objects are analyzed in the context of distributed Ada 95 applications. The ISO standard for Ada 95 defines a model for distributed execution based on remote procedure calls (RPC). Virtual nodes in Ada 95 use this as their sole communication paradigm, but they may contain tasks to execute activities concurrently, thus making the execution potentially non-deterministic due to implicit timing dependencies. Such non-determinism cannot be avoided by choosing deterministic tasking policies. I present two different approaches to maintain replica consistency despite this non-determinism. In a first approach, I consider the run-time support of Ada 95 as a black box (except for the part handling remote communications). This corresponds to a non-deterministic computation model. I show that replication of non-deterministic virtual nodes requires that remote procedure calls are implemented as nested transactions. Unfortunately, effects of failures are not local to the replicas of a virtual node: when a failure occurs, nested remote calls made to other virtual nodes must be undone. Also, using transactional semantics for RPCs necessitates a compromise regarding transparency: the application must identify global state for it cannot be determined reliably in an automatic way. Further study reveals that this approach cannot be implemented in a transparent way at all because the consistency criterion of Ada 95 (linearizability) is much weaker than that of transactions (serializability). An execution of remote procedure calls as transactions may thus lead to incompatibilities with the semantics of the programming language. If remotely called subprograms on a replicated virtual node perform partial operations, i.e., entry calls on global protected objects, deadlocks that cannot be broken can occur in certain cases. Such deadlocks do not occur when the virtual node is not replicated. The transactional semantics of RPCs must therefore be exposed to the application. A second approach is based on a piecewise deterministic computation model, i.e., the execution of a virtual node is seen as a sequence of deterministic state intervals. Whenever a non-deterministic event occurs, a new state interval is started. I study replica organization under this computation model (semi-active replication). In this model, all non-deterministic decisions are made on one distinguished replica (the leader), while all other replicas (the followers) are forced to follow the same sequence of non-deterministic events. I show that it suffices to synchronize the followers with the leader upon each observable event, i.e., when the leader sends a message to some other virtual node. It is not necessary to synchronize upon each and every non-deterministic event — which would incur a prohibitively high overhead. Non-deterministic events occurring on the leader between observable events are logged and sent to the followers just before the leader executes an observable event. Consequently, it is guaranteed that the followers will reach the same state as the leader, and thus the effects of failures remain mostly local to the replicas. A prototype implementation called RAPIDS (Replicated Ada Partitions In Distributed Systems) serves as a proof of concept for this second approach, demonstrating its feasibility. RAPIDS is an Ada 95 implementation of a replication manager for semi-active replication for the GNAT development system for Ada 95. It is entirely contained within the run-time support and hence largely transparent for the application

    Fault management via dynamic reconfiguration for integrated modular avionics

    Get PDF
    The purpose of this research is to investigate fault management methodologies within Integrated Modular Avionics (IMA) systems, and develop techniques by which the use of dynamic reconfiguration can be implemented to restore higher levels of systems redundancy in the event of a systems fault. A proposed concept of dynamic configuration has been implemented on a test facility that allows controlled injection of common faults to a representative IMA system. This facility allows not only the observation of the response of the system management activities to manage the fault, but also analysis of real time data across the network to ensure distributed control activities are maintained. IMS technologies have evolved as a feasible direction for the next generation of avionic systems. Although federated systems are logical to design, certify and implement, they have some inherent limitations that are not cost beneficial to the customer over long life-cycles of complex systems, and hence the fundamental modular design, i.e. common processors running modular software functions, provides a flexibility in terms of configuration, implementation and upgradability that cannot be matched by well-established federated avionic system architectures. For example, rapid advances of computing technology means that dedicated hardware can become outmoded by component obsolescence which almost inevitably makes replacements unavailable during normal life-cycles of most avionic systems. To replace the obsolete part with a newer design involves a costly re-design and re-certification of any relevant or interacting functions with this unit. As such, aircraft are often known to go through expensive mid-life updates to upgrade all avionics systems. In contrast, a higher frequency of small capability upgrades would maximise the product performance, including cost of development and procurement, in constantly changing platform deployment environments. IMA is by no means a new concept and work has been carried out globally in order to mature the capability. There are even examples where this technology has been implemented as subsystems on service aircraft. However, IMA flexible configuration properties are yet to be exploited to their full extent; it is feasible that identification of faults or failures within the system would lead to the exploitation of these properties in order to dynamically reconfigure and maintain high levels of redundancy in the event of component failure. It is also conceivable to install redundant components such that an IMS can go through a process of graceful degradation, whereby the system accommodates a number of active failures, but can still maintain appropriate levels of reliability and service. This property extends the average maintenance-free operating period, ensuring that the platform has considerably less unscheduled down time and therefore increased availability. The content of this research work involved a number of key activities in order to investigate the feasibility of the issues outlined above. The first was the creation of a representative IMA system and the development of a systems management capability that performs the required configuration controls. The second aspect was the development of hardware test rig in order to facilitate a tangible demonstration of the IMA capability. A representative IMA was created using LabVIEW Embedded Tool Suit (ETS) real time operating system for minimal PC systems. Although this required further code written to perform IMS middleware functions and does not match up to the stringent air safety requirements, it provided a suitable test bed to demonstrate systems management capabilities. The overall IMA was demonstrated with a 100kg scale Maglev vehicle as a test subject. This platform provides a challenging real-time control problem, analogous to an aircraft flight control system, requiring the calculation of parallel control loops at a high sampling rate in order to maintain magnetic suspension. Although the dynamic properties of the test rig are not as complex as a modern aircraft, it has much less stringent operating requirements and therefore substantially less risk associated with failure to provide service. The main research contributions for the PhD are: 1.A solution for the dynamic reconfiguration problem for assigning required systems functions (namely a distributed, real-time control function with redundant processing channels) to available computing resources whilst protecting the functional concurrency and time critical needs of the control actions. 2.A systems management strategy that utilises the dynamic reconfiguration properties of an IMA System to restore high levels of redundancy in the presence of failures. The conclusion summarises the level of success of the implemented system in terms of an appropriate dynamic reconfiguration to the response of a fault signal. In addition, it highlights the issues with using an IMA to as a solution to operational goals of the target hardware, in terms of design and build complexity, overhead and resources

    Proceedings of the Second NASA Formal Methods Symposium

    Get PDF
    This publication contains the proceedings of the Second NASA Formal Methods Symposium sponsored by the National Aeronautics and Space Administration and held in Washington D.C. April 13-15, 2010. Topics covered include: Decision Engines for Software Analysis using Satisfiability Modulo Theories Solvers; Verification and Validation of Flight-Critical Systems; Formal Methods at Intel -- An Overview; Automatic Review of Abstract State Machines by Meta Property Verification; Hardware-independent Proofs of Numerical Programs; Slice-based Formal Specification Measures -- Mapping Coupling and Cohesion Measures to Formal Z; How Formal Methods Impels Discovery: A Short History of an Air Traffic Management Project; A Machine-Checked Proof of A State-Space Construction Algorithm; Automated Assume-Guarantee Reasoning for Omega-Regular Systems and Specifications; Modeling Regular Replacement for String Constraint Solving; Using Integer Clocks to Verify the Timing-Sync Sensor Network Protocol; Can Regulatory Bodies Expect Efficient Help from Formal Methods?; Synthesis of Greedy Algorithms Using Dominance Relations; A New Method for Incremental Testing of Finite State Machines; Verification of Faulty Message Passing Systems with Continuous State Space in PVS; Phase Two Feasibility Study for Software Safety Requirements Analysis Using Model Checking; A Prototype Embedding of Bluespec System Verilog in the PVS Theorem Prover; SimCheck: An Expressive Type System for Simulink; Coverage Metrics for Requirements-Based Testing: Evaluation of Effectiveness; Software Model Checking of ARINC-653 Flight Code with MCP; Evaluation of a Guideline by Formal Modelling of Cruise Control System in Event-B; Formal Verification of Large Software Systems; Symbolic Computation of Strongly Connected Components Using Saturation; Towards the Formal Verification of a Distributed Real-Time Automotive System; Slicing AADL Specifications for Model Checking; Model Checking with Edge-valued Decision Diagrams; and Data-flow based Model Analysis

    Achieving fault tolerance via robust partitioning and N-Modular Redundancy

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2007.Includes bibliographical references (p. 165-169).This thesis describes the design and performance results for the P-NMR fault tolerant avionics system architecture being developed at Draper Laboratory. The two key principles of the architecture are robust software partitioning (P), as defined by the ARINC 653 open standard, and N-Modular Redundancy (NMR). The P-NMR architecture uses cross channel data exchange and voting to implement fault detection, isolation and recovery (FDIR). The FDIR function is implemented in software that executes on commercial-off-the-shelf (COTS) hardware components that are also based on open standards. The FDIR function and the user applications execute on the same processor. The robust partitioning is provided by a COTS real-time operating system that complies with the ARINC 653 standard. A Triple Modular Redundant (TMR) prototype was developed and various performance metrics were collected. Evaluation of the TMR prototype indicates that the ARINC 653 standard is compatible with an NMR and FDIR architecture. Application partitions can be considered software fault containment regions which enhance the overall integrity of the system. The P-NMR performance metrics were compared with a previous Draper Laboratory design called the Fault Tolerant Parallel Processor (FTPP). This design did not make use of robust partitioning and it used proprietary hardware for implementing certain FDIR functions. The comparison demonstrated that the P-NMR system prototype could perform at an acceptable level and that the development of the system should continue. This research was done in the context of developing cost effective avionics systems for space exploration vehicles such as those being developed for NASA's Constellation program.by Brendan Anthony O'Connell.S.M

    Open Multithreaded Transactions: A Transaction Model for Concurrent Object-Oriented Programming

    Get PDF
    To read the abstract, please go to my PhD home page

    Interpartition communication with shared active packages

    No full text
    Concurrent programming on tightly coupled systems is intended to be based on tasks communicating through rendezvous, unprotected shared variables, or protected objects. Concurrent programming on loosely coupled systems is intended to be based on partitions communicating through remote procedure calls, unprotected shared variables, or entry-less protected objects. It may therefore be difficult to port an application from a tightly coupled system to a loosely coupled system since protected objects with entries are not available anymore. This paper presents how protected objects with entries can be implemented through the concept of Shared Active packages
    corecore